How to increase the accuracy of Logistic regression?

I want to fit a binary logistic regression, but the AUC is only around 56 percent. How can I raise the AUC to over 80 percent? I use fitglm in MATLAB.
  1 Comment
John D'Errico on 9 Dec 2021
Merely insisting that you want to get a better fit will not suffice. If the model does not fit your data, then you can want it to fit as well as you like. Perhaps prayer is a good idea? It can't hurt. Of course, it won't help either. :-)
So, is your data just too noisy, but the model is valid? Is it a problem of lack of fit, so an invalid model? Is it merely a problem that you are performing the fit improperly? Can we really know from your vague statements without seeing the data, and without seeing what you tried to do? No.


Answers (1)

Rupesh on 21 Feb 2024
Edited: Rupesh on 29 Feb 2024
Hi Saeed Sohrabi,
Your binary logistic regression model in MATLAB currently has an AUC of around 56%, and you want to raise it above 80%. An AUC of 50% corresponds to random guessing, so 56% means the model is ranking cases only slightly better than chance. You can explore several strategies:
Data Quality and Relevance:
  • Clean your data: Make sure to remove any errors, outliers, or missing values that may be affecting your model's accuracy.
  • Feature relevance: Confirm that the predictors in your model are strongly related to the outcome variable. Eliminate any that aren't contributing to the model's predictive power.
Feature Engineering:
  • Create new features: Experiment with deriving new predictors from your existing data that might better capture the underlying patterns.
  • Feature selection: Use forward selection or backward elimination (stepwiseglm in MATLAB) or an L1 (lasso) penalty to refine the set of features used in your model. An L2 (ridge) penalty shrinks coefficients but does not remove predictors.
Model Enhancement:
  • Polynomial terms: If your data shows a non-linear relationship, adding polynomial or interaction terms might improve model fit.
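  • Regularization: Ridge (L2) or lasso (L1) penalties reduce overfitting and can improve generalization when there are many predictors. fitglm does not apply a penalty; in MATLAB, lassoglm fits lasso and elastic-net regularized GLMs. Note that an AUC near 56% usually points to too little signal rather than overfitting, so regularization alone is unlikely to close the gap.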
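To make the effect of an interaction term concrete, here is a minimal, language-neutral sketch in plain Python rather than MATLAB. Everything in it is an assumption made for illustration: the synthetic grid data (the class depends only on the sign of x1*x2), the hand-written gradient-descent fitter `fit_logistic`, and the penalty strength `l2`. It is not how fitglm works internally. In MATLAB, the comparable step would be passing a model specification such as 'interactions' to fitglm.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, l2=0.01, lr=0.1, steps=500):
    """Gradient descent on mean log-loss plus an L2 (ridge) penalty on the weights."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        # The intercept b is left unpenalized.
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def scores(X, w, b):
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

def auc(s, y):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [si for si, yi in zip(s, y) if yi == 1]
    neg = [si for si, yi in zip(s, y) if yi == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Synthetic grid (illustrative only): the class depends on the sign of x1 * x2.
grid = [-2.0, -1.0, 1.0, 2.0]
X_lin = [[x1, x2] for x1 in grid for x2 in grid]
y = [1 if x1 * x2 > 0 else 0 for x1, x2 in X_lin]
X_int = [[x1, x2, x1 * x2] for x1, x2 in X_lin]  # add the interaction column

auc_linear = auc(scores(X_lin, *fit_logistic(X_lin, y)), y)
auc_inter = auc(scores(X_int, *fit_logistic(X_int, y)), y)
print(auc_linear)  # 0.5: main effects alone cannot rank the classes
print(auc_inter)   # 1.0: the product term separates them
```

Real data will rarely be this clean, but the pattern is the same: if the outcome depends on a combination of predictors, a model with main effects only can sit near chance until that combination is added as a term.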
Data Resampling:
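  • Balance your dataset: If one class is much rarer than the other, consider resampling (oversampling the minority class or undersampling the majority class). Resample only the training data, and evaluate on data that keeps the original class proportions.
  • Cross-validation: Use k-fold cross-validation to estimate the out-of-sample AUC. It does not improve the model by itself, but it shows whether gains on the training data carry over to unseen data.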
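As a rough sketch of how a stratified k-fold split can be built, the plain-Python snippet below deals the indices of each class round-robin into k folds so that every fold keeps roughly the same class mix. The function name `stratified_kfold` and the toy labels are hypothetical; in MATLAB, cvpartition provides this kind of partition.

```python
import random

def stratified_kfold(labels, k=5, seed=0):
    """Return k lists of test indices, each with roughly the same class mix."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # deal this class's indices round-robin
    return folds

# Toy imbalanced labels (illustrative only): 10 positives, 40 negatives.
labels = [1] * 10 + [0] * 40
folds = stratified_kfold(labels, k=5)

for test_idx in folds:
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    # Fit the model on train_idx, compute the AUC on test_idx,
    # then average the k AUC values.
    n_pos = sum(labels[i] for i in test_idx)
    print(len(train_idx), len(test_idx), n_pos)
```

Averaging the AUC over the held-out folds gives a more honest estimate than the AUC computed on the data used for fitting.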
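Hyperparameter optimization: fitglm has few settings to tune for this task. For a binary response the 'Distribution' must be 'binomial'; what you can vary is the 'Link' function (for example 'logit', 'probit', or 'comploglog') and the model specification. Logit and probit usually give very similar fits, so expect at most a small change in AUC from the link alone.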
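For reference, the sketch below (plain Python, written only to illustrate the math, not taken from MATLAB) shows how three links commonly used with a binomial response map the linear predictor eta to a probability. The function names are made up for this example.

```python
import math

def inv_logit(eta):
    """Logistic link: p = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

def inv_probit(eta):
    """Probit link: p = standard normal CDF of eta."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

def inv_cloglog(eta):
    """Complementary log-log link: p = 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))

for eta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(eta, round(inv_logit(eta), 3), round(inv_probit(eta), 3),
          round(inv_cloglog(eta), 3))
```

All three are increasing S-shaped curves. They differ mainly in the tails, which is why swapping the link tends to change the ranking of cases, and therefore the AUC, only modestly.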
Model Evaluation:
  • Confusion matrix: Study the confusion matrix to identify and understand the errors your model is making.
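  • Threshold adjustment: Tune the classification threshold to balance sensitivity and specificity. This changes the confusion matrix and the accuracy, but not the AUC: the AUC summarizes the ROC curve over all thresholds and depends only on how the model ranks the cases.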
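The difference between threshold-dependent and threshold-independent metrics can be sketched in a few lines of plain Python. The scores and labels below are made up for illustration, and the helper names `auc` and `confusion` are hypothetical; in MATLAB, perfcurve computes the ROC curve and AUC from labels and scores.

```python
def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def confusion(scores, labels, threshold):
    """Return (tp, fp, tn, fn) when predicting class 1 for score >= threshold."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

# Made-up predicted probabilities and true classes.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

print(auc(scores, labels))  # 0.8125, computed without choosing any threshold

for t in (0.35, 0.5, 0.65):
    tp, fp, tn, fn = confusion(scores, labels, t)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    print(t, (tp, fp, tn, fn), sensitivity, specificity)
```

Moving the threshold trades sensitivity against specificity, while the AUC stays fixed because the scores themselves have not changed. Raising the AUC therefore requires a model that ranks cases better, not a different cut-off.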
Explore other models: If logistic regression is not yielding the desired results, consider trying other classification algorithms like Random Forest, Gradient Boosting Machines, or SVMs.
Please note that improving a model is an iterative process and may take several rounds of refinement. Sometimes the data itself caps the AUC you can achieve; in that case, acquiring more or different data could be the key to better performance.
You can also refer to the documents below on how to improve accuracy and model fit with respect to different parameters of the data.
Hope this helps!
