Derive full formula from fitlm(X,y,"quadratic")

15 visualizzazioni (ultimi 30 giorni)
I have X which is a 258x6 double and y which is a 258x1 double. These are used to fit a model utilizing fitlm. I found the quadratic to yield the best results.
X = outTableNumeric{:, corr_ids}; % Using the most correlated features
y = outTableNumeric.target; % Target variabel
% Fit a linear regression model
mdl = fitlm(X, y, "quadratic");
However when I try to derive a formula to be deployed outside matlab I find it extremely obscure how I'm to combine the quadratic terms with the coefficients and so on?
I tried to make the formula programmatically but that is not working when I test the result on some row of X.
coefficients = mdl.Coefficients.Estimate;
% Extract the simple notation of the formula
intercept = coefficients(1);
formula = sprintf('%.2f', intercept);
for i = 2:length(coefficients)
if coefficients(i) ~= 0
feature_indices = find(strcmp(mdl.CoefficientNames, mdl.CoefficientNames{i}));
feature_name = mdl.CoefficientNames{feature_indices(1)};
formula = sprintf('%s + %.2f * %s', formula, coefficients(i), feature_name);
end
end
strrep(formula,':','*');
disp('Simple notation of the formula:');
disp(formula);
Can you please help me extract the full equation?

Risposta accettata

John D'Errico
John D'Errico il 23 Lug 2024
Modificato: John D'Errico il 23 Lug 2024
X = rand(258,6);
y = rand(258,1);
mdl = fitlm(X, y, "quadratic")
mdl =
Linear regression model: y ~ 1 + x1*x2 + x1*x3 + x1*x4 + x1*x5 + x1*x6 + x2*x3 + x2*x4 + x2*x5 + x2*x6 + x3*x4 + x3*x5 + x3*x6 + x4*x5 + x4*x6 + x5*x6 + x1^2 + x2^2 + x3^2 + x4^2 + x5^2 + x6^2 Estimated Coefficients: Estimate SE tStat pValue _________ _______ ________ ________ (Intercept) 0.57576 0.29322 1.9636 0.050783 x1 -0.37307 0.39362 -0.9478 0.34423 x2 0.1415 0.39079 0.36209 0.71762 x3 0.2982 0.39912 0.74715 0.45573 x4 -0.15672 0.41253 -0.3799 0.70437 x5 -0.2706 0.40666 -0.66542 0.50645 x6 -0.18681 0.39256 -0.47587 0.63462 x1:x2 0.11095 0.22534 0.49235 0.62294 x1:x3 0.11595 0.22555 0.51408 0.60769 x1:x4 -0.21297 0.23823 -0.89399 0.37226 x1:x5 0.018241 0.22852 0.079822 0.93645 x1:x6 0.14906 0.224 0.66547 0.50642 x2:x3 -0.078433 0.22451 -0.34935 0.72714 x2:x4 -0.10602 0.24031 -0.44116 0.65951 x2:x5 -0.054747 0.23808 -0.22996 0.81833 x2:x6 0.20974 0.24169 0.8678 0.38641 x3:x4 -0.29942 0.22792 -1.3137 0.19026 x3:x5 -0.03888 0.21461 -0.18116 0.8564 x3:x6 -0.080697 0.25933 -0.31117 0.75595 x4:x5 0.33241 0.24641 1.349 0.17866 x4:x6 0.030806 0.24789 0.12427 0.90121 x5:x6 -0.10127 0.24546 -0.41257 0.68031 x1^2 0.23249 0.26395 0.88083 0.37933 x2^2 -0.17792 0.25219 -0.70551 0.48121 x3^2 0.013372 0.25903 0.051623 0.95887 x4^2 0.22872 0.2511 0.91086 0.36332 x5^2 0.098667 0.25524 0.38657 0.69943 x6^2 0.18421 0.25429 0.72438 0.46957 Number of observations: 258, Error degrees of freedom: 230 Root Mean Squared Error: 0.299 R-squared: 0.078, Adjusted R-Squared: -0.0303 F-statistic vs. constant model: 0.72, p-value = 0.844
There are 6 variables, here called x1...x6, corresponding to the 6 columns of your array. And 28 terms in the model. You can see them listed above, as represented by the linear regression model.
If you are not sure how to use the result of fitlm, here are the tools that can be applied:
methods(mdl)
Methods for class LinearModel: addTerms compact gather plotAdjustedResponse plotPartialDependence random anova disp partialDependence plotDiagnostics plotResiduals removeTerms coefCI dwtest plot plotEffects plotSlice step coefTest feval plotAdded plotInteraction predict
And of course, you can perform a prediction at any point as:
predict(mdl,X(1,:))
ans = 0.4734
If you do extract the coefficients themselves from the model, do NOT extract them only as 5 digit approximations. Do NOT just use the numbers you see under the extimate column. (That is perhaps the most common mistake we see made in MATLAB, thinking that because we see a number written as 0.12097, that is the actual number. WRONG. The actual number is a double precision number. Use it properly.) So we can extract the coefficients as the vector:
coeffs = mdl.Coefficients.Estimate;
format long g
coeffs(1) % the constant term...
ans =
0.57575799544073
  13 Commenti
Jacob Emil Elbek
Jacob Emil Elbek il 24 Lug 2024
Thanks everyone - I got what I needed.
John D'Errico
John D'Errico il 25 Lug 2024
Modificato: John D'Errico il 25 Lug 2024
After some discussion with tech support, I learned a resolution, though I did not find it myself. The documentation for LinearModel needs to be enhanced IMHO.
X=rand(50,3); y=rand(50,1);
mdl=fitlm(X,y,'quadratic')
mdl =
Linear regression model: y ~ 1 + x1*x2 + x1*x3 + x2*x3 + x1^2 + x2^2 + x3^2 Estimated Coefficients: Estimate SE tStat pValue __________ _______ __________ _________ (Intercept) 0.92378 0.27333 3.3798 0.0016297 x1 -0.68435 0.69208 -0.98883 0.32869 x2 -0.69012 0.78414 -0.88011 0.38406 x3 -0.39798 0.74644 -0.53317 0.59687 x1:x2 0.62579 0.55378 1.13 0.2652 x1:x3 0.40651 0.58199 0.69848 0.48892 x2:x3 -0.35752 0.68687 -0.5205 0.60559 x1^2 -0.0047517 0.57365 -0.0082833 0.99343 x2^2 0.62704 0.73182 0.85682 0.39665 x3^2 0.44112 0.68853 0.64067 0.52539 Number of observations: 50, Error degrees of freedom: 40 Root Mean Squared Error: 0.303 R-squared: 0.126, Adjusted R-Squared: -0.0707 F-statistic vs. constant model: 0.64, p-value = 0.756
exponents = mdl.Formula.Terms(:,1:end-1)
exponents = 10x3
0 0 0 1 0 0 0 1 0 0 0 1 1 1 0 1 0 1 0 1 1 2 0 0 0 2 0 0 0 2
<mw-icon class=""></mw-icon>
<mw-icon class=""></mw-icon>
Note that I needed to drop the last column. But this is essentially a full description of each term in the model. It identifies the exponents of each variable that would be necessary to reconstruct the model. How can we use this? First of all, we should see it does work properly.
syms x1 x2 x3
vpa(mdl.Coefficients.Estimate.'*prod([x1,x2,x3].^exponents,2),4)
ans = 
For a single point newX, as a row vector, we can just do this:
mdlfun = @(newX) mdl.Coefficients.Estimate.'*prod(newX.^exponents,2);
mdlfun([1 2 3])
ans = 4.4647
And for comparison, we see:
predict(mdl,[1 2 3])
ans = 4.4647
With just a little more effort, we could even vectorize the function.

Accedi per commentare.

Più risposte (0)

Tag

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by