Bad incrementalRegressionLinear prediction results using simple slope 1x data

Hello All,
I am a new user of the Statistics and Machine Learning Toolbox and want to test the IncrementalRegressionLinear blocks on simple synthetic data with a slope of 1.
To simulate streaming data (e.g. from sensors), I preprocessed the data as follows:
rng(0,"twister") % For reproducibility
% Extract time and value columns
time = data.time;
value = data.value;
% Split the data into training and test set
n = length(time);
split_idx = round(0.7 * n);
data_train = data(1:split_idx,:);
% Training set
time_train = time(1:split_idx);
value_train = value(1:split_idx);
% Test set
time_test = time(split_idx + 1:end);
value_test = value(split_idx + 1:end);
n = numel(time_train);
p = size(value_train,2); % Number of predictors
numObsPerChunk = 11;
nchunk = floor(n/numObsPerChunk);
for j = 1:nchunk
    ibegin = min(n,numObsPerChunk*(j-1) + 1);
    iend = min(n,numObsPerChunk*j);
    idx = ibegin:iend;
    Xin(:,:,j) = value_train(idx,:);
    Yin(:,j) = time_train(idx);
    Xtestset(1,:,j) = 140;
    ytrue(1,:,j) = 361;
end
k = size(Xin,3); % Number of data chunks
t = 0:k-1;
X_ts = timeseries(Xin,t,InterpretSingleRowDataAs3D=true);
Y_ts = timeseries(Yin',t,InterpretSingleRowDataAs3D=true);
Xtest_ts = timeseries(Xtestset,t,InterpretSingleRowDataAs3D=true);
ytest_ts = timeseries(ytestset,t,InterpretSingleRowDataAs3D=true); % ytestset is not defined in this snippet
ytrue_ts = timeseries(ytrue,t,InterpretSingleRowDataAs3D=true);
% incrementalRegressionLinear
Mdl = incrementalRegressionLinear(NumPredictors=p, Learner='svm', Solver='scale-invariant', Shuffle=false, ...
    Standardize=true, EstimationPeriod=110, MetricsWarmupPeriod=11, MetricsWindowSize=11);
linearMdl = Mdl;
This is the connected Simulink model:
And this is the plotted result:
For a value of 140, the true y would be 361.
So I was wondering why the prediction is so bad. As the data is 100% linear with a slope of 1, I would assume that the prediction is also 361, but it varies between roughly 300 and 375.
Does anyone know why? Do I have a complete misunderstanding, or did I forget something?
Best regards
Christoph

Answers (1)

Prathamesh on 29 Jul 2025
I understand that you have a table that contains time and value columns, and that you want to test the ‘IncrementalRegressionLinear’ blocks.
Below are two points that might explain the issue:
  1. The plot of ‘data.time’ vs ‘data.value’ clearly shows a negative linear relationship: as ‘time’ increases, ‘value’ decreases. Specifically, the slope is -1.
  2. In the Simulink model, you connected ‘value_train’ to the ‘x’ input of the ‘IncrementalRegressionLinear Fit’ block and ‘time_train’ to the ‘y’ input. This means the model is learning to predict time based on value, not value based on time.
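
As a quick sanity check on the two points above, a batch least-squares fit outside Simulink can confirm the sign of the slope and the expected prediction before looking at the incremental blocks. This is only a sketch under assumptions: the original table is not shown, so the data here is a hypothetical stand-in generated as time = 501 - value, where the intercept 501 is inferred from the slope of -1 and the pair (140, 361) in the question. It uses the base MATLAB functions polyfit and polyval and does not reproduce the block behavior.

```matlab
% Hypothetical stand-in for the original table (the real data is not shown).
% Assumption: time = 501 - value, inferred from slope -1 and the pair (140, 361).
value = (1:500)';
time  = 501 - value;

% Batch least-squares line predicting time from value,
% i.e. the same direction as the wiring described in point 2.
coeffs = polyfit(value, time, 1);   % coeffs(1) = slope, coeffs(2) = intercept
timeAt140 = polyval(coeffs, 140);   % prediction for value = 140

fprintf("slope = %.4f, intercept = %.4f, prediction at 140 = %.4f\n", ...
    coeffs(1), coeffs(2), timeAt140);
```

If the real data gives a slope close to -1 and a prediction close to 361 here, the data itself is consistent, and the remaining deviation is more likely to come from how the incremental model is configured and trained than from the data.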
Hope this helps.
