Why is my UQLab PCE output constant? (For pre-existing discrete data)

I am trying to create a PCE model from pre-existing discrete data. I have one input variable (Xtrain) and one output variable (Ytrain), both discrete. I created a kernel density estimate (KDE) of the input to make it continuous:
KDE_Marginals = uq_KernelMarginals(Xtrain);
InputOpts.Marginals = KDE_Marginals;
myInput = uq_createInput(InputOpts);
Then I passed the discrete data in through ExpDesign:
MetaOpts_OLS.ExpDesign.X = Xtrain;
MetaOpts_OLS.ExpDesign.Y = Ytrain;
Here are the parameters I specified for the PCE:
MetaOpts_OLS.Type = 'Metamodel';
MetaOpts_OLS.MetaType = 'PCE';
MetaOpts_OLS.Method = 'OLS';
MetaOpts_OLS.Degree = 1:30;
I then evaluated it on separate testing data that I had split from the original X:
YOLS = uq_evalModel(myPCE_OLS, Xtest);
For visualization, this is the constant output I'm seeing:
And comparing the standard deviation and mean of the PCE output against the actual data:
%PCE output:
disp(std(uq_evalModel(myPCE_OLS, Xtest))); %result STD: 63
disp(mean(uq_evalModel(myPCE_OLS, Xtest))); %result MEAN: 333
%Actual Data:
disp(std(Ytest)); %result STD: 312
disp(mean(Ytest)); %result MEAN: 331
Clearly, the PCE model is giving a nearly constant output from my data. My Xtrain ranges over (0, 30) and my Ytrain over (0, 1438), so the model shouldn't be constant. Any ideas on why I'm getting a constant PCE output? I've tried tweaking the Degree and have tried all the other coefficient calculation techniques. I've also checked that the KDE is a good estimate: its PDF matches the Xtrain histogram very well.
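One way to confirm that the model has collapsed to a constant is to inspect the fitted coefficients directly. A minimal sketch, assuming UQLab's standard PCE result fields (myPCE_OLS.PCE.Coefficients, with the constant term first):
% Diagnostic sketch: a PCE that predicts a constant puts essentially all of
% its weight on the constant (degree-0) term, so every other coefficient is ~0
coeffs = myPCE_OLS.PCE.Coefficients; % first entry is the constant term
fprintf('Constant term: %.4g\n', coeffs(1));
fprintf('Largest non-constant |coefficient|: %.4g\n', max(abs(coeffs(2:end))));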

Answers (2)

Aashray on 11 Jun 2025
Edited: Aashray on 11 Jun 2025
Hello @Avery,
It appears that the issue is due to the discrete nature of the input variable (Xtrain), which doesn’t align with PCE's assumption of continuous input distributions. This can result in the model underfitting and producing near-constant predictions.
To resolve this, you can use Kernel Density Estimation (KDE) to transform the discrete input into a continuous distribution.
Please refer to the example code below, which shows how to apply KDE, set up the PCE model correctly, and evaluate it.
uqlab;
%% Step 1: Generating example data
rng(0); % fixed seed so the example is reproducible
X = randi([0 30], 1000, 1); % Simulated discrete input
Y = 10 * X + randn(1000,1)*100; % Simulated output
% Splitting into training and testing sets
Xtrain = X(1:800);
Ytrain = Y(1:800);
Xtest = X(801:end);
Ytest = Y(801:end);
%% Step 2: Converting discrete input to continuous using KDE
KDE_Marginals = uq_KernelMarginals(Xtrain);
InputOpts.Marginals = KDE_Marginals;
myInput = uq_createInput(InputOpts);
%% Step 3: Defining PCE metamodel
MetaOpts.Type = 'Metamodel';
MetaOpts.MetaType = 'PCE';
MetaOpts.Method = 'OLS';
MetaOpts.Input = myInput;
MetaOpts.ExpDesign.Sampling = 'user';
MetaOpts.ExpDesign.X = Xtrain;
MetaOpts.ExpDesign.Y = Ytrain;
MetaOpts.Degree = 1:20;
MetaOpts.TruncOptions.qNorm = 0.75;
%% Step 4: Creating and evaluating the PCE model
myPCE = uq_createModel(MetaOpts);
Ypred = uq_evalModel(myPCE, Xtest);
%% Step 5: Comparing results
fprintf('PCE STD: %.2f | Ytest STD: %.2f\n', std(Ypred), std(Ytest));
fprintf('PCE Mean: %.2f | Ytest Mean: %.2f\n', mean(Ypred), mean(Ytest));
%% Step 6: Visualizing predictions
figure;
scatter(Xtest, Ytest, 'b.'); hold on;
scatter(Xtest, Ypred, 'r.');
legend('Actual', 'PCE Prediction');
xlabel('X'); ylabel('Y');
title('Discrete Input → KDE → PCE Model Performance');
The output predictions look like this:
Here, the PCE model is learning meaningful patterns in the data and providing predictions that reflect the true variability of the output.
You may refer to the UQLab user manuals for the functions used: https://www.uqlab.com/user-manuals
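To go beyond comparing moments, one can also quantify the fit with a normalized validation error; a minimal sketch in plain MATLAB:
% Relative validation error: 0 means a perfect fit, 1 means the metamodel
% does no better than predicting the sample mean of the test outputs
valError = mean((Ytest - Ypred).^2) / var(Ytest);
fprintf('Relative validation error: %.4f\n', valError);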

Avery on 11 Jun 2025
I have used a KDE estimate as my input object already, and it is a good estimate, as shown below compared to the histogram of X:
I have tried averaging the output, using mean(Y) so that there is one output value for each unique input. This has helped slightly, as shown by the PCE validation plots below.
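For reference, here is a sketch of how the per-input averaging can be done in plain MATLAB (assuming Xtrain and Ytrain are column vectors with repeated discrete X values):
% Collapse replicate observations to one averaged output per unique input
[Xunique, ~, idx] = unique(Xtrain);
Ymean = accumarray(idx, Ytrain, [], @mean); % mean of Ytrain for each unique Xtrain
% Xunique and Ymean then go into ExpDesign.X and ExpDesign.Y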
However, my LOO error is still very high. Here are some results I've recorded when forcing the degree range to:
MetaOpts.Degree = 10:30;
                      OLS     LARS      OMP       SP      BCS
                    _____  _______  _______  _______  _______
Degree                 12       10       10       30       12
CoefficientLength      13       11       11       31       13
LOO                 208.3  0.83201  0.83201  0.81914  0.98952
qNorm                   1     0.75     0.75     0.75     0.75
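A table like this can also be collected programmatically; a minimal sketch, assuming the myInput, Xtrain, and Ytrain variables from above and UQLab's standard result fields (Error.LOO, PCE.Coefficients):
% Fit one PCE per solver and report sparsity and LOO error
Opts.Type = 'Metamodel';
Opts.MetaType = 'PCE';
Opts.Input = myInput;
Opts.Degree = 10:30;
Opts.ExpDesign.Sampling = 'user';
Opts.ExpDesign.X = Xtrain;
Opts.ExpDesign.Y = Ytrain;
solvers = {'OLS', 'LARS', 'OMP', 'SP', 'BCS'};
for m = 1:numel(solvers)
    Opts.Method = solvers{m};
    myPCE = uq_createModel(Opts, '-private'); % '-private' keeps the session clean
    fprintf('%-4s  %3d coefficients, LOO = %.5g\n', ...
        solvers{m}, length(myPCE.PCE.Coefficients), myPCE.Error.LOO);
end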
I have 2 questions:
1) The validation graphs align better when I make the training set 20% and the testing set 80%. Should I be doing this instead, or should the training set always be larger than the testing set?
2) What parameters should I be changing or looking at to lower the LOO for each PCE method? Is there anything you can think of?
Here is the code I have for building one of my PCE models:
%Create Input Object (KDE)
KDE_Marginals = uq_KernelMarginals(Xtrain);
InputOpts.Marginals = KDE_Marginals;
myInput = uq_createInput(InputOpts);
% Define PCE models
% 1) OLS
MetaOpts_OLS.Type = 'Metamodel';
MetaOpts_OLS.MetaType = 'PCE';
MetaOpts_OLS.Method = 'OLS';
MetaOpts_OLS.Input = myInput;
MetaOpts_OLS.Degree = 10:30;
MetaOpts_OLS.ExpDesign.Sampling = 'user';
MetaOpts_OLS.ExpDesign.X = Xtrain;
MetaOpts_OLS.ExpDesign.Y = Ytrain;
myPCE_OLS = uq_createModel(MetaOpts_OLS);
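After building the model, UQLab's reporting utilities are useful for checking what was actually fit, e.g.:
% Print a summary of the fitted PCE (selected degree, LOO error, basis size)
uq_print(myPCE_OLS);
% Plot the coefficient spectrum and fit diagnostics
uq_display(myPCE_OLS);
% The relative LOO error is also available programmatically:
disp(myPCE_OLS.Error.LOO);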
