- copulacdf - https://www.mathworks.com/help/stats/copulacdf.html
- fminbnd - https://www.mathworks.com/help/matlab/ref/fminbnd.html
Apply copulas for estimating a single missing marginal, is it possible?
Consider this example from the MATLAB documentation (with small changes):
load stockreturns
x = stocks(:,1);
y = stocks(:,2);
z = stocks(:,3);
u = ksdensity(x,x,'function','cdf');
v = ksdensity(y,y,'function','cdf');
w = ksdensity(z,z,'function','cdf');
[Rho,nu] = copulafit('t',[u v w],'Method','ApproximateML')
Now, assume that Rho and nu are known. Let's consider (only for simplicity):
v(50)
And
y(50)
And assume that y has a missing observation:
v(50) = NaN;
y(50) = NaN;
How can I estimate the missing marginal v(50), and from it the missing observation y(50), knowing Rho, nu, x, y, z and u, v, w? In other words: how can I impute the value of a missing observation knowing the other marginals?
Thank you in advance for your help.
Answers (1)
Paras Gupta
on 17 Dec 2023
Hi Barbab,
I understand that you want to impute the value of a missing observation knowing other marginals.
To estimate the missing value, we can use the conditional distribution of the missing margin given the known margins under the fitted t-copula. The following code illustrates one way to do this.
load stockreturns
x = stocks(:,1);
y = stocks(:,2);
z = stocks(:,3);
u = ksdensity(x,x,'function','cdf');
v = ksdensity(y,y,'function','cdf');
w = ksdensity(z,z,'function','cdf');
[Rho,nu] = copulafit('t',[u v w],'Method','ApproximateML');
% Assuming Rho, nu, x, y, z, u, v, w are known and v(50) and y(50) are missing
% Set the missing values to NaN
v(50) = NaN;
y(50) = NaN;
% Find indices of the non-missing data
nonMissingIdx = ~isnan(y);
% Estimate the CDF values for the non-missing y data
v_nonMissing = ksdensity(y(nonMissingIdx), y(nonMissingIdx), 'function', 'cdf');
% Fit the t-copula to the non-missing data
[Rho_nonMissing, nu_nonMissing] = copulafit('t', [u(nonMissingIdx) v_nonMissing w(nonMissingIdx)], 'Method', 'ApproximateML');
% For the missing observation, use the known values of x and z
known_x = x(50);
known_z = z(50);
% Calculate the CDF values of the known x and z
u_known = ksdensity(x, known_x, 'function', 'cdf');
w_known = ksdensity(z, known_z, 'function', 'cdf');
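% Build the conditional CDF of v given the known u and w under the fitted t-copula.
% copulacdf returns the joint CDF, so the conditional t distribution is used instead.
tKnown = tinv([u_known; w_known], nu_nonMissing);   % known margins on the t scale
R11 = Rho_nonMissing([1 3], [1 3]);                 % correlation among the known margins
R21 = Rho_nonMissing(2, [1 3]);                     % correlation of y with the known margins
condMean = R21 * (R11 \ tKnown);
condScale = sqrt((nu_nonMissing + tKnown' * (R11 \ tKnown)) / (nu_nonMissing + 2) * (1 - R21 * (R11 \ R21')));
conditionalCdf = @(v) tcdf((tinv(v, nu_nonMissing) - condMean) / condScale, nu_nonMissing + 2);
% Find the quantile function (inverse CDF) for the non-missing y data
inv_v_nonMissing = @(p) ksdensity(y(nonMissingIdx), p, 'function', 'icdf');
% Use fminbnd to find the v value that makes the conditional CDF equal to 0.5
% This is a median estimate under the conditional distribution
v_estimate = fminbnd(@(v) abs(conditionalCdf(v) - 0.5), 0, 1);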
% Convert the v_estimate to the corresponding y value using the inverse CDF
y_estimate = inv_v_nonMissing(v_estimate);
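For reference, the conditional distribution of the missing margin under a t copula can be written in closed form. The notation here is an assumption introduced for illustration and does not appear in the thread: $T_\nu$ is the univariate Student t CDF with $\nu$ degrees of freedom, $k = \{1, 3\}$ indexes the known margins, $t_k = (T_\nu^{-1}(u), T_\nu^{-1}(w))^\top$, and $R$ is the copula correlation matrix (Rho).

```latex
\mu = R_{2k} R_{kk}^{-1} t_k,
\qquad
\sigma^2 = \frac{\nu + t_k^\top R_{kk}^{-1} t_k}{\nu + 2}
           \left( 1 - R_{2k} R_{kk}^{-1} R_{k2} \right)

P(V \le v \mid U = u,\, W = w)
  = T_{\nu + 2}\!\left( \frac{T_\nu^{-1}(v) - \mu}{\sigma} \right),
\qquad
v_{0.5} = T_\nu(\mu)
```

Because the conditional t distribution is symmetric about $\mu$, the conditional median $v_{0.5}$ does not depend on $\sigma$; the scale only matters for other quantiles or for sampling.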
Please note that this is a simplified approach and assumes that the median of the conditional distribution is a reasonable estimate for the missing value. In practice, you may want to use more sophisticated imputation methods, or account for the uncertainty in the estimate by drawing multiple samples from the conditional distribution.
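As a rough illustration of the sampling idea, the sketch below draws repeatedly from the conditional distribution of the missing margin. It is a minimal sketch under stated assumptions, not a definitive implementation: the values of Rho, nu, uKnown and wKnown are hypothetical placeholders (in the thread they would come from copulafit and ksdensity), and it relies on the standard result that a multivariate t vector conditioned on some of its components is again t-distributed, with nu plus the number of known components as degrees of freedom.

```matlab
% Sketch: sample the missing margin v given the known margins u and w
% under a t copula. All numeric inputs are hypothetical placeholders.
rng(0);                                    % reproducible draws
Rho = [1 0.5 0.3; 0.5 1 0.4; 0.3 0.4 1];   % assumed copula correlation matrix
nu = 5;                                    % assumed degrees of freedom
uKnown = 0.30;                             % assumed CDF value of the known x
wKnown = 0.65;                             % assumed CDF value of the known z

k = [1 3];                                 % positions of the known margins
tKnown = tinv([uKnown; wKnown], nu);       % known margins on the t scale
R11 = Rho(k, k);                           % correlation among known margins
R21 = Rho(2, k);                           % correlation of missing with known

% Location and scale of the conditional univariate t distribution
condMean  = R21 * (R11 \ tKnown);
condScale = sqrt((nu + tKnown' * (R11 \ tKnown)) / (nu + 2) ...
                 * (1 - R21 * (R11 \ R21')));

% Draw on the t scale, then map back to the copula scale with tcdf
nDraws = 1000;
tDraws = condMean + condScale * trnd(nu + 2, nDraws, 1);
vDraws = tcdf(tDraws, nu);

vMedian   = tcdf(condMean, nu);              % closed-form conditional median
vInterval = quantile(vDraws, [0.05 0.95]);   % 90% interval from the draws
```

Each entry of vDraws can then be mapped to the data scale with the same inverse-CDF call used in the answer, for example ksdensity(y(nonMissingIdx), vDraws, 'function', 'icdf'), which gives a set of plausible imputed values rather than a single point estimate.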
You can refer to the copulacdf and fminbnd documentation links for more information on the functions involved.
Hope this helps.