Can Ridge Regression solve my problem?

2 views (last 30 days)
Dear All,
I plan to buy the Statistics and Machine Learning Toolbox to apply Ridge Regression to my problem, but I do not know whether Ridge Regression can solve it.
My problem: x + a*y1 + b*y2 = 2, where a = -b = 10000. The noisy observations are y1 = 1.005 (true value 1.0001) and y2 = 0.998 (true value 0.99999). Solving with these observations gives x = -68.0, but the true value of x is 0.9.
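(For concreteness, a minimal MATLAB sketch of the computation as I understand it; the back-substitution step is my reading of the setup.)
% Back-substitute the noisy observations into x + a*y1 + b*y2 = 2
a = 10000; b = -10000;
y1_obs = 1.005;               % true value 1.0001
y2_obs = 0.998;               % true value 0.99999
x = 2 - a*y1_obs - b*y2_obs   % gives -68, far from the true value 0.9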
I am wondering whether Ridge Regression can solve my problem. If it can, I will buy the toolbox.
Thanks.
Benson
2 Comments
John D'Errico on 18 May 2021
I don't think this is a valid reason to buy a toolbox, because you don't understand what ridge regression is or what it does. At the same time, it is very unclear what you are really doing. Yes, you do seem to have an ill-posed problem, but that does not mean ridge regression will solve your problem. I think you need to sit down with someone who understands numerical methods, and discuss your problem in depth.
Benson Gou on 18 May 2021
Hi, John,
Thanks for your reply. I did discuss this problem with a statistics professor from UCLA, but it seems I still do not have a solution to my problem. I have read a number of papers about ridge regression, so I think I am not a complete layman.
I think we may be able to treat my problem as an extended problem with X = [x; y1; y2], system matrix A = [1 a b; 0 1 0; 0 0 1], and b = [2; 1.001; 0.998]. The linear equation becomes A*X = b.
My problem becomes:
min ||A*X - b||_2
s.t. A*X = b
I want to find a small adjustment of y1 and y2 so that x can be closer to 0.9, the true value.
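(As a quick numeric illustration of this extended system; the backslash solve and the condition-number check are my additions, and I rename the coefficient to b_coef so it does not clash with the right-hand-side vector b.)
% Extended system A*X = b as described above
a = 10000; b_coef = -10000;
A   = [1 a b_coef; 0 1 0; 0 0 1];
rhs = [2; 1.001; 0.998];
X   = A \ rhs   % X(1) comes out far from 0.9 because of the noise in rhs
cond(A)         % about 2e8, so the system is badly conditioned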
I do not know if I have made my problem clear. Thanks a lot.
Benson


Accepted Answer

John D'Errico on 19 May 2021
Edited: John D'Errico on 19 May 2021
Don't buy a toolbox if your only reason is the hope of using it to solve a problem where you don't fully understand how or why you would use that toolbox.
Would I buy the stats toolbox? OF COURSE! In a heartbeat. It is one of the toolboxes I do use frequently, and I would not be without it.
And I'm sorry, but I think you don't really understand why ridge regression exists, or what it is for, if you hope to use it to solve this problem. I think you are trying to squeeze your problem into a ridge regression context, hoping it will work for you.
Does that mean you cannot solve this problem using simple methods? NO. The problem actually becomes simple. The issue is to formulate it in a way that has a valid mathematical context.
You have two parameters y1 and y2, with measured values of [1.005 and 0.998]. a and b are fixed constants, with a=-b=10000.
Now you have the relationship
x + a*y1 + b*y2 = 2
Do you really KNOW that x == 0.9? If so, then the problem becomes a simple one. Find the minimal perturbations to y1 and y2, such that the expression
x + a*y1 + b*y2 = 2
holds true, where x = 0.9. That is, solve for the vector dy, such that the expression
x + a*(y(1) + dy(1)) + b*(y(2) + dy(2)) == 2
and norm(dy) is a minimum, with x == 0.9. How would I solve that problem? I'd use Lagrange multipliers. Time to write MATLAB code...
x = 0.9;
a = 10000;
b = -a;
y_obs = [1.005 0.998];
dy = sym('dy',[1 2]);
assume(dy,'real')
syms lambda real % lagrange multiplier. They are like bunny rabbits. They just multiply all the time.
The objective function is simply written using a Lagrange multiplier. Note that I can formulate the problem in terms of norm or the square of the norm, and both will achieve the same final solution, but using the square of the vector norm makes the mathematics simpler.
obj = sum(dy.^2) + lambda*(x + a*(y_obs(1) + dy(1)) + b*(y_obs(2) + dy(2)) - 2)
obj =
dy1^2 + dy2^2 + lambda*(10000*dy1 - 10000*dy2 + 689/10)
So we intend to solve the problem of minimizing the sum of squares of the perturbations, subject to the linear equality constraint as given. A classic problem for Lagrange multipliers. We differentiate the objective with respect to all three parameters, then solve for where the gradient is zero.
gradient(obj,[dy,lambda])
ans =
 2*dy1 + 10000*lambda
 2*dy2 - 10000*lambda
 10000*dy1 - 10000*dy2 + 689/10
sol = solve(gradient(obj,[dy,lambda]))
sol = struct with fields:
       dy1: [1×1 sym]
       dy2: [1×1 sym]
    lambda: [1×1 sym]
sol.dy1
ans =
-689/200000
sol.dy2
ans =
689/200000
We really don't care what lambda was here to achieve that goal, but if you want to know...
sol.lambda
ans =
689/1000000000
So the minimal perturbation of the vector y_obs, such that x is exactly 0.9, will be:
format long g
y_obs + double([sol.dy1,sol.dy2])
ans = 1×2
1.001555 1.001445
At the same time, I'm not sure that you really know the value of x. But your question was not that clear. Why have you formulated it as you did? It looks like you formulated it that way to shoehorn it into a ridge regression context.
Really, I never even needed the Symbolic Toolbox to solve this problem, since solve was applied to a purely linear system. Pencil and paper would have done as easily, but using solve here made things simple and clean.
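(For what it's worth, here is a purely numeric sketch of that pencil-and-paper route, using the standard minimum-norm correction formula for a single linear equality constraint; the closed form below is an addition, not part of the symbolic derivation above.)
% For the constraint x + v'*(y_obs + dy) - 2 = 0 with v = [a; b], the
% minimum-norm correction is dy = -c*v/(v'*v), where c is the constraint
% violation at dy = 0.
x = 0.9;
v = [10000; -10000];       % [a; b]
y_obs = [1.005; 0.998];
c = x + v.'*y_obs - 2;     % 68.9
dy = -c*v/(v.'*v);         % minimal-norm perturbation of y_obs
y_obs + dy                 % [1.001555; 1.001445], matching the result above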
Now, do you not know x? Is your real problem where you just want to make x small? The issue there is how small is small? Until you clearly define what needs to be done, what you know and what you do not know, we cannot write mathematics to solve an unknown problem. As you can see, I've shown you how to solve the problem in a way that I can achieve any value of x that I desire, as a minimal perturbation to the vector y_obs.
I suppose we could also have solved the problem to find a minimal perturbation of all three parameters, x, y1, and y2, such that the linear equality holds true. But while you have told me that y1 and y2 were observations, you did NOT claim that x is an observation. So it makes no sense in my eyes to perturb x, certainly not on the same scale as y1 and y2. And that means that ridge regression has no value here.
1 Comment
Benson Gou on 21 May 2021
Dear John,
Many thanks for your great help and nice try. Your method makes a lot of sense if we know x = 0.9. But we do not know that x = 0.9; we only know x is close to 1.0. After I read your comments two days ago, I think I found an approach to solve my problem.
My goal is to force x close to 1.0 rather than -68.0. We can actually use least squares to solve it:
min ||x - 1.0||_2 + ||A*X - b||_2, which becomes min ||A1*X1 - b1||_2 by stacking the two terms into one augmented system.
I used lsqr to solve it and got the values of x, y1 and y2 as follows:
x = 1.0; y1 = 0.9996; y2 = 0.9994.
The residual is: 0.0; 0.0; -0.0014; 0.0014.
This is exactly what I wanted, i.e., sacrificing the accuracy of y1 and y2 to obtain a value of x that is closer to its true value.
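Here is a minimal sketch of the augmented system I mean (the exact rows and the lsqr tolerances are illustrative assumptions):
% Append one extra row encoding the prior that x is close to 1.0
a = 10000; b_coef = -10000;
A   = [1 a b_coef; 0 1 0; 0 0 1];
rhs = [2; 1.005; 0.998];         % noisy observations of y1 and y2
A1  = [A; 1 0 0];                % extra row: x ≈ 1.0
b1  = [rhs; 1.0];
X   = lsqr(A1, b1, 1e-12, 100)   % or X = A1 \ b1; X(1) is pulled toward 1.0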
Thanks a lot again.
Benson


More Answers (0)
