Bug on polyfit output?

Question

0 votes

Hi, I am wondering why the results change when I call polyfit with the tilde ('~') to suppress the remaining outputs:

>> p = polyfit([1 2 3 5 10], [5 65 84 2 3],1)
p =
     -4.7402   51.7087

BUT

>> [p,~,~] = polyfit([1 2 3 5 10], [5 65 84 2 3],1)
p =
    -16.8925   31.8000

I thought p should contain the same coefficients in both cases. Does anybody know why there is a difference? The ~ method works fine with other functions, such as size:

>> [p,~] = size([11 11; 11 11;11 11])
p =
       3

I use R2016a. Looking forward to your answers. Kind regards!


Answer 1

dpb on 8 Jun 2018

Edited: dpb on 8 Jun 2018

2 votes

"Feature" or "Quality of Implementation" depending on your viewpoint...

help polyfit
  ...
  [p,S,mu] = polyfit(x,y,n) also returns mu, which is a two-element vector with centering
  and scaling values. 
  mu(1) is mean(x), and mu(2) is std(x). Using these values, polyfit centers x at zero
  and scales it to have unit standard deviation 
 ...

I'd never tried it before with the tilde as the third output argument so wasn't aware it (the tilde, that is) was being counted as if the argument were there, but clearly it is.

>> x=[1 2 3 5 10];
>> [mean(x) std(x)]
ans =
  4.2000    3.5637
>> [p,~,mu] = polyfit([1 2 3 5 10], [5 65 84 2 3],1)
p =
-16.8925   31.8000
mu =
  4.2000
  3.5637
>>
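For the straight-line case in the question, the relationship between the two coefficient sets can be sketched as below. This is only an illustration under stated assumptions: the variable names pRaw, pScaled and pBack are made up here, and the conversion formula applies to degree 1 only. Because the three-output fit is expressed in terms of (x - mu(1))/mu(2), dividing the slope by mu(2) and shifting the intercept should recover the one-output coefficients.

```matlab
x = [1 2 3 5 10];
y = [5 65 84 2 3];

% One output: fit y against x directly.
pRaw = polyfit(x, y, 1);

% Three outputs: fit y against xhat = (x - mu(1))/mu(2).
[pScaled, ~, mu] = polyfit(x, y, 1);

% Convert the scaled degree-1 coefficients back to the unscaled form:
%   y = pScaled(1)*(x - mu(1))/mu(2) + pScaled(2)
pBack = [pScaled(1)/mu(2), pScaled(2) - pScaled(1)*mu(1)/mu(2)];

disp(pRaw)    % slope and intercept in terms of x
disp(pBack)   % agrees with pRaw up to rounding error
```

Note that for a degree-1 fit the scaled intercept pScaled(2) is simply mean(y), which is why 31.8000 shows up in the output above.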

This behavior comes from the ancient history of how polyfit was initially implemented. Truthfully, having the number of output arguments determine whether the independent variable is scaled was a less-than-optimal design and almost certainly wouldn't make the cut under today's ideas of software and interface design. But 30 years ago or so, when it was first implemented, ideas were far different from what they are today.

ADDENDUM: However, what's the purpose of using the tilde as a placeholder for trailing return values that you don't want, anyway? Any outputs beyond those you request are automagically dropped; the only need for the tilde is to skip one (or more) outputs that are positioned before one that you do want.
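A small sketch of that distinction, using max purely as an illustrative stand-in (the vector v and the variable names are assumptions, not from the question):

```matlab
v = [4 9 2 7];

% Trailing outputs you don't want can simply be left off:
m1 = max(v);          % the index output is dropped automatically
[m2, ~] = max(v);     % same value; the trailing tilde adds nothing here

% The tilde is needed only to skip an output that comes BEFORE the one you want:
[~, idx] = max(v);    % idx is the position of the largest element

disp([m1 m2 idx])
```

For most functions the first two calls are interchangeable; the exceptions are functions whose results depend on nargout.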

Of course, polyfit is a case where, because of the unusual interface design, the result depends on the number of outputs: if you want the scaling, you have to provide the third output argument.

I've found it somewhat surprising that TMW hasn't introduced a more capable and modern version into the base product rather than restricting it to the toolboxes (which I find somewhat cumbersome, albeit more flexible).

19 Comments

John D'Errico on 9 Jun 2018

Edited: John D'Errico on 9 Jun 2018

But I would argue it very much changes the way the code behaves on an existing case. And this is something TMW strives mightily to avoid.

Currently, polyfit has one behavior for the single return argument call, and another for the triple return. If you choose not to use the second and third returned arguments, that is not a problem; it is just your call, and your lack of need for those other arguments. Maybe you already know those arguments, so have no need to recompute them. And there are surely people out there who use it like that. The existing (and so the expected) behavior is to return the three-argument form whenever three arguments are requested, and it has done so for many years now.

But it seems it would be strongly inconsistent if polyfit decided to use the one argument form if it somehow knew that even though you called it with the three arg form, you were not going to use two of those arguments. To me, this is just pleading for people to send in bug reports. It will be difficult to explain in the documentation why one form is used over another. Highly confusing to users ... "yes, we return one form or another, depending on many variables, along with a call to rand on alternate Tuesdays." Well, yes, that is a bit strong. :)

Remember that, if you change the behavior of supplied code between releases, there will still be many people using an older release. So even if someone is not now using polyfit in any specific way in an older release, there may well be someone who will make that choice in the future.

In the end, you can feel free to want TMW to change polyfit as you wish. But I would predict that will never happen, not without an edict from way above, and a very good rationale for that action.

dpb on 9 Jun 2018

Edited: dpb on 9 Jun 2018

You misunderstand what I'm suggesting, John.

I'm saying the enhancement should be to add a feature that allows the coder to determine if the positional argument is a tilde or not and determine what to do on that basis, NOT to arbitrarily ignore it as if it weren't there.

The implementation (*) would NOT change the behavior of nargout so that it leaves tildes out of its count; it would still behave just as it does now.

While I think (and have always thought, from the first time I discovered it lo! those many years ago) that the polyfit interface is a very poor way to indicate that the independent variable should be standardized, as noted I'd never suggest breaking it or introducing other compatibility issues.

With the new feature in place, there would be no change to polyfit at all. It is the aberrant case in which the third positional output argument changes the definition of the first, so nothing can be gained computationally; it must compute the statistics from which to standardize. Other functions that have auxiliary secondary outputs that aren't needed and may be expensive to compute could make use of the new feature if desired, but that would have no bearing on code written before the feature was introduced.

I'd love to see TMW introduce a better yet simple tool similar to poly[fit|val] into the base product, but that doesn't seem likely to happen. Of course, this suggestion is unlikely in the extreme, too, probably... :) There are any number of other warts I'd put higher up on the wish list, but this does seem a potentially useful facility.

(*) One possible (not necessarily the best and certainly not the only) implementation would be to add a second optional input argument to nargout. (The present single argument, which I imagine is pretty rarely used, is a function handle or character vector used to query the number of outputs declared in a function's definition.) The new argument would be a flag requesting a second output: a logical array, by position, indicating whether each output argument was specified or given as a tilde. The first output would still be the count, and the second would be an array with that many elements.

Called with zero arguments, or with one argument that is a function handle or character vector, the result would be the same as at present; called with one argument that is not a function handle or function name, or with two arguments, the other information would be returned as well.

dpb on 9 Jun 2018

I don't have any to add off the top of my head; there are probably some others.

That the list is short is indicative it's not a mainstream coding paradigm (fortunately :) ).

SIZE() of course is at least discernible as to what one gets and why; none of those should be a major surprise.

The ODE solvers share part of the same heritage: their implementation came "way back when", before any such idea as OOP had been implemented, and creating a struct for the deval function to "package" the solution was about the only implementation choice at the time it was introduced. (When precisely struct came into being I don't know; it was somewhere between 1992 (the printed doc doesn't include a release number) and Release 5 (~1999). In R5, ODEmn doesn't yet have the SOL output even though struct does exist, so it's of some vintage but not totally ancient; at the time, though, it was the only real option.)

I'd assert that polyfit, which goes even farther back, is just a poor choice of interface design because, as John notes, it's a model difference based on an output argument specification rather than on an input. That is unlike a mere rearrangement or suppression of the same data, as in SIZE(), or the presentation of results for ODEmn, which is a fundamentally different animal and could have been done differently with existing facilities.

But again, while it was an unfortunate choice, history is history and it shouldn't be broken. (I think it should be deprecated and replaced, myself, but again it's not one of those items that's significant enough to be worth it in the larger scheme of what should be worked on.)

$0.02, imo, ymmv, etc., etc., etc., ..., of course :)

DrZoidberg on 10 Jun 2018

Thanks @dpb and the others for your detailed explanations. I was not aware this 'special feature' is a result of MATLAB programming history from a few decades ago.

Just some comments:

- I used the tilde for function calls that have more than one output, just as a reminder that the function has more outputs than the ones requested. Maybe this is just a (bad) habit. In my opinion it somewhat improves the readability of the code: when someone else (usually an occasional, non-expert MATLAB user just like me) works with my code, the tilde shows them that the function offers more outputs than the ones requested.

- I was aware that there are functions whose outputs depend on how they are called, like size or the others mentioned in the comments above. However, until now I had not encountered a function that, like polyfit, changes the output values depending on nargout. I expected the regression coefficients in p to always be the same. My mistake was to compare p from the plain polyfit call

p1 = polyfit(x,y,1)

with p from the tilde polyfit call (in this case p refers to the centered and scaled x data):

[p2,~,~] = polyfit(x,y,1)

When you call polyval with p2 it is necessary to feed it with these mu-scaled x data:

% x,y just some fantasy data
[~,~,mu]=polyfit(x,y,1);
plot(x,y,'ok',x,polyval(p1,x),'-r',x,polyval(p2,(x-mu(1))/mu(2)),'--b')

I hope I summed up correctly and this helps someone else someday. Have a nice day!
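A self-contained version of that comparison, using the data from the question, might look like the sketch below. The names yA, yB and yC are assumptions introduced here. It also uses the four-argument form of polyval, which takes mu and applies the centering and scaling itself, so x does not have to be transformed by hand.

```matlab
x = [1 2 3 5 10];
y = [5 65 84 2 3];

p1 = polyfit(x, y, 1);             % coefficients in terms of x
[p2, ~, mu] = polyfit(x, y, 1);    % coefficients in terms of (x - mu(1))/mu(2)

yA = polyval(p1, x);                      % unscaled fit
yB = polyval(p2, (x - mu(1))/mu(2));      % scaling x by hand
yC = polyval(p2, x, [], mu);              % letting polyval apply mu

% All three evaluations describe the same line.
disp(max(abs(yA - yB)))
disp(max(abs(yB - yC)))
```

Both displayed differences should be at the level of rounding error.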

Jan on 10 Jun 2018

The ~ was introduced in R2009b.

Guillaume on 10 Jun 2018

"I used the tilde for function calls that have more than one output, just to remember it has more outputs than the requested ones"

For some functions this is not a good idea and will slow down your code. If you don't request the output at all, some functions will not go through the process of calculating the extra outputs. By using ~ you force the function to calculate the output, which you then discard.
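As a rough sketch of how a function can do this, the hypothetical helper below (meanAndSpread is a made-up name, not a MATLAB function) checks nargout and computes its second output only when a second output position is present in the call. Since the tilde counts toward nargout, the call with [m2, ~] still triggers the extra work.

```matlab
% Script with a local function. Local functions in scripts need R2016b or
% later; in older releases, save meanAndSpread in its own file instead.
v = [2 4 4 4 5 5 7 9];

m1 = meanAndSpread(v);        % nargout is 1: the spread is skipped
[m2, ~] = meanAndSpread(v);   % nargout is 2: the spread is computed, then discarded

function [m, s] = meanAndSpread(v)
% meanAndSpread  Hypothetical helper: mean, plus std only when requested.
m = mean(v);
if nargout > 1
    s = std(v);   % stands in for an expensive secondary result
    fprintf('nargout = %d: spread computed\n', nargout);
else
    fprintf('nargout = %d: spread skipped\n', nargout);
end
end
```

Both calls return the same mean; only the amount of work done differs.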
