Standardisation (zero-mean, unit-variance)

Hi there, I am working with the Neural Network Toolbox in MATLAB. My problem is not with the toolbox but with the data preparation before it is used, and with my resulting output. I standardise the input and target output on which I train the network, but this means that my estimated output, when I simulate the network on a new set of values, also has zero mean and unit variance. I do not want this: while the estimated output correctly follows the trend and amplitude changes of my target output, I want the physical values, not these standardised values. Is there a way to 'de-standardise' my output, i.e. add on the mean and multiply by the standard deviation? The mean of my time series is not constant, so I am sure this is not as straightforward as I have made it sound.
Thank you for any help
Sarah

Accepted Answer

Greg Heath
Greg Heath il 30 Giu 2012
> I standardise the input and target output on which I train the network but this means that my estimated output, when I simulate the network on a new set of values, also has zero mean and unit variance.
In the best of all worlds!
Then you can use the mean and variance of the original output data to convert the new output to the correct location and scale.
A basic assumption of NN regression and classification models is that both design (train + val) and nondesign (test) data can be assumed to come from the same probability distribution.
If you expect that assumption might be invalid for the new input data, you can always compare the summary statistics (e.g., mean, variance, correlations, ...) of the new input data with those of the original input (design + test) data.
You can also compare the outputs when the new input data is normalized with its own mean and variance vs. using the mean and variance of the original data.
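For example (a sketch, assuming the original settings were saved in ps when mapstd was first applied to the design inputs; variable names are illustrative):

```matlab
[xn, ps]  = mapstd(x);                 % standardize the original design inputs; ps saves mean/std
xnew_orig = mapstd('apply', xnew, ps); % new data scaled with the ORIGINAL mean and std
xnew_own  = mapstd(xnew);              % new data scaled with its OWN mean and std
% Simulate the net with each version and compare the two outputs.
```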
Are you using newfit(~newff) or fitnet(~feedforwardnet)?
Are your original standardizations done with mapstd before creating the net, or are you using net.inputs{i}.processFcns, (i = 1,2)?
> I do not want this because while the estimated output correctly follows the trend and amplitude changes of my target output I want the physical values not these standardised values. Is there a way to 'de-standardise' my output i.e. add on the mean and multiply by the standard deviation? The mean of my time series is not constant so I am sure this is not as straightforward as I have said.
This is readily done using the 'reverse' option of mapstd. I don't remember whether it is done automatically or you have to do it explicitly. See the documentation and examples for mapstd (or mapminmax, ... the same principles apply).
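A minimal sketch of that round trip, assuming the targets t were standardized with mapstd before training (variable names here are illustrative):

```matlab
[tn, ts] = mapstd(t);             % standardize targets; ts stores each row's mean and std
% ... create and train the network on the standardized targets tn ...
yn = sim(net, xn);                % simulated output, still in standardized units
y  = mapstd('reverse', yn, ts);  % back to physical units: y = yn.*std + mean
```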
Hope this helps.
Greg

13 Comments

What are the sizes of your data matrices and number of hidden nodes?
How different are MSEtrn, MSEval and MSEtst?
Correction:
If you expect that assumption might be invalid for the new input data, you can always compare the summary statistics (e.g., mean, variance, correlations, ...) of the new input data with those of the original input (train + val) data.
Hi Greg, Thank you for your response. I know this may be a silly question to ask, but what is the purpose of standardizing the data before training? When I work with non-zero-mean, non-unit-variance data, the network appears to work just as well with regard to fit etc. All of my inputs and targets have non-zero mean, so would you be able to explain the reason behind this standardization in the first place? Thanks
The recent versions of newff, feedforwardnet, etc. automatically use mapminmax to scale to [-1,1] for training. However, the scaled data is not available to the user and may not even be the best scaling to use.
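You can inspect, and if desired override, this automatic processing on the network object (a sketch; fitnet(20) is used for illustration):

```matlab
net = fitnet(20);                  % two-layer fitting net with 20 hidden nodes
net.inputs{1}.processFcns          % typically {'removeconstantrows','mapminmax'}
net.outputs{2}.processFcns         % same idea on the output side
% To turn off the automatic input scaling:
net.inputs{1}.processFcns = {'removeconstantrows'};
```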
Before I even think about creating and training the neural model, I always calculate summary statistics of input and output before and after standardization. I also obtain plots and the results of constant and linear models for reference.
This gives me a better "feel" for the data, especially the plots and the detection of outliers.
As a result, the data gets scaled twice: once by me and once by the training function. I could turn the automatic scaling off, but since the outliers were previously deleted, that won't improve performance.
Hope this helps.
Greg
Hi Greg, I can see the need for the standardization if I want to look at the individual rankings of the inputs. But if I want to give my network a matrix of 5 inputs, would standardizing each input not destroy any amplitude relationships between inputs? Would this matter in the training of the network? Would it be possible to remove the mean and standard deviation of the entire input matrix instead, or would this even be necessary? I am looking at a 5x2688 input matrix and a 1x2688 output with 20 hidden nodes, and I am using fitnet. Sarah
I did not use mapstd to standardize the input data matrix. I standardized each input vector individually by removing the mean and dividing by the standard deviation of the input. Is this incorrect? MSEtrn, MSEval and MSEtst are all very similar to one another (from 0.266 to 0.308).
There is no connection between standardization and input ranking.
You lose no information via standardization because you learn, via training, the weights that will optimize performance.
You can lose quite a bit of information by not transforming inputs. If inputs are too large they can saturate the sigmoids, therefore requiring more sigmoids for an adequate approximation and causing increased training time. This is especially deleterious if the inputs are too asymmetric about zero.
You can also significantly increase training time if inputs are very small. This may cause the need for huge weights and result in loss of accuracy.
Each row is transformed separately because each variable may be on a different scale (e.g., millimeters and kilometers). Since the inputs to the hidden-node activation functions are linear combinations of inputs, it makes sense for the input scales to be similar.
It is a waste of time to transform each vector separately.
[~, N] = size(x);                 % x is the 5-by-2688 input matrix
meanx = repmat(mean(x,2), 1, N);  % row means, replicated across columns
sigx  = repmat(std(x,0,2), 1, N); % row standard deviations
xstd  = (x - meanx)./sigx;        % each row now has zero mean, unit variance
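The same row-wise standardization can be done in one call with mapstd, which also saves the settings needed to reverse (or re-apply) it later:

```matlab
[xstd, ps] = mapstd(x);  % zero mean, unit variance per row; ps stores each row's mean/std
```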
Thank you for your reply Greg, that really helps. I am now using mapstd for standardization before I train the network. When I 'apply' mapstd to a new data set, the result does not have zero mean and unit variance. My understanding was that when you applied mapstd to a new set of data it would transform the data to be normalised? I have looked at the mean and variance of my training set and the mean and variance of my test set, and they are much higher for my test data. I am looking at ocean wave heights, and obviously these will change drastically throughout the year, so would I have to take a yearly mean and variance to train the network before I can then apply mapstd to a new test data set? Thanks for your help
A basic assumption is that the important summary statistics of the design data adequately characterizes the important summary statistics of the nondesign data.
The amount they differ will affect the accuracy, precision and confidence of your result.
Hi Greg, Sorry to ask again about this subject, but there is something I am rather unsure about. If the network has an inbuilt mapstd function on the input and output layers, why does it matter if I first standardize the data myself? Surely if the network is standardizing the data anyway, the performance results should be the same, which is not the case for my data. I find that if I set the processing function to mapstd with pre-standardized data, my resulting MSE is three times smaller than if I train the network using non-standardized data. Thank you for your help Sarah
Not a fair comparison (assuming you use the same RNG seed).
To compare, use a NORMALIZED MSE by dividing by the MSE of a constant model: NMSE = MSE/MSE00, where MSE00 = mean(var(t,1,2)).
For details search the Newsgroup and Answers with
heath MSE00
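As a sketch, with t the 1-by-N target row and y the corresponding network output:

```matlab
MSE   = mean((t - y).^2);    % mean squared error of the network
MSE00 = mean(var(t, 1, 2));  % MSE of the constant model that always predicts mean(t)
NMSE  = MSE / MSE00;         % values well below 1 indicate a useful fit
```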
Hope this helps.
Greg
Thank you for that answer. Re:
> A basic assumption is that the important summary statistics of the design data adequately characterizes the important summary statistics of the nondesign data. The amount they differ will affect the accuracy, precision and confidence of your result.
For my network I look at both winter and summer time data. If I train my network on a full year, meaning that my training summary statistics are for a whole year (an average of the seasons), is it therefore correct to test the network independently on just winter time data or summer time data? Spring and autumn statistics are quite similar to one another. Or would it be more correct to train the network for the different seasons?
Thank you for all of your help
There is no absolute correct answer. Since seasonal means, spreads and correlations are never exactly equal, you have to determine how much difference will end in a result that you can live with.


More Answers (0)
