Unit Root Tests

Test Simulated Data for a Unit Root

This example shows how to test univariate time series models for stationarity. It shows how to simulate data from four types of models: trend stationary, difference stationary, stationary (AR(1)), and a heteroscedastic, random walk model. It also shows that the tests yield expected results.

Simulate four time series.

T = 1e3;       % Sample size
t = (1:T)';    % Time multiple

rng(142857);   % For reproducibility

y1 = randn(T,1) + .2*t; % Trend stationary

Mdl2 = arima('D',1,'Constant',0.2,'Variance',1);
y2 = simulate(Mdl2,T,'Y0',0); % Difference stationary

Mdl3 = arima('AR',0.99,'Constant',0.2,'Variance',1);
y3 = simulate(Mdl3,T,'Y0',0); % AR(1)

Mdl4 = arima('D',1,'Constant',0.2,'Variance',1);
sigma = (sin(t/200) + 1.5)/2; % Std deviation
e = randn(T,1).*sigma;        % Innovations
y4 = filter(Mdl4,e,'Y0',0);   % Heteroscedastic

Plot the first 100 points in each series.

y = [y1 y2 y3 y4];
figure;
plot1 = plot(y(1:100,:));
plot1(1).LineWidth = 2;
plot1(3).LineStyle = ':';
plot1(3).LineWidth = 2;
plot1(4).LineStyle = ':';
plot1(4).LineWidth = 2;
title '{\bf First 100 Periods of Each Series}';
legend('Trend Stationary','Difference Stationary','AR(1)',...
   'Heteroscedastic','location','northwest');

All of the models appear nonstationary and behave similarly. Therefore, you might find it difficult to distinguish which series comes from which model simply by looking at their initial segments.

Plot the entire data set.

plot2 = plot(y);
plot2(1).LineWidth = 2;
plot2(3).LineStyle = ':';
plot2(3).LineWidth = 2;
plot2(4).LineStyle = ':';
plot2(4).LineWidth = 2;
title '{\bf Each Entire Series}';
legend('Trend Stationary','Difference Stationary','AR(1)',...
   'Heteroscedastic','location','northwest');

The differences between the series are clearer here:

  • The trend stationary series has little deviation from its mean trend.

  • The difference stationary and heteroscedastic series have persistent deviations away from the trend line.

  • The AR(1) series exhibits long-run stationary behavior; the others grow linearly.

  • The difference stationary and heteroscedastic series appear similar. However, the heteroscedastic series has much more local variability near period 300, and much less near period 900. The model variance is maximal when $\sin(t/200) = 1$, at time $100\pi \approx 314$. The model variance is minimal when $\sin(t/200) = -1$, at time $300\pi \approx 942$. Therefore, the visual variability matches the model.
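
The timing claim in the last bullet can be checked numerically. The following is a minimal sketch that uses only base MATLAB and the same definition of sigma as the simulation; the output variable names are illustrative. It locates the periods of largest and smallest innovation standard deviation on the integer time grid.

```matlab
t = (1:1000)';                    % Same time index as the simulation
sigma = (sin(t/200) + 1.5)/2;     % Innovation standard deviation used for y4

[sigmaMax,tMax] = max(sigma);     % Largest std deviation and its period
[sigmaMin,tMin] = min(sigma);     % Smallest std deviation and its period

fprintf('Max sigma %.3f at t = %d\n',sigmaMax,tMax);
fprintf('Min sigma %.3f at t = %d\n',sigmaMin,tMin);
```

The periods returned are 314 and 942, which are $100\pi$ and $300\pi$ rounded to the grid.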

Use the Augmented Dickey-Fuller test on the three growing series (y1, y2, and y4) to assess whether the series have a unit root. Since the series are growing, specify that there is a trend. In this case, the null hypothesis is $H_0: y_t = y_{t-1} + c + b_1\Delta y_{t-1} + b_2\Delta y_{t-2} + \varepsilon_t$ and the alternative hypothesis is $H_1: y_t = ay_{t-1} + c + \delta t + b_1\Delta y_{t-1} + b_2\Delta y_{t-2} + \varepsilon_t$ with $a < 1$. Set the number of lags to 2 for demonstration purposes.

hY1 = adftest(y1, 'model','ts', 'lags',2)
hY2 = adftest(y2, 'model','ts', 'lags',2)
hY4 = adftest(y4, 'model','ts', 'lags',2)
hY1 =

     1


hY2 =

     0


hY4 =

     0

  • hY1 = 1 indicates that there is sufficient evidence to suggest that y1 is trend stationary. This is the correct decision because y1 is trend stationary by construction.

  • hY2 = 0 indicates that there is not enough evidence to suggest that y2 is trend stationary. This is the correct decision since y2 is difference stationary by construction.

  • hY4 = 0 indicates that there is not enough evidence to suggest that y4 is trend stationary. This is the correct decision. However, the Dickey-Fuller test is not appropriate for a heteroscedastic series.
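
To make the alternative model concrete, the regression behind the test can be sketched by hand in base MATLAB. This is an illustration under stated assumptions, not a reproduction of adftest: the seed, the sample size, and the names X, b, and tStat are choices made here, and the sketch does not include the tabulated Dickey-Fuller critical values that adftest compares against. It fits the trend-stationary alternative by ordinary least squares and forms the t statistic for $a = 1$.

```matlab
rng(1);                           % Arbitrary seed (assumption)
T = 200;
t = (1:T)';
y = randn(T,1) + 0.2*t;           % Trend-stationary series, as for y1

p   = 2;                          % Number of lagged differences
dy  = diff(y);                    % dy(k) = y(k+1) - y(k)
idx = (p+2:T)';                   % Periods for which all lags exist

% Columns: constant, trend, y_{t-1}, Delta y_{t-1}, Delta y_{t-2}
X = [ones(numel(idx),1), idx, y(idx-1), dy(idx-2), dy(idx-3)];

b    = X\y(idx);                              % OLS coefficients
res  = y(idx) - X*b;                          % Residuals
s2   = (res'*res)/(numel(idx) - size(X,2));   % Residual variance
covB = s2*((X'*X)\eye(size(X,2)));            % Coefficient covariance

tStat = (b(3) - 1)/sqrt(covB(3,3));           % t statistic for a = 1
fprintf('a = %.3f, t statistic = %.2f\n',b(3),tStat);
```

For a series like this one, the statistic is strongly negative, which is the direction in which the test rejects a unit root. The statistic does not follow a standard t distribution under the null, which is why a dedicated critical-value table is needed.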

Use the Augmented Dickey-Fuller test on the AR(1) series (y3) to assess whether the series has a unit root. Since the series is not growing, specify that the series is autoregressive with a drift term. In this case, the null hypothesis is $H_0: y_t = y_{t-1} + b_1\Delta y_{t-1} + b_2\Delta y_{t-2} + \varepsilon_t$ and the alternative hypothesis is $H_1: y_t = ay_{t-1} + c + b_1\Delta y_{t-1} + b_2\Delta y_{t-2} + \varepsilon_t$ with $a < 1$. Set the number of lags to 2 for demonstration purposes.

hY3 = adftest(y3, 'model','ard', 'lags',2)
hY3 =

     1

hY3 = 1 indicates that there is enough evidence to suggest that y3 is a stationary, autoregressive process with a drift term. This is the correct decision because y3 is an autoregressive process with a drift term by construction.

Use the KPSS test to assess whether the series are unit root nonstationary. Specify that there is a trend in the growing series (y1, y2, and y4). The KPSS test assumes the following model:

$$ y_t = c_t + \delta t + u_t$$

$$c_t = c_{t-1} + \varepsilon_t,$$

where $u_t$ is a stationary process and $\varepsilon_t$ is an independent and identically distributed process with mean 0 and variance $\sigma^2$. Whether or not there is a trend in the model, the null hypothesis is $H_0: \sigma^2 = 0$ (the series is trend stationary) and the alternative hypothesis is $H_1: \sigma^2 > 0$ (not trend stationary). Set the number of lags to 2 for demonstration purposes.

hY1 = kpsstest(y1, 'lags',2, 'trend',true)
hY2 = kpsstest(y2, 'lags',2, 'trend',true)
hY3 = kpsstest(y3, 'lags',2)
hY4 = kpsstest(y4, 'lags',2, 'trend',true)
hY1 =

     0


hY2 =

     1


hY3 =

     1


hY4 =

     1

All tests result in the correct decision.
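
The KPSS statistic can also be sketched by hand to show how the model above turns into a number. The code below assumes the standard form of the statistic (partial sums of detrended residuals, scaled by a Newey-West long-run variance with Bartlett weights). Whether kpsstest matches this term for term is an assumption, and the seed and variable names are illustrative.

```matlab
rng(2);                           % Arbitrary seed (assumption)
T = 500;
t = (1:T)';
y = randn(T,1) + 0.2*t;           % Trend-stationary series

X = [ones(T,1) t];                % Constant and linear trend
e = y - X*(X\y);                  % Detrended residuals
S = cumsum(e);                    % Partial sums of the residuals

l  = 2;                           % Number of autocovariance lags
s2 = (e'*e)/T;                    % Lag-0 variance estimate
for j = 1:l
    w  = 1 - j/(l+1);                         % Bartlett weight
    s2 = s2 + 2*w*(e(1+j:T)'*e(1:T-j))/T;     % Add weighted autocovariance
end

stat = sum(S.^2)/(T^2*s2);        % KPSS statistic
fprintf('KPSS statistic = %.4f\n',stat);
```

Small values of the statistic are consistent with trend stationarity. A unit root makes the partial sums wander, which inflates the statistic.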

Use the variance ratio test on all four series to assess whether the series are random walks. The null hypothesis is $H_0$: $Var(\Delta y_t)$ is constant, and the alternative hypothesis is $H_1$: $Var(\Delta y_t)$ is not constant. Specify that the innovations are independent and identically distributed for all but y1. Test y4 both ways.

hY1 = vratiotest(y1)
hY2 = vratiotest(y2,'IID',true)
hY3 = vratiotest(y3,'IID',true)
hY4NotIID = vratiotest(y4)
hY4IID = vratiotest(y4, 'IID',true)
hY1 =

     1


hY2 =

     0


hY3 =

     0


hY4NotIID =

     0


hY4IID =

     0

All tests result in the correct decisions, except for hY4IID = 0. This test does not reject the hypothesis that the heteroscedastic process is an IID random walk. This inconsistency might be associated with the random seed.
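
The idea behind the test can be illustrated with a simplified variance ratio computed in base MATLAB. This sketch omits the bias corrections and the test statistic that vratiotest computes. The helper vr, the seed, and the series names are hypothetical. For a random walk, the variance of a 2-period difference is about twice the variance of a 1-period difference, so the ratio is near 1. For white noise around a trend, both differences have the same variance, so the ratio is near 0.5.

```matlab
rng(3);                               % Arbitrary seed (assumption)
T = 1000;
t = (1:T)';
yTS = randn(T,1) + 0.2*t;             % Trend stationary
yRW = cumsum(0.2 + randn(T,1));       % Random walk with drift

% Simplified q-period variance ratio (hypothetical helper)
vr = @(y,q) var(y(1+q:end) - y(1:end-q))/(q*var(diff(y)));

ratioTS = vr(yTS,2);
ratioRW = vr(yRW,2);
fprintf('Trend stationary: %.3f, random walk: %.3f\n',ratioTS,ratioRW);
```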

Alternatively, you can assess stationarity using pptest (the Phillips-Perron test).
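
As a sketch of that alternative, the following applies pptest to two freshly simulated series. The seed, sample size, and variable names are assumptions made for this illustration, and the code requires Econometrics Toolbox. As with adftest, the null hypothesis is a unit root, so a rejection favors trend stationarity.

```matlab
rng(4);                               % Arbitrary seed (assumption)
T = 500;
t = (1:T)';
yTS = randn(T,1) + 0.2*t;             % Trend stationary
yRW = cumsum(0.2 + randn(T,1));       % Random walk with drift

% Null: unit root. Alternative: trend stationary.
[hTS,pTS] = pptest(yTS,'model','ts','lags',2);
[hRW,pRW] = pptest(yRW,'model','ts','lags',2);

fprintf('Trend stationary: h = %d, p = %.4f\n',hTS,pTS);
fprintf('Random walk:      h = %d, p = %.4f\n',hRW,pRW);
```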

Test Time Series Data for a Unit Root

This example shows how to test a univariate time series for a unit root. It uses wages data (1900-1970) in the manufacturing sector. The series is in the Nelson-Plosser data set.

Load the Nelson-Plosser data. Extract the nominal wages data.

load Data_NelsonPlosser
wages = DataTable.WN;

Trim the NaN values from the series and the corresponding dates (this step is optional, since the test ignores NaN values).

wDates = dates(isfinite(wages));
wages = wages(isfinite(wages));

Plot the data to look for trends.

plot(wDates,wages)
title('Wages')

The plot suggests exponential growth.

Transform the data using the log function to linearize the series.

logWages = log(wages);
plot(wDates,logWages)
title('Log Wages')

The data appear to have a linear trend.

Test the null hypothesis that the series is trend stationary (no unit root) against the alternative that it is a unit root process with a trend (difference stationary). Set 'lags',[7:2:11], as suggested in Kwiatkowski et al., 1992.

[h,pValue] = kpsstest(logWages,'lags',[7:2:11])
Warning: Test statistic #1 below tabulated critical values:
maximum p-value = 0.100 reported. 
Warning: Test statistic #2 below tabulated critical values:
maximum p-value = 0.100 reported. 
Warning: Test statistic #3 below tabulated critical values:
maximum p-value = 0.100 reported. 

h =

     0     0     0


pValue =

    0.1000    0.1000    0.1000

kpsstest fails to reject the hypothesis that the wages series is trend stationary at each of the three lag lengths. A result of [1 1 1] would have been evidence of a unit root. Because failing to reject the null does not establish trend stationarity, it remains unclear whether the data have a unit root. This is a typical result of tests on many macroeconomic series.

The warnings that the test statistic "...is below tabulated critical values" do not indicate a problem. kpsstest has a limited set of calculated critical values. When it calculates a test statistic that is outside this range, the test reports the p-value at the appropriate endpoint. So, in this case, pValue reflects the closest tabulated value. When a test statistic lies inside the span of tabulated values, kpsstest linearly interpolates the p-value.
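
The endpoint and interpolation behavior can be sketched with a small lookup in base MATLAB. The table below holds the asymptotic critical values for the trend case as recalled from Kwiatkowski et al. [1]. Treat the numbers, the coarse four-point grid, and the helper lookupP as assumptions, since the table inside kpsstest may be finer.

```matlab
% Assumed critical values (trend case) and their significance levels
critVals = [0.119 0.146 0.176 0.216];
pVals    = [0.10  0.05  0.025 0.01];

% Clamp the statistic to the tabulated range, then interpolate linearly
lookupP = @(stat) interp1(critVals,pVals, ...
    min(max(stat,critVals(1)),critVals(end)));

pLow  = lookupP(0.05);   % Below the table: reports the maximum, 0.10
pMid  = lookupP(0.16);   % Inside the table: linear interpolation
pHigh = lookupP(0.50);   % Above the table: reports the minimum, 0.01

fprintf('%.4f  %.4f  %.4f\n',pLow,pMid,pHigh);
```

A statistic below the smallest tabulated value yields the maximum p-value of 0.10, which is the situation reported by the three warnings above.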

Test Stock Data for a Random Walk

This example shows how to assess whether a time series is a random walk. It uses market data for daily returns of stocks and cash (money market) from the period January 1, 2000 to November 7, 2005.

Load the data.

load CAPMuniverse

Extract two series to test. The first column of data is the daily return of a technology stock. The last (14th) column is the daily return for cash (the daily money market rate).

tech1 = Data(:,1);
money = Data(:,14);

The returns are the logs of the ratios of values at the end of a day over the values at the beginning of the day.

Convert the returns to log prices (values) by taking cumulative sums. vratiotest takes prices as inputs, as opposed to returns.

tech1 = cumsum(tech1);
money = cumsum(money);
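
The reason a cumulative sum recovers log values can be seen on a tiny example. The price vector here is hypothetical and is not taken from the CAPMuniverse data. Summing log returns telescopes, so the running total equals the log of each value relative to the starting value.

```matlab
prices = [100; 102; 99; 105];         % Hypothetical values
logRet = diff(log(prices));           % Log returns between observations
logRel = cumsum(logRet);              % Cumulative log returns

% Log of each value relative to the first value
direct = log(prices(2:end)/prices(1));
maxErr = max(abs(logRel - direct));   % Agreement up to rounding error

fprintf('Largest discrepancy: %.2e\n',maxErr);
```

This is why the plot titles below refer to the log of the relative value rather than to the value itself.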

Plot the data to see whether they appear to be stationary.

subplot(2,1,1)
plot(Dates,tech1);
title('Log(relative stock value)')
datetick('x')
hold on
subplot(2,1,2);
plot(Dates,money)
title('Log(accumulated cash)')
datetick('x')
hold off

Cash has a small variability, and appears to have long-term trends. The stock series has a good deal of variability, and no definite trend, though it appears to increase towards the end.

Test whether the stock series matches a random walk.

[h,pValue,stat,cValue,ratio] = vratiotest(tech1)
h =

     0


pValue =

    0.1646


stat =

   -1.3899


cValue =

    1.9600


ratio =

    0.9436

vratiotest does not reject the hypothesis that a random walk is a reasonable model for the stock series.

Test whether an i.i.d. random walk is a reasonable model for the stock series.

[h,pValue,stat,cValue,ratio] = vratiotest(tech1,'IID',true)
h =

     1


pValue =

    0.0304


stat =

   -2.1642


cValue =

    1.9600


ratio =

    0.9436

vratiotest rejects the hypothesis that an i.i.d. random walk is a reasonable model for the tech1 stock series at the 5% level. Thus, vratiotest indicates that the most appropriate model of the tech1 series is a heteroscedastic random walk.

Test whether the cash series matches a random walk.

[h,pValue,stat,cValue,ratio] = vratiotest(money)
h =

     1


pValue =

  4.6093e-145


stat =

   25.6466


cValue =

    1.9600


ratio =

    2.0006

vratiotest emphatically rejects the hypothesis that a random walk is a reasonable model for the cash series (pValue = 4.6093e-145). The removal of a trend from the series does not affect the resulting statistics.
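
One way to see why removing a linear trend leaves the results unchanged is the following sketch, which assumes that the statistics depend on the data only through sample variances of differences. For any constants $c$ and $\delta$ and any horizon $q$,

$$ (y_t - c - \delta t) - (y_{t-q} - c - \delta (t-q)) = (y_t - y_{t-q}) - q\delta, $$

so detrending shifts every $q$-period difference by the same constant $q\delta$. A constant shift does not change a sample variance, so both the numerator and the denominator of the variance ratio are unaffected.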

References

[1] Kwiatkowski, D., P. C. B. Phillips, P. Schmidt and Y. Shin. "Testing the Null Hypothesis of Stationarity against the Alternative of a Unit Root." Journal of Econometrics. Vol. 54, 1992, pp. 159–178.
