forecast

Class: arima

Forecast ARIMA or ARIMAX process

Syntax

[Y,YMSE] = forecast(Mdl,numPeriods)
[Y,YMSE,V] = forecast(Mdl,numPeriods)
[Y,YMSE,V] = forecast(Mdl,numPeriods,Name,Value)

Description

[Y,YMSE] = forecast(Mdl,numPeriods) forecasts responses for a univariate ARIMA model, and generates corresponding mean square errors, YMSE.

[Y,YMSE,V] = forecast(Mdl,numPeriods) additionally forecasts conditional variances for an ARIMA model with a conditional variance model.

[Y,YMSE,V] = forecast(Mdl,numPeriods,Name,Value) generates the forecasts with additional options specified by one or more Name,Value pair arguments.

Input Arguments

Mdl — ARIMA or ARIMAX model (arima model)

ARIMA or ARIMAX model, specified as an arima model returned by arima or estimate.

The properties of Mdl cannot contain NaNs.

numPeriods — Forecast horizon (positive integer)

Forecast horizon, specified as a positive integer.

The periods in the forecast horizon must be consistent with the periodicity of Mdl and the presample data.

Data Types: double

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

'E0' — Presample innovations (numeric column vector | numeric matrix)

Presample innovations that have mean 0 and provide initial values for the model, specified as the comma-separated pair consisting of 'E0' and a numeric column vector or numeric matrix. E0 must contain at least Mdl.Q rows. If you specify a conditional variance model, then E0 might require more than Mdl.Q rows. If E0 contains extra rows, then forecast only uses the latest presample innovations. The last row contains the latest presample innovation. If E0 is a column vector, then the software applies it to each forecasted path.

By default, if Y0 contains enough rows (at least Mdl.P + Mdl.Q), then forecast uses infer and the presample data to infer E0. For models with a regression component, if forecast infers E0 but X0 does not contain enough rows (at least the number of rows of Y0 minus Mdl.P), then forecast displays an error. If Y0 has too few rows, then forecast sets E0 to 0.

Data Types: double
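The following is a minimal sketch of supplying E0 explicitly instead of relying on the default. The MA(1) parameter values are hypothetical and chosen only for illustration. The sketch uses infer to recover innovations from simulated presample responses and passes them as E0.

```matlab
% Hypothetical MA(1) model with fully specified parameters (no NaNs).
Mdl = arima('MA',0.6,'Constant',0.1,'Variance',1);

rng(1);                     % for reproducibility
Y0 = simulate(Mdl,50);      % simulated presample responses
E0 = infer(Mdl,Y0);         % innovations implied by Y0 under Mdl

% E0 needs at least Mdl.Q = 1 row; forecast uses only the latest rows.
[YF,YMSE] = forecast(Mdl,3,'Y0',Y0,'E0',E0);
```

Because an MA(1) process has a one-period memory, only the first forecast depends on E0. From the second period on, the forecast equals the model constant.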

'V0' — Presample conditional variances (numeric column vector with positive entries | numeric matrix with positive entries)

Presample conditional variances providing initial values for any conditional variance model, specified as the comma-separated pair consisting of 'V0' and a numeric column vector or matrix with positive entries. If the variance of the model is constant, then V0 is unnecessary. V0 is a column vector or a matrix with numPaths columns with enough rows to initialize the variance model. If V0 contains extra rows, then forecast only uses the latest conditional variances. The last row contains the latest conditional variance. If V0 is a column vector, then forecast applies it to each forecasted path.

By default, if E0 has sufficient length for the conditional variance model, then forecast infers the necessary presample conditional variances from the corresponding innovations E0. If E0 does not have sufficient length, then forecast sets V0 to the unconditional variance of the variance process.

Data Types: double
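As an illustration of V0 and the third output V, the sketch below pairs an AR(1) conditional mean with a GARCH(1,1) conditional variance. All coefficient values are hypothetical. The presample responses, innovations, and conditional variances come from simulate, so the three series are synchronized.

```matlab
% Hypothetical GARCH(1,1) conditional variance model.
VarMdl = garch('Constant',0.05,'GARCH',0.8,'ARCH',0.1);

% Hypothetical AR(1) conditional mean model that uses VarMdl.
Mdl = arima('AR',0.3,'Constant',0,'Variance',VarMdl);

rng(1);                               % for reproducibility
[Y0,E0,V0] = simulate(Mdl,100);       % synchronized presample series

% V holds the conditional variance forecasts, one row per period.
[Y,YMSE,V] = forecast(Mdl,10,'Y0',Y0,'E0',E0,'V0',V0);
```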

'X0' — Presample predictor data (numeric matrix)

Presample predictor data that indicates the presence of a regression component in the conditional mean model, specified as the comma-separated pair consisting of 'X0' and a numeric matrix. The columns of X0 are separate time series. X0 and XF must have the same number of columns. X0 must contain at least as many rows as the number of rows of Y0 minus Mdl.P. If X0 contains extra rows, then forecast only uses the latest observations. The last row contains the latest observation of each series.

By default, forecast does not include a regression component in the conditional mean model regardless of the value of the regression coefficient Mdl.Beta.

Data Types: double

'XF' — Predictor forecasts (numeric matrix)

Predictor forecasts, specified as the comma-separated pair of 'XF' and a numeric matrix. The columns of XF are separate time series. XF and X0 must have the same number of columns. XF must have at least numPeriods rows. Row i of XF contains the i period-ahead forecasts of X0. If XF exceeds numPeriods rows, then forecast only uses the first numPeriods forecasts. forecast treats XF as a fixed (nonstochastic) matrix.

By default, forecast does not include a regression component in the conditional mean model regardless of the value of the regression coefficient Mdl.Beta.
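A minimal sketch of forecasting with a regression component follows. The ARX(1) coefficients and the simulated predictor series are hypothetical. The first 100 predictor observations serve as X0, and the last 10 stand in for the predictor forecasts XF.

```matlab
% Hypothetical ARX(1) model with one predictor.
Mdl = arima('AR',0.4,'Constant',0.2,'Beta',1.5,'Variance',0.5);

rng(1);                                % for reproducibility
X = randn(110,1);                      % illustrative predictor series
Y = simulate(Mdl,100,'X',X(1:100));    % responses over the first 100 periods

% X0 and XF have the same number of columns; XF has numPeriods rows.
[YF,YMSE] = forecast(Mdl,10,'Y0',Y,'X0',X(1:100),'XF',X(101:110));
```

If you omit both X0 and XF, forecast ignores Mdl.Beta, as described above.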

'Y0' — Presample responses (numeric column vector | numeric matrix)

Presample responses that provide initial values for the model, specified as the comma-separated pair consisting of 'Y0' and a numeric column vector or numeric matrix. Y0 must contain at least Mdl.P rows. If the number of rows exceeds Mdl.P, then forecast only uses the latest Mdl.P observations. The last row contains the latest observation. If Y0 is a column vector, then it is applied to each forecasted path.

By default, if the process is stationary and Mdl does not contain a regression component, then forecast sets the necessary presample observations to the unconditional mean of the process. Otherwise, forecast sets Y0 to 0.

Data Types: double

    Notes  

    • If any of E0, V0, or Y0 contain numPaths > 1 columns, then each must have either numPaths columns or one column, otherwise an error occurs. For example, if Y0 has five columns, then E0 and V0 can either have five columns or one column. If E0 has one column, then it is applied to each path in Y0.

    • NaNs indicate missing values and forecast removes them. The software merges the presample data sets, then uses list-wise deletion to remove any NaNs. Removing NaNs in the data reduces the sample size, and can also create irregular time series.

    • forecast assumes that you synchronize presample data such that the latest observation of each presample series occurs simultaneously.

    • Set X0 to the same predictor matrix as X used in the estimation, simulation, or inference of Mdl. This assignment ensures correct computation of the innovations E0.
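The multiple-path behavior described in these notes can be sketched as follows. The AR(2) coefficients and presample values are hypothetical. Y0 has two columns, so forecast returns two forecast paths.

```matlab
% Hypothetical AR(2) model with fully specified parameters.
Mdl = arima('AR',{0.5,-0.2},'Constant',0.1,'Variance',1);

% Two presample paths, one per column. Y0 needs at least Mdl.P = 2 rows,
% and the last row holds the latest observation of each path.
Y0 = [0.3 -0.1;
      0.5  0.2];

[Y,YMSE] = forecast(Mdl,5,'Y0',Y0);
size(Y)    % numPeriods rows, numPaths columns
```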

Output Arguments

Y — Minimum mean square error forecasts of response data (numeric matrix)

Minimum mean square error (MMSE) forecasts of the conditional mean of the response data, returned as a numeric matrix. Y has numPeriods rows and numPaths columns.

forecast sets the number of columns of Y (numPaths) to the largest number of columns of the presample arrays Y0, E0, and V0. If you specify none of Y0, E0, and V0, then Y is a numPeriods-by-1 column vector.

In all cases, row i contains the conditional mean forecasts for the ith period.

Data Types: double

YMSE — Mean square errors of the forecasts of the conditional mean (numeric matrix)

Mean square errors (MSEs) of the forecasts of the conditional mean Y, returned as a numeric matrix. YMSE has numPeriods rows and numPaths columns.

forecast sets the number of columns of YMSE (numPaths) to the largest number of columns of the presample arrays Y0, E0, and V0. If you specify none of Y0, E0, and V0, then YMSE is a numPeriods-by-1 column vector.

In all cases, row i contains the forecast error variances for the ith period.

The square roots of YMSE are the standard errors of the forecasts of Y.

The predictor data does not contribute variability to YMSE because forecast treats XF as a nonstochastic matrix.

Data Types: double
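As a small worked illustration of how YMSE turns into standard errors and approximate 95% intervals, consider a hypothetical AR(1) model with coefficient 0.5, zero constant, and unit innovation variance. For such a model, the h-step forecast error variance is the sum of 0.5^(2j) for j = 0,...,h-1. This is a standard result that the page itself does not state.

```matlab
% Hypothetical AR(1) model: y(t) = 0.5*y(t-1) + e(t), Var(e) = 1.
Mdl = arima('AR',0.5,'Constant',0,'Variance',1);

% One presample response is enough because Mdl.P = 1.
[Y,YMSE] = forecast(Mdl,3,'Y0',2);

se    = sqrt(YMSE);        % standard errors of the forecasts
lower = Y - 1.96*se;       % approximate 95% lower bounds
upper = Y + 1.96*se;       % approximate 95% upper bounds

% Closed-form forecast error variances for comparison.
expectedMSE = cumsum(0.5.^(2*(0:2)'));
```

The factor 1.96 assumes Gaussian innovations, as in the examples on this page.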

V — Minimum mean square error forecasts of conditional variances of future model innovations (numeric matrix)

Minimum mean square error (MMSE) forecasts of the conditional variances of future model innovations, returned as a numeric matrix. V has numPeriods rows and numPaths columns.

forecast sets the number of columns of V (numPaths) to the largest number of columns of the presample arrays Y0, E0, and V0. If you specify none of Y0, E0, and V0, then V is a numPeriods-by-1 column vector.

In all cases, row i contains the conditional variance forecasts for the ith period.

Data Types: double

Examples

Forecast the Conditional Mean Response

Forecast the conditional mean response of simulated data over a 30-period horizon.

Simulate 130 observations from a multiplicative seasonal MA model with known parameter values.

Mdl = arima('MA',{0.5,-0.3},'SMA',0.4,'SMALags',12,...
		'Constant',0.04,'Variance',0.2);
rng(200);
Y = simulate(Mdl,130);

Fit a seasonal MA model to the first 100 observations, and reserve the remaining 30 observations to evaluate forecast performance.

ToEstMdl = arima('MALags',1:2,'SMALags',12);
EstMdl = estimate(ToEstMdl,Y(1:100));
 
    ARIMA(0,0,2) Model with Seasonal MA(12):
    -----------------------------------------
    Conditional Probability Distribution: Gaussian

                                  Standard          t     
     Parameter       Value          Error       Statistic 
    -----------   -----------   ------------   -----------
     Constant        0.20403     0.0690637        2.95424
        MA{1}       0.502116     0.0972984        5.16058
        MA{2}       -0.20174      0.104466       -1.93115
      SMA{12}        0.27028      0.109071        2.47803
     Variance        0.18681     0.0327319        5.70728

EstMdl is a new arima model with parameters estimated.

Use the fitted model to forecast a 30-period horizon, and visually compare the forecasts to the holdout data.

[YF,YMSE] = forecast(EstMdl,30,'Y0',Y(1:100));

figure
h1 = plot(Y,'Color',[.7,.7,.7]);
hold on
h2 = plot(101:130,YF,'b','LineWidth',2);
h3 = plot(101:130,YF + 1.96*sqrt(YMSE),'r:',...
		'LineWidth',2);
plot(101:130,YF - 1.96*sqrt(YMSE),'r:','LineWidth',2);
legend([h1 h2 h3],'Observed','Forecast',...
		'95% Confidence Interval','Location','NorthWest');
title(['30-Period Forecasts and Approximate 95% '...
			'Confidence Intervals'])
hold off

Forecast the NASDAQ Composite Index

Forecast the daily NASDAQ Composite Index over a 500-day horizon.

Load the NASDAQ data included with the toolbox, and extract the first 1500 observations.

load Data_EquityIdx
nasdaq = DataTable.NASDAQ(1:1500);

Fit an ARIMA(1,1,1) model to the data.

nasdaqModel = arima(1,1,1);
nasdaqFit = estimate(nasdaqModel,nasdaq);
 
    ARIMA(1,1,1) Model:
    --------------------
    Conditional Probability Distribution: Gaussian

                                  Standard          t     
     Parameter       Value          Error       Statistic 
    -----------   -----------   ------------   -----------
     Constant       0.430312      0.185554        2.31907
        AR{1}     -0.0743865      0.081985      -0.907318
        MA{1}       0.311253     0.0772657        4.02835
     Variance        27.8261      0.636248        43.7346

Forecast the Composite Index for 500 days using the fitted model. Use the observed data as presample data.

[Y,YMSE] = forecast(nasdaqFit,500,'Y0',nasdaq);

Plot the forecasts and 95% forecast intervals.

lower = Y - 1.96*sqrt(YMSE);
upper = Y + 1.96*sqrt(YMSE);

figure
plot(nasdaq,'Color',[.7,.7,.7]);
hold on
h1 = plot(1501:2000,lower,'r:','LineWidth',2);
plot(1501:2000,upper,'r:','LineWidth',2)
h2 = plot(1501:2000,Y,'k','LineWidth',2);
legend([h1 h2],'95% Interval','Forecast',...
	     'Location','NorthWest')
title('NASDAQ Composite Index Forecast')
hold off

The process is nonstationary, so the widths of the forecast intervals grow with time.
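One way to see why the intervals widen is through the infinite-order moving average representation of the model. The ψ-weight notation below is an assumption introduced for illustration and does not appear elsewhere on this page. With constant innovation variance σ², the h-step forecast error variance is:

```latex
\mathrm{YMSE}(h) = \sigma^2 \sum_{j=0}^{h-1} \psi_j^2, \qquad \psi_0 = 1 .
```

For a stationary model, the weights ψ_j decay to zero and the sum converges to the unconditional variance of the process. For an integrated model such as ARIMA(1,1,1), the weights do not decay to zero, so YMSE(h) grows without bound as h increases. In the simplest case, a random walk, ψ_j = 1 for all j and YMSE(h) = hσ².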

References

[1] Baillie, R., and T. Bollerslev. "Prediction in Dynamic Models with Time-Dependent Conditional Variances." Journal of Econometrics. Vol. 52, 1992, pp. 91–113.

[2] Bollerslev, T. "Generalized Autoregressive Conditional Heteroskedasticity." Journal of Econometrics. Vol. 31, 1986, pp. 307–327.

[3] Bollerslev, T. "A Conditionally Heteroskedastic Time Series Model for Speculative Prices and Rates of Return." The Review of Economics and Statistics. Vol. 69, 1987, pp. 542–547.

[4] Box, G. E. P., G. M. Jenkins, and G. C. Reinsel. Time Series Analysis: Forecasting and Control. 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.

[5] Enders, W. Applied Econometric Time Series. Hoboken, NJ: John Wiley & Sons, 1995.

[6] Engle, R. F. "Autoregressive Conditional Heteroskedasticity with Estimates of the Variance of United Kingdom Inflation." Econometrica. Vol. 50, 1982, pp. 987–1007.

[7] Hamilton, J. D. Time Series Analysis. Princeton, NJ: Princeton University Press, 1994.