Contents

 

Time Series

 

Example:

Data Size: Different versions of XLMiner™  have varying limits on size of data. The size of data depicted in the example below may not be supported by your version. Refer to Data Handling Specifications for details.

While handling a time series situation, we can observe the following steps in general. 

  1. Partition the data : Partitioning the data into 60% Training and 40% Validation generally helps to go ahead. 

  2. We apply the exploratory techniques like ACF and PACF and view the plots for Training and Validation. If they are in synch then it is worth going ahead for model fitting. If these plots are same then we can apply the same model to both the training and validation sets. The plots also show if there is a trend and / or seasonality in the data. Thus ACF and PACF plots help us to tentatively identify the model. 

  3. We use ARIMA technique to fit a model.

  4. When we fit a model, XLMiner™ displays the ACF and PACF plots for residuals. If these plots are in the band of UCL and LCL then it indicates that the residuals are random and the model is adequate. 

  5. If the residuals are not in the band, then some correlation exists between them,  so we may need to improve the model.

Open the data set Income.xls. Let us do a step by step model fitting and see the outputs. This data set contains an yearly record of the statewise personal incomes. Let us do partitioning as shown below.

  1. Invoke XLMiner --> Time Series --> Partition Data . Make the following settings --

  2. In the dialog shown above we have selected the options for specifying the #records. It also shows the %ge -- 70% and 30%.The numbers we have selected here are just for getting, sort of  "near ideal" output in the given situation. That will enable us see the various qualities and properties of the techniques we are going to use. See the output below.

    See that the partitioning is sequential here, not random. First 50 observations go to Training set and the rest to Validation set. 

  3. Let's see what inputs we get on using the exploratory techniques.

    Invoke XLMiner --> Time Series --> ACF (Autocorrelations).. Select CA as the variable. Enter the lags as shown. Check for plotting charts.

  4. Click OK for getting output. See the ACF plots for Training and Validation set. We observe the following :

     

    • Both the charts exhibit a similar pattern. This means we can fit the same model to both the  sets.

    • The ACF function shows a definite pattern, it decreases with the lags. This means there is a trend in the data. Since the pattern does not repeat , we can conclude that the data does not show any seasonality.

  5. Invoke Time Series --> PACF (Partial Autocorrelations). Do the following settings in PACF dialog.

  6. Select OK and see the output. The PACF plots also show the similar pattern in training and validation sets.

    • Both the charts exhibit a similar pattern. This means we can fit the same model to both the  sets.

    • The PACF function shows a definite pattern, it decreases with the lags. This means there is a trend in the data. Since the pattern does not repeat , we can conclude that the data does not show any seasonality.

    • From ACF and PACF we can conclude that ARIMA (1,1,0) will be the best candidate.

  7. Let us now try fitting ARIMA (1,1,0). Invoke Time Series --> ARIMA. Enter the following parameters.

  8. Select Advanced .. and enter the following settings.

    Select OK.

    Select OK for the first ARIMA dialog. See the output.

  9. XLMiner™ calculates and displays various parameters and charts in two output sheets. Let us see and interprete a few outputs. 

    XLMiner™ has calculated the constant term and the AR1 term for our model, as seen above. These are the constant and f1 terms of our forecasting equation. See the following output of the Chi - square test. 

    We can conclude that the fit is good since the p-value is insignificant.

    XLMiner™ provides some more information in addition to these parameters.

  10. Open the sheet ARIMA_Residuals1. The table presents the actual and fitted values and the resulting residuals. See the Time plot of Actual Vs Forecast

    Here we see that the Actual and Forecasted values greatly match. The usefulness of the model in forecasting will depend upon how close the actual and forecasted values are in the Time plot of validation set which we will see later.

    Let us take a look at the ACF and PACF plots for residuals. 

    Most of the correlations are within the UCL and LCL band. This indicates that the residuals are random, they are not correlated. This is one indication in turn, that ARIMA (1,1,0) is adequate for this data.

  11. Open the sheet ARIMA_Output1. See the Forecast table.

    The table shows the actual and forecasted value. The "Lower" and "upper" values represent a band around the forecast with 95% confidence interval in which the actual value may lie.

    Let us take a look at the Time plot below. It is plotted with the values in the table above. It indicates how the model which we fitted using Training data works on validation data.

    The actual and forecasted values are fairly close. This is a confirmation that our model is good for forecasting. If we plot the values under the "Lower" and "Upper" column in the same chart (using the EXCEL™ chart facility) we get the following.

    The plot shows that the Actual values lie well inside the band created by the upper and lower values in the table, almost always at the center of the band. So we have fitted an adequate model.

     

Seasonal Data : 

 What if the data show seasonality? Consider the data set Airpass.xls. It consists monthly records of the no. of passengers in an airline. The data are bound to be seasonal. The airtraffic increases in the vacation months. It is moderate in other seasons. Let us try to fit seasonal model using this data set.

  1. Select XLMiner --> Time Series --> Partition Data.

    Here we select the options to specify #records. XLMiner™ displays the %ge in training and validation set just besides the #records. 

  2. Select OK and see the output. As we did in the earlier example, let us find the ACF and PACF functions. Enter the settings shown below.

     

  3. Select OK. See the outputs.

  4. The plots clearly show a repetition in the pattern indicating that the data are seasonal, there is periodicity after every 12 observations, ie they show seasonality and trend in the data. Let us view the PACF charts.

     

     

    The PACF plots also show seasonality, trend .The plots for Training set and Validation set are similar. We can try applying the same seasonal model to Training and Validation set.
  5. We will try to fit a seasonal model to this data. Let us try applying ARIMA (1, 1, 0) (1, 1, 0)12. This means that  we are applying a seasonal model with period =12. Selection of the value of period depends on nature of data. In this case we can make a fair guess that the # passengers will be more in holidays, ie every 12 months. So we select period = 12.

  6.  

    Select Advanced...

  7. Enter the following parameters in Advanced box.

  8. Select OK.

    Select OK for the first ARIMA dialog. See the output.

  9. XLMiner™ calculates and displays various parameters and charts in two output sheets. Let us see and interpret a few outputs. 

    XLMiner™ has calculated the constant term and the AR1 term for our model, as seen above. These are the constant and f1 terms of our forecasting equation. See the following output of the Chi - square test. 

  10. Open the sheet ARIMA_Residuals1.

    The actual and forecasted values match very well in the time plot above.

  11. Most of the residuals are within the UCL and LCL band. This indicates that the residuals are random, they are not correlated. This is one indication that the fit is quite good.
  12. Open the sheet ARIMA_Output1. See the Forecast table.

    The table shows the actual and forecasted value. The "Lower" and "upper" values represent a band around the forecast with 95% confidence interval in which the actual value may lie.

    Let us take a look at the Time plot below. It is plotted with the values in the table above. It indicates how the model which we fitted using Training data works on validation data.

    The actual and forecasted values are fairly close, though not as well as in the earlier example. Still, we can conclude that this model fits well. If we plot the values under the "Lower" and "Upper" column in the same chart we get the following.

    The plot shows that the Actual values lie well inside the band created by the upper and lower values in the table, almost always at the center of the band. So we have fitted an adequate model.

 

Smoothing : 

Smoothing techniques are useful because they are non parametric. By fine tuning the values of a, b and g we can see the output easily. We will view an example in which we will use the Holt Winters' smoothing.

Let's use the same Airpass.xls. Use the same method of partitioning as in the example above.

Invoke Time Series --> Smoothing --> Holt Winter --> Multiplicative.. Let us try some tried and tested values for a, b and g  so that we can see a good fit. These values are arrived at by trial and error. You can try some more values later. 

Enter the following specifications. 

Select OK. We receive a sheet with various parameters, forecasts and error measures. See the forecast.

The errors are considerably less than the earlier example. See the Time plot of Actual Vs Forecast.

The Actual and Forecasted values are matching very well, better than in our ARIMA example! This is possibly the best fit.

The error measures also give us an idea how good the fit is. See the following error measures : 

See also