Niels Bohr, Danish Physicist

August 29, 2017


The time-series plot is the most frequently used form of graphic design. With one dimension marching along to the regular rhythm of seconds, minutes, hours, days, weeks, months, years, or millennia, the natural ordering of the time scale gives this design a strength and efficiency of interpretation found in no other graphic arrangement.

Tufte (1983, p. 28)


Tenth century time series plot – inclinations of the planetary orbits

Introduce FRED and highlight a few series, in particular:

Demonstrate:

- Federal Reserve Economic Data ("FRED")
- Thank you, St. Louis Fed!

Optional: Install the FRED Add-in for MS Excel

Download and install R

Download and install RStudio

Christoffer Koch and Julieta Yung (2017), **Dallas Fed Economic Letter**, Vol. 12, No. 8, *Impact of Macroeconomic Announcements Changed After the Zero Lower Bound*

**Forecasting**…

… is about predicting the future as accurately as possible, given all of the information available, including historical data and knowledge of any future events that might impact the forecasts.

**Goals**…

… are what you would like to have happen. Goals should be linked to forecasts and plans, but this does not always occur. Too often, goals are set without any plan for how to achieve them, and no forecasts for whether they are realistic.

**Planning**…

… is a response to forecasts and goals. Planning involves determining the appropriate actions that are required to make your forecasts match your goals.

Problem definition.

Gathering information.

Preliminary (exploratory) analysis.

Choosing and fitting models.

Using and evaluating a forecasting model.

**Univariate Statistics**

- sample mean
- median
- interquartile range
- standard deviation

**Bivariate Statistics**

- correlation coefficient

**Mean**

\[\bar{x} = \frac{1}{N}\sum_{i=1}^N x_{i} = (x_{1} + x_{2} + x_3 + \cdots + x_{N})/N \]

**Standard Deviation**

\[s = \sqrt{\frac{1}{N-1} \sum_{i=1}^N (x_{i} - \bar{x})^2}. \]

Cars example - some data on 2009 model cars, each of which has an automatic transmission, four cylinders and an engine size under 2 liters.

library(fpp)  # the fuel data set ships with the fpp package
subset(fuel, Litres < 2)[, c(1, 3, 5, 6, 8)]

##                          Model Litres City Highway Carbon
## 20             Chevrolet Aveo    1.6   25      34    6.6
## 21           Chevrolet Aveo 5    1.6   25      34    6.6
## 19                Honda Civic    1.8   25      36    6.3
##  2         Honda Civic Hybrid    1.3   40      45    4.4
## 11                  Honda Fit    1.5   27      33    6.1
##  9                  Honda Fit    1.5   28      35    5.9
## 13             Hyundai Accent    1.6   26      35    6.3
## 14                    Kia Rio    1.6   26      35    6.1
## 12               Nissan Versa    1.8   27      33    6.3
## 31               Nissan Versa    1.8   24      32    6.8
## 22            Pontiac G3 Wave    1.6   25      34    6.6
## 23          Pontiac G3 Wave 5    1.6   25      34    6.6
## 18               Pontiac Vibe    1.8   26      31    6.6
## 33 Saturn Astra 2DR Hatchback    1.8   24      30    6.8
## 34 Saturn Astra 4DR Hatchback    1.8   24      30    6.8
## 17                   Scion xD    1.8   26      32    6.6
## 10             Toyota Corolla    1.8   27      35    6.1
## 26              Toyota Matrix    1.8   25      31    6.6
##  1               Toyota Prius    1.5   48      45    4.0
##  8               Toyota Yaris    1.5   29      35    5.9

In this example, \(N=20\) and \(x_i\) denotes the carbon footprint of vehicle \(i\). Then the average carbon footprint is

\[ \begin{align} \bar{x} & = \frac{1}{20}\sum_{i=1}^{20} x_{i} \\ &= (x_{1} + x_{2} + x_3 + \dots + x_{20})/20 \\ &= (4.0 + 4.4 + 5.9 + \dots + 6.8 + 6.8 + 6.8)/20 \\ &= 124/20 = 6.2 \text{ tons CO}_{2}. \end{align} \]

The median, on the other hand, is the middle observation when the data are placed in order. In this case, there are 20 observations, so the median is the average of the 10th and 11th observations in the sorted data. That is,

\[\text{median} = (6.3+6.6)/2 = 6.45.\]
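The arithmetic above can be checked with a short base R sketch. The vector `carbon` is typed in by hand from the Carbon column of the table (an assumption made so that the snippet runs without the `fuel` data set), and the variable names are illustrative only.

```r
# Carbon footprints (tons CO2), copied by hand from the table above
# so that the example does not depend on the fuel data set.
carbon <- c(6.6, 6.6, 6.3, 4.4, 6.1, 5.9, 6.3, 6.1, 6.3, 6.8,
            6.6, 6.6, 6.6, 6.8, 6.8, 6.6, 6.1, 6.6, 4.0, 5.9)

N <- length(carbon)

# Sample mean: sum of the observations divided by N.
xbar <- sum(carbon) / N

# Median for an even N: average of the two middle values after sorting.
sorted_carbon <- sort(carbon)
med <- (sorted_carbon[N / 2] + sorted_carbon[N / 2 + 1]) / 2

print(c(mean = xbar, median = med))
```

The built-in `mean()` and `median()` functions should give the same two numbers; the long-hand version is shown only to mirror the formulas.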

Cars example - consider only the carbon footprint (the 8th variable)

Interquartile range - simply the difference between the 75th and 25th percentiles

\[ \text{IQR} = (6.6 - 6.1) = 0.5. \]

fuel2 <- fuel[fuel$Litres < 2, ]
summary(fuel2[, "Carbon"])

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##    4.00    6.10    6.45    6.20    6.60    6.80

sd(fuel2[,"Carbon"])

## [1] 0.7440996
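To connect the output above to the definitions, here is a minimal sketch that computes the standard deviation and the interquartile range long-hand. It assumes the same hand-typed `carbon` vector (copied from the table, not read from `fuel`) and relies on R's default percentile rule in `quantile()`.

```r
# Carbon footprints (tons CO2), copied by hand from the table above.
carbon <- c(6.6, 6.6, 6.3, 4.4, 6.1, 5.9, 6.3, 6.1, 6.3, 6.8,
            6.6, 6.6, 6.6, 6.8, 6.8, 6.6, 6.1, 6.6, 4.0, 5.9)

N <- length(carbon)
xbar <- mean(carbon)

# Sample standard deviation: squared deviations from the mean,
# summed, divided by N - 1, then square-rooted.
s <- sqrt(sum((carbon - xbar)^2) / (N - 1))

# Interquartile range: 75th percentile minus 25th percentile.
q <- quantile(carbon, probs = c(0.25, 0.75))
iqr <- unname(q[2] - q[1])

print(c(sd = s, IQR = iqr))
```

Other percentile conventions (the `type` argument of `quantile()`) can give slightly different quartiles on small samples, so the IQR may differ from a hand calculation that uses another rule.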

**Correlation Coefficient**

\[ r = \frac{\sum (x_{i} - \bar{x})(y_{i}-\bar{y})}{\sqrt{\sum(x_{i}-\bar{x})^2}\sqrt{\sum(y_{i}-\bar{y})^2}}. \]

cor(fuel2[,"Carbon"], fuel2[,"City"])

## [1] -0.9688341
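The correlation formula can be evaluated term by term as a check on `cor()`. The sketch below assumes two hand-typed vectors, `carbon` and `city`, copied in row order from the Carbon and City columns of the table; neither name comes from the original code.

```r
# Carbon footprint (tons CO2) and city fuel economy, copied by hand
# from the table above in the same row order.
carbon <- c(6.6, 6.6, 6.3, 4.4, 6.1, 5.9, 6.3, 6.1, 6.3, 6.8,
            6.6, 6.6, 6.6, 6.8, 6.8, 6.6, 6.1, 6.6, 4.0, 5.9)
city   <- c(25, 25, 25, 40, 27, 28, 26, 26, 27, 24,
            25, 25, 26, 24, 24, 26, 27, 25, 48, 29)

# Deviations of each variable from its own mean.
dx <- carbon - mean(carbon)
dy <- city - mean(city)

# Numerator: sum of cross-products of deviations.
# Denominator: product of the root sums of squared deviations.
r <- sum(dx * dy) / (sqrt(sum(dx^2)) * sqrt(sum(dy^2)))

print(r)
```

The strongly negative value reflects that cars with better city fuel economy have smaller carbon footprints in this subset.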

**Autocorrelation**

\[ r_{k} = \frac{\sum\limits_{t=k+1}^T (y_{t}-\bar{y})(y_{t-k}-\bar{y})}{\sum\limits_{t=1}^T (y_{t}-\bar{y})^2} \]

beer2 <- window(ausbeer, start=1992, end=2006-.1)
lag.plot(beer2, lags=9, do.lines=FALSE)

Acf(beer2, las = 1, lwd = 2, main = "Autocorrelation Function")
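The autocorrelation formula can also be computed directly and compared with a built-in function. This sketch uses `UKgas`, a quarterly series from base R's datasets package, purely as a stand-in for the beer data, and it compares against base R's `acf()` rather than the `Acf()` call above; the helper name `autocorr` is hypothetical.

```r
# UKgas (quarterly UK gas consumption) is included with base R;
# it is a stand-in here, not the beer series plotted above.
y <- as.numeric(UKgas)
n_obs <- length(y)
ybar <- mean(y)

# r_k: sum of lagged cross-products of deviations from the mean,
# divided by the total sum of squared deviations.
autocorr <- function(k) {
  num <- sum((y[(k + 1):n_obs] - ybar) * (y[1:(n_obs - k)] - ybar))
  den <- sum((y - ybar)^2)
  num / den
}

r_manual <- sapply(1:4, autocorr)

# acf() reports lag 0 first, so drop it before comparing.
r_builtin <- acf(y, lag.max = 4, plot = FALSE)$acf[2:5]

print(cbind(lag = 1:4, manual = r_manual, builtin = r_builtin))
```

For a quarterly series with a seasonal pattern, the lag-4 autocorrelation would typically stand out, which is the same feature the lag plot and correlogram are meant to reveal.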