Purpose: To learn how to apply simple forecast evaluation metrics such as Mean Absolute Error (MAE), Root Mean Square Error (RMSE). To apply the simple forecasting techniques naive forecast, seasonally naive forecast, and mean forecast to a forecasting problem and evaluate their performance. This exercise is due Tuesday, September 12.

Recall that the forecast error is given by \(e_{i}=y_{i}-\hat{y}_{i}\). During our in-class forecasting exercise all of you produced forecasts \(\hat{y}_i\) of the unemployment rate (from the Household Survey) and the additions to nonfarm payrolls (from the Establishment Survey) that was released in the August 2017 Employment Report produced by the Bureau of Labor Statistics (BLS) on September 1, 2017. The actual unemployment rate for August 2017 turned out to be 4.4 percent. Here’s a list of your forecasts \(\{\hat{y}_i\}_{i=1,\cdots,24}\)for the unemployment rate. Each forecast is indexed by \(i\):

Index Unemployment Forecast \(\hat{y}_i\) Nonfarm Payroll Forecast \(\hat{y}_i\)
1 3.0 160,000
2 5.7 208,000
3 5.2 190,000
4 4.8 215,000
5 4.4 202,000
6 4.1 200,000
7 4.3 224,000
8 4.5 147,000
9 4.4 195,000
10 4.3 148,000
11 4.2 146,830
12 4.2 150,000
13 4.0 147,000
14 5.3 200,000
15 4.2 220,000
16 4.4 180,000
17 4.2 237,500
18 4.2 176,000
19 4.4 250,000
20 4.8 146,850
21 4.7 190,000
22 4.3 182,000
23 4.3 220,000
24 4.1 220,000

Implement all the questions in R. In addition to submitting your homework via CANVAS, submit individual R files to my SMU e-mail address prior to the deadline.

Preliminaries

rm(list =ls())
forecasts.unrate <- c(3.0,5.7,5.2,4.8,4.4,4.1,4.3,4.5,4.4,4.3,4.2,4.2,4.0,5.3,4.2,4.4,4.2,4.2,4.4,4.8,4.7,4.3,4.3,4.1)
forecasts.nonfar <- c(160000,208000,190000,215000,202000,200000,224000,147000,195000,148000,146830,150000,147000,200000,
                      220000,180000,237500,176000,250000,146850,190000,182000,220000,220000)
number.forecasts <- length(forecasts.unrate)
realization.unrate <- rep(4.4,n=number.forecasts)
realization.nonfar <- rep(156000,n=number.forecasts)

df.forecasts <- data.frame(1:number.forecasts,
                           forecasts.unrate,realization.unrate, 
                           forecasts.nonfar,realization.nonfar)

head(df.forecasts)
##   X1.number.forecasts forecasts.unrate realization.unrate forecasts.nonfar
## 1                   1              3.0                4.4           160000
## 2                   2              5.7                4.4           208000
## 3                   3              5.2                4.4           190000
## 4                   4              4.8                4.4           215000
## 5                   5              4.4                4.4           202000
## 6                   6              4.1                4.4           200000
##   realization.nonfar
## 1             156000
## 2             156000
## 3             156000
## 4             156000
## 5             156000
## 6             156000
tail(df.forecasts)
##    X1.number.forecasts forecasts.unrate realization.unrate
## 19                  19              4.4                4.4
## 20                  20              4.8                4.4
## 21                  21              4.7                4.4
## 22                  22              4.3                4.4
## 23                  23              4.3                4.4
## 24                  24              4.1                4.4
##    forecasts.nonfar realization.nonfar
## 19           250000             156000
## 20           146850             156000
## 21           190000             156000
## 22           182000             156000
## 23           220000             156000
## 24           220000             156000
names(df.forecasts)
## [1] "X1.number.forecasts" "forecasts.unrate"    "realization.unrate" 
## [4] "forecasts.nonfar"    "realization.nonfar"
names(df.forecasts) <- c("index","f.un","r.un","f.nfp","r.nfp")
names(df.forecasts)
## [1] "index" "f.un"  "r.un"  "f.nfp" "r.nfp"
head(df.forecasts)
##   index f.un r.un  f.nfp  r.nfp
## 1     1  3.0  4.4 160000 156000
## 2     2  5.7  4.4 208000 156000
## 3     3  5.2  4.4 190000 156000
## 4     4  4.8  4.4 215000 156000
## 5     5  4.4  4.4 202000 156000
## 6     6  4.1  4.4 200000 156000
tail(df.forecasts,n=3)
##    index f.un r.un  f.nfp  r.nfp
## 22    22  4.3  4.4 182000 156000
## 23    23  4.3  4.4 220000 156000
## 24    24  4.1  4.4 220000 156000
summary(df.forecasts)
##      index            f.un            r.un         f.nfp       
##  Min.   : 1.00   Min.   :3.000   Min.   :4.4   Min.   :146830  
##  1st Qu.: 6.75   1st Qu.:4.200   1st Qu.:4.4   1st Qu.:157500  
##  Median :12.50   Median :4.300   Median :4.4   Median :192500  
##  Mean   :12.50   Mean   :4.417   Mean   :4.4   Mean   :189799  
##  3rd Qu.:18.25   3rd Qu.:4.550   3rd Qu.:4.4   3rd Qu.:216250  
##  Max.   :24.00   Max.   :5.700   Max.   :4.4   Max.   :250000  
##      r.nfp       
##  Min.   :156000  
##  1st Qu.:156000  
##  Median :156000  
##  Mean   :156000  
##  3rd Qu.:156000  
##  Max.   :156000

Question 1

What is the mean of your unemployment forecasts? (0.25 points)

In the answers to the questions I will give you a variety of ways to answer the questions in R in order to familiarize you with the R language syntax. Whenever you wonder what a certain command such as mean() does, type ?mean() in the R console and press ‘Enter’.

mean(forecasts.unrate)
## [1] 4.416667
mean(df.forecasts$f.un)
## [1] 4.416667
print(mean(forecasts.unrate), digits = 2)
## [1] 4.4
print(mean(df.forecasts$f.un), digits = 2)
## [1] 4.4
print(paste("The mean of our unemployment forecasts is: ",mean(forecasts.unrate),sep=""))
## [1] "The mean of our unemployment forecasts is: 4.41666666666667"

Question 2

What is the median of your nonfarm payroll forecasts? (0.25 points)

median(forecasts.nonfar)
## [1] 192500
median(df.forecasts$f.nfp)
## [1] 192500
print(format(median(forecasts.nonfar),big.mark=","))
## [1] "192,500"
print(format(median(df.forecasts$f.nfp),big.mark=","))
## [1] "192,500"
print(paste("The median of our nonfarm payroll forecasts is: ",median(forecasts.nonfar),sep=""))
## [1] "The median of our nonfarm payroll forecasts is: 192500"

Question 3

What is the interquartile range of the nonfarm payroll forecast? (0.25 points)

IQR(forecasts.nonfar)
## [1] 58750
IQR(df.forecasts$f.nfp)
## [1] 58750
print(format(IQR(forecasts.nonfar),big.mark=","))
## [1] "58,750"
print(format(IQR(df.forecasts$f.nfp),big.mark=","))
## [1] "58,750"
quantile(forecasts.nonfar, probs = seq(0,1,0.25))[4]-quantile(forecasts.nonfar, probs = seq(0,1,0.25))[2]
##   75% 
## 58750
print(paste("The interquartile range is of the nonfarm payroll forecasts is: ",IQR(forecasts.nonfar),sep=""))
## [1] "The interquartile range is of the nonfarm payroll forecasts is: 58750"

Question 4

What is the interquintile range of the nonfarm payroll forecast? (0.25 points)

f.quintiles <- quantile(forecasts.nonfar, probs = seq(0,1,0.2))
interquintile.range <- f.quintiles[5] - f.quintiles[2]

# or directly
interquintile.range <- quantile(forecasts.nonfar, probs = seq(0,1,0.2))[5]-quantile(forecasts.nonfar, probs = seq(0,1,0.2))[2]

print(paste("The interquintile range is: ",interquintile.range,sep=""))
## [1] "The interquintile range is: 70800"

Question 5

What is the interdecile range of the nonfarm payroll forecast? (0.5 points)

quantile(forecasts.nonfar, probs = seq(0,1,0.1))
##     0%    10%    20%    30%    40%    50%    60%    70%    80%    90% 
## 146830 147000 149200 174400 183600 192500 200000 208700 220000 222800 
##   100% 
## 250000
interdecile.range <- quantile(forecasts.nonfar, probs = seq(0,1,0.1))[10]-quantile(forecasts.nonfar, probs = seq(0,1,0.1))[2]
interdecile.range
##   90% 
## 75800
print(paste("The interdecile range is: ",interdecile.range,sep=""))
## [1] "The interdecile range is: 75800"

Question 6

What is the maximum of the unemployment forecast? (0.25 points)

max(forecasts.unrate)
## [1] 5.7
print(paste("The maximum of the unemployment forecast is: ",max(forecasts.unrate),sep=""))
## [1] "The maximum of the unemployment forecast is: 5.7"