Time Series Homework: Chapter 3 Lesson 4

Please_put_your_name_here

Data

avgkwhr <- rio::import("https://byuistats.github.io/timeseries/data/USENERGYPRICE.csv")

Questions

Question 1 - Context and Measurement (10 points)

The first part of any time series analysis is context. You cannot properly analyze data without knowing what the data is measuring. Without context, even the simplest features of data can be obscure and inscrutable. This homework assignment will center around the series below.

Please research the time series. In the spaces below, give the data collection process, unit of analysis, and meaning of each observation for the series.

Average Price of Electricity per Kilowatt-Hour in U.S.: City Average

https://fred.stlouisfed.org/series/APU000072610

Answer

Data Collection Process

The data is sourced from the U.S. Bureau of Labor Statistics (BLS) under the release “Average Price: Electricity per Kilowatt-Hour in U.S. City Average.” Prices are collected monthly from 75 urban areas across the United States using mail questionnaires managed by the Department of Energy. All reported prices include Federal, State, and local taxes, as well as fuel and purchased gas adjustments for natural gas and electricity.

Unit of Analysis

The unit is U.S. dollars (USD) per kilowatt-hour (kWh), reported as the U.S. city average across the 75 urban areas. The series is not seasonally adjusted, so it reflects actual monthly prices without adjustments for seasonal variations.

Meaning of Each Observation

Each observation is the average price of electricity, in dollars per kilowatt-hour, across the sampled U.S. urban areas for a specific month.

Question 2 - US Average Price of Electricity: Additive Holt-Winters Forecasting (25 points)

a) Please apply the Holt-Winters smoothing method to the series.

We will discuss this more in class, but for those who are curious: the original dataset had no value for September 1985. With only one seemingly random value missing, the choice of fill method has little impact. The method chosen here was to average the values at \(t-1\) and \(t+1\) and use the result to fill the missing observation. This can be accomplished quickly using the following base R code:

# Average the values from the periods before (82) and after (84) the missing period (83).
value <- (avgkwhr_tsbl$usenergyprice[82] + avgkwhr_tsbl$usenergyprice[84]) / 2
# This evaluates to 0.0835

# Fill missing value with average of surrounding values
avgkwhr_tsbl$usenergyprice[83] <- value

For your convenience, the data provided for this homework assignment has already been filled.
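
The neighbor-average fill described above can also be sketched as a small reusable base R function. This is only an illustration: the helper name `fill_with_neighbor_mean` and the toy prices are assumptions, not part of the assignment data, and the sketch handles only isolated interior gaps (runs of consecutive missing values are left as `NA`).

```r
# Hypothetical helper: fill isolated interior NAs with the mean of the two neighbors.
fill_with_neighbor_mean <- function(x) {
  for (i in which(is.na(x))) {
    # Skip the endpoints, which have only one neighbor.
    if (i > 1 && i < length(x)) {
      x[i] <- mean(c(x[i - 1], x[i + 1]))
    }
  }
  x
}

# Toy values, assumed for illustration only.
prices <- c(0.082, NA, 0.085)
filled <- fill_with_neighbor_mean(prices)
```

For a single missing month this matches the manual calculation; longer gaps would need a different approach, such as linear interpolation.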

Answer
## Wrangling
# Packages assumed by the code in this assignment
library(tsibble)   # yearmonth(), as_tsibble()
library(fable)     # ETS(); also attaches fabletools (model(), report(), forecast())
library(ggplot2)   # ggplot(), autoplot()

avgkwhr$yearmonth <- yearmonth(lubridate::mdy(avgkwhr$date))
avgkwhr_tsbl <- as_tsibble(avgkwhr, index = yearmonth)

# Fill missing value with average of surrounding values
avgkwhr_tsbl$usenergyprice[83] <- 0.0835

# Plot the series first (any method/function works). Before attempting a
# Holt-Winters decomposition, visualize the series to determine roughly what
# the trend, seasonality, and cycles look like.
autoplot(avgkwhr_tsbl)
Plot variable not specified, automatically selected `.vars = usenergyprice`

avgkwhr_hw <- avgkwhr_tsbl |>
  model(Additive = ETS(usenergyprice ~
        trend("A", alpha = 0.82, beta = 0.01) +
        error("A") +
        season("A", gamma = 0.02),
        opt_crit = "amse", nmse = 1))
report(avgkwhr_hw)
Series: usenergyprice 
Model: ETS(A,A,A) 
  Smoothing parameters:
    alpha = 0.82 
    beta  = 0.01 
    gamma = 0.02 

  Initial states:
       l[0]         b[0]          s[0]       s[-1]       s[-2]       s[-3]
 0.04825881 0.0004340668 -0.0004773127 0.003445831 0.004025062 0.004226155
       s[-4]        s[-5]        s[-6]       s[-7]        s[-8]      s[-9]
 0.004010342 -0.001200003 -0.002229635 -0.00216731 -0.002284053 -0.0019955
       s[-10]       s[-11]
 -0.002675675 -0.002677901

  sigma^2:  0

      AIC      AICc       BIC 
-3655.906 -3655.030 -3597.071 
autoplot(components(avgkwhr_hw))
Warning: Removed 12 rows containing missing values or values outside the scale range
(`geom_line()`).

augment(avgkwhr_hw) |>
  ggplot(aes(x = yearmonth, y = usenergyprice)) +
    #coord_cartesian(xlim = yearmonth(c("1978 Nov","1990 Dec"))) +
    coord_cartesian(xlim = yearmonth(c("2000 Nov","2016 Dec"))) +
    geom_line() +
    geom_line(aes(y = .fitted, color = "Fitted")) +
    labs(color = "")

avgkwhr_hw <- avgkwhr_tsbl |>
  model(Additive = ETS(usenergyprice ~
        trend("A", alpha = 0.01, beta = 0.0001) +
        error("A") +
        season("A", gamma = 0.02),
        opt_crit = "amse", nmse = 1))
report(avgkwhr_hw)
Series: usenergyprice 
Model: ETS(A,A,A) 
  Smoothing parameters:
    alpha = 0.01 
    beta  = 1e-04 
    gamma = 0.02 

  Initial states:
       l[0]         b[0]          s[0]      s[-1]       s[-2]       s[-3]
 0.06314664 0.0001318904 -0.0002283627 0.00315456 0.004239397 0.003979987
       s[-4]        s[-5]       s[-6]        s[-7]        s[-8]        s[-9]
 0.003794928 -0.001439147 -0.00257199 -0.001345245 -0.003035241 -0.002245286
       s[-10]       s[-11]
 -0.001832954 -0.002470646

  sigma^2:  1e-04

      AIC      AICc       BIC 
-1708.733 -1707.856 -1649.897 
autoplot(components(avgkwhr_hw))
Warning: Removed 12 rows containing missing values or values outside the scale range
(`geom_line()`).

augment(avgkwhr_hw) |>
  ggplot(aes(x = yearmonth, y = usenergyprice)) +
    #coord_cartesian(xlim = yearmonth(c("1978 Nov","1984 Dec"))) +
    #coord_cartesian(xlim = yearmonth(c("2010 Nov","2016 Dec"))) +
    geom_line() +
    geom_line(aes(y = .fitted, color = "Fitted")) +
    labs(color = "")

b) What parameter values did you choose for \(\alpha\), \(\beta\), and \(\gamma\)? Justify your choice.
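
For reference when justifying the parameters, the classical additive Holt-Winters update equations are written out here. This is the standard textbook form, with \(a_t\) the level, \(b_t\) the slope, \(s_t\) the seasonal effect, and \(p\) the seasonal period (12 for monthly data); these symbols are assumptions for this sketch, and the state-space parameterization used internally by `ETS()` may differ in detail.

\[
\begin{aligned}
a_t &= \alpha (x_t - s_{t-p}) + (1-\alpha)(a_{t-1} + b_{t-1}) \\
b_t &= \beta (a_t - a_{t-1}) + (1-\beta)\, b_{t-1} \\
s_t &= \gamma (x_t - a_t) + (1-\gamma)\, s_{t-p} \\
\hat{x}_{t+k \mid t} &= a_t + k\, b_t + s_{t+k-p}, \quad 1 \le k \le p
\end{aligned}
\]

In each equation the parameter weights the newest information against the previous estimate, so values near 1 track recent observations closely and values near 0 change the estimate slowly.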

Answer

c) Please plot the Holt-Winters forecast of the series for the next 36 months superimposed against the original series. Please see Figure 7 in Chapter 3: Lesson 4.

Answer
avgkwhr_forecast <- avgkwhr_hw |>
  forecast(h = "3 years") 
avgkwhr_forecast |>
  autoplot(avgkwhr_tsbl, level = 95) +
  #coord_cartesian(ylim = c(0,5500)) +
  geom_line(aes(y = .fitted, color = "Fitted"),
    data = augment(avgkwhr_hw)) +
  scale_color_discrete(name = "")

d) Is the trend in the US Average Price of Electricity series deterministic or stochastic? What is the basis of your evaluation?

Answer

Evaluated as a whole, the series has more deterministic than stochastic characteristics.
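
As an informal illustration of the distinction, the sketch below simulates one series with a deterministic trend (a fixed line plus independent noise) and one with a stochastic trend (a random walk with drift), then compares how persistent the residuals are after removing a fitted straight line. All values and names here are assumptions chosen for illustration; this is not a formal test and does not use the assignment data.

```r
set.seed(1)  # reproducible illustration
n <- 200
time_index <- seq_len(n)

# Deterministic trend: fixed line plus independent noise.
deterministic <- 0.05 + 0.0004 * time_index + rnorm(n, sd = 0.002)

# Stochastic trend: random walk with drift, so shocks accumulate.
stochastic <- 0.05 + cumsum(0.0004 + rnorm(n, sd = 0.002))

# Lag-1 autocorrelation of the residuals after removing a fitted straight line.
lag1_autocorrelation <- function(series) {
  residuals_from_line <- resid(lm(series ~ time_index))
  cor(residuals_from_line[-1], residuals_from_line[-n])
}

acf_deterministic <- lag1_autocorrelation(deterministic)
acf_stochastic <- lag1_autocorrelation(stochastic)
```

Residuals with little autocorrelation are what a deterministic trend would leave behind, while strongly persistent residuals suggest that shocks are accumulating in the level, as with a stochastic trend.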

Trend

A roughly constant upward trend is apparent, meaning that the price of electricity rises over time. In context, however, it seems strange to say that the cost of electricity itself is rising: in theory it would decrease over time, since we would expect technological advancements to make electricity production easier and more efficient. In practice this is somewhat offset by pressure to shift towards renewables, which are more expensive. The trend we observe is more reasonably linked to the economy: when the price of a good rises over time independent of changes in supply and demand, we are observing inflation. So the trend in this series is driven less by energy prices themselves than by inflation in the economy from which the prices were collected.

Seasonality

The seasonality appears very constant and deterministic, with a 12-month period: values are high in the summer (everyone turns on the A/C) and lower in the winter.
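
One informal way to check a claim like this is to average the series by calendar month and look at each month's deviation from the overall mean. The sketch below does so on a synthetic, trend-free monthly series with a July peak; the data, the noise level, and the variable names are all assumptions for illustration, and a real series would need its trend removed first.

```r
set.seed(42)  # reproducible illustration

# Synthetic monthly series (assumed): 10 years, summer peak in July, no trend.
month_of_year <- rep(1:12, times = 10)
seasonal_pattern <- 0.004 * sin(2 * pi * (month_of_year - 4) / 12)
price <- 0.10 + seasonal_pattern + rnorm(120, sd = 0.0002)

# Average deviation of each calendar month from the overall mean.
monthly_effect <- tapply(price, month_of_year, mean) - mean(price)
peak_month <- as.integer(names(which.max(monthly_effect)))
```

If the monthly effects computed over different sub-periods look alike, that supports treating the seasonal pattern as stable.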

Cycles & ‘Random’ Shocks/Events

However, stochastic cycles, or rather level shocks, are observed, e.g., in 1987, 1998, 2008-2009, and 2014. Many factors could be at play here, so this is an excellent time to try to harness AI for some help.

The prompt below, provided together with a screenshot of the time series plot, was used to obtain potentially related context.

“I have this data on the average price of energy in the usa from 1980-2020. Im specifically interested in the context and reason that I can observe cycles/level shocks around these time periods: 1987, 1998, 2008-2009, 2014. Can you tell me about world event, political events, policy changes, technological advancements that happened around these timeframes that would be the explanation for these occurrences?”

Question 3 - Real US Average Price of Electricity: Additive Holt-Winters Forecasting (25 points)

The upward trend of the series is mostly due to inflation, the generalized increase in prices throughout the economy. One way to quantify inflation is to use a price index, like the Personal Consumption Expenditures Deflator (PCE). The PCE series shows that prices in the US have climbed steadily over the last 60 years. Because energy is an important part of the economy, it’s likely that energy prices have followed a similar pattern. Adjusting a series with nominal prices, like the price of electricity, to real prices that account for inflation is simple: divide the original series by the price index. The data set imported below is the real price of electricity, which is the US Average Price of Electricity divided by the PCE index excluding food and energy prices (PCEPILFE). Repeat steps a) to d) for the updated series.
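
As a minimal sketch of the adjustment described above, the snippet below divides a nominal price series by a price index. The numbers are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the electricity or PCEPILFE data.

```r
# Hypothetical nominal prices ($ per kWh) and price index values.
nominal_price <- c(0.070, 0.100, 0.140)
price_index <- c(50, 75, 100)

# Real price = nominal price divided by the price index.
real_price <- nominal_price / price_index
```

In this toy example the nominal price doubles between the first and last observation, but so does the index, so the real price is the same at both ends (0.0014). That is the sense in which dividing by the index removes the inflation-driven trend.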

Answer
real_avgkwhr <- rio::import("https://byuistats.github.io/timeseries/data/USENERGYPRICE_Real.csv")

a) Please apply the Holt-Winters smoothing method to the series.

Answer
real_avgkwhr$yearmonth <- yearmonth(lubridate::mdy(real_avgkwhr$date))
real_avgkwhr_tsbl <- as_tsibble(real_avgkwhr, index=yearmonth)

autoplot(real_avgkwhr_tsbl)
Plot variable not specified, automatically selected `.vars = realprices`

real_avgkwhr_hw <- real_avgkwhr_tsbl |>
  model(Additive = ETS(realprices ~
        trend("A", alpha = 0.3, beta = 0.1) +
        error("A") +
        season("A", gamma = 0.1),
        opt_crit = "amse", nmse = 1))
report(real_avgkwhr_hw)
Series: realprices 
Model: ETS(A,A,A) 
  Smoothing parameters:
    alpha = 0.3 
    beta  = 0.1 
    gamma = 0.1 

  Initial states:
        l[0]         b[0]          s[0]        s[-1]        s[-2]         s[-3]
 0.001374842 3.255448e-05 -3.364265e-05 6.889495e-05 1.368712e-06 -3.662101e-05

  sigma^2:  0

      AIC      AICc       BIC 
-2770.336 -2769.850 -2751.178 
autoplot(components(real_avgkwhr_hw))
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_line()`).

augment(real_avgkwhr_hw) |>
  ggplot(aes(x = yearmonth, y = realprices)) +
    #coord_cartesian(xlim = yearmonth(c("1978 Nov","1990 Dec"))) +
    coord_cartesian(xlim = yearmonth(c("2014 Nov","2016 Dec"))) +
    geom_line() +
    geom_line(aes(y = .fitted, color = "Fitted")) +
    labs(color = "")

b) What parameter values did you choose for \(\alpha\), \(\beta\), and \(\gamma\)? Justify your choice.

Answer

In comparison to the original series, the real prices seem to fluctuate around an average value (roughly $0.0014). Additionally, the cycles/level shifts around the previously mentioned periods are now much clearer.

So should we set a low alpha/beta to ignore the cycles, or a high alpha/beta because we want to follow them?
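
To make this trade-off concrete, the sketch below applies level-only exponential smoothing to a toy series with a single level shift, once with a low alpha (0.01) and once with a high alpha (0.82), the two values tried in Question 2. This is a simplified stand-in for the level update only, not the full Holt-Winters fit; the function name `smooth_level` and the toy series are assumptions.

```r
# Level-only exponential smoothing: a simplified stand-in for the level update.
smooth_level <- function(x, alpha) {
  level <- numeric(length(x))
  level[1] <- x[1]
  for (i in 2:length(x)) {
    level[i] <- alpha * x[i] + (1 - alpha) * level[i - 1]
  }
  level
}

# Toy series (assumed): flat at 1, then a level shift to 2.
x <- c(rep(1, 5), rep(2, 5))
low_alpha <- smooth_level(x, alpha = 0.01)
high_alpha <- smooth_level(x, alpha = 0.82)
```

With the high alpha the estimated level has almost fully adopted the new value five observations after the shift, while with the low alpha it has barely moved. A low alpha therefore treats a shock as noise to smooth through, and a high alpha treats it as a change to follow.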

When examining the seasonality of the series, there seems to be a significant shift in the pattern somewhere around 2020.

c) Please plot the Holt-Winters forecast of the series for the next 12 months superimposed against the original series. Please see Figure 7 in Chapter 3: Lesson 3

Answer

d) Is the trend in the US Average Real Price of Electricity series deterministic or stochastic? What is the basis of your evaluation?

Answer

Rubric

Question 1: Context and Measurement
Mastery (10): The student thoroughly researches the data collection process, unit of analysis, and meaning of each observation for the requested time series. Clear and comprehensive explanations are provided.
Incomplete (0): The student does not adequately research or provide information on the data collection process, unit of analysis, and meaning of each observation for the specified series.

Question 2a: HW Smoothing
Mastery (5): Students demonstrate the implementation of the Holt-Winters smoothing method in R, providing well-commented code that clearly explains each step of the algorithm. They correctly specify the necessary parameters, including trend and seasonality components.
Incomplete (0): Students encounter difficulties in accurately implementing the Holt-Winters smoothing method in R. Their code may lack sufficient comments or clarity, making it challenging to understand the implementation process. Additionally, they may overlook important parameters or make errors in the application of the method, leading to inaccuracies in the results.

Question 2b: Parameter Choice
Mastery (10): Responses not only specify the chosen parameter values for $\alpha$, $\beta$, and $\gamma$ in the context of the Holt-Winters smoothing method but also correctly identify the purpose of each parameter in their explanation. They provide a thorough justification for each parameter choice, considering factors such as the data characteristics, seasonality patterns, and the desired level of smoothing.
Incomplete (0): Students struggle to clearly specify the chosen parameter values for $\alpha$, $\beta$, and $\gamma$. It is not clear that they understand the purpose of each parameter in their explanation. They may provide limited or vague justification for each parameter choice, lacking consideration of important factors such as data characteristics or seasonality patterns.

Question 2c: Forecast Plot
Mastery (5): Responses effectively create a plot of the Holt-Winters forecast for the next 36 months superimposed against the original series in R. The forecasted values align with the original series and display relevant trends and seasonality patterns. Additionally, they appropriately label the axes, title the plot, and provide a clear legend to distinguish between the original series and the forecast. The plot closely resembles Figure 7 in the Time Series Notebook.
Incomplete (0): Students encounter challenges in creating a plot of the Holt-Winters forecast. They may struggle with accurately implementing the plotting code, resulting in inaccuracies or inconsistencies in the plotted forecast. Additionally, their plot may lack proper labeling of the axes, a title, or a legend, making it difficult to interpret the information presented. Furthermore, their plot may deviate significantly from Figure 7 in the Time Series Notebook.

Question 2d: Trend Evaluation
Mastery (5): The submission demonstrates a clear understanding of the distinction between deterministic and stochastic trends and provides a reasoned argument for its assessment based on the observed data properties. It provides an evaluation of the data characteristics, considering factors such as the presence of consistent patterns or irregular fluctuations over time. Analyses involve visual inspections to identify any discernible patterns or randomness in the trend.
Incomplete (0): The student offers limited insights into the data characteristics, lacking consideration of relevant factors such as patterns or fluctuations over time. Additionally, their evaluation may lack depth or coherence. No plots are drawn to evaluate the trend.

Question 3a: HW Smoothing
Mastery (5): Students demonstrate the implementation of the Holt-Winters smoothing method in R, providing well-commented code that clearly explains each step of the algorithm. They correctly specify the necessary parameters, including trend and seasonality components.
Incomplete (0): Students encounter difficulties in accurately implementing the Holt-Winters smoothing method in R. Their code may lack sufficient comments or clarity, making it challenging to understand the implementation process. Additionally, they may overlook important parameters or make errors in the application of the method, leading to inaccuracies in the results.

Question 3b: Parameter Choice
Mastery (10): Responses not only specify the chosen parameter values for $\alpha$, $\beta$, and $\gamma$ in the context of the Holt-Winters smoothing method but also correctly identify the purpose of each parameter in their explanation. They provide a thorough justification for each parameter choice, considering factors such as the data characteristics, seasonality patterns, and the desired level of smoothing.
Incomplete (0): Students struggle to clearly specify the chosen parameter values for $\alpha$, $\beta$, and $\gamma$. It is not clear that they understand the purpose of each parameter in their explanation. They may provide limited or vague justification for each parameter choice, lacking consideration of important factors such as data characteristics or seasonality patterns.

Question 3c: Forecast Plot
Mastery (5): Responses effectively create a plot of the Holt-Winters forecast for the next 12 months superimposed against the original series in R. The forecasted values align with the original series and display relevant trends and seasonality patterns. Additionally, they appropriately label the axes, title the plot, and provide a clear legend to distinguish between the original series and the forecast. The plot closely resembles Figure 7 in the Time Series Notebook.
Incomplete (0): Students encounter challenges in creating a plot of the Holt-Winters forecast. They may struggle with accurately implementing the plotting code, resulting in inaccuracies or inconsistencies in the plotted forecast. Additionally, their plot may lack proper labeling of the axes, a title, or a legend, making it difficult to interpret the information presented. Furthermore, their plot may deviate significantly from Figure 7 in the Time Series Notebook.

Question 3d: Trend Evaluation
Mastery (5): The submission demonstrates a clear understanding of the distinction between deterministic and stochastic trends and provides a reasoned argument for its assessment based on the observed data properties. It provides an evaluation of the data characteristics, considering factors such as the presence of consistent patterns or irregular fluctuations over time. Analyses involve visual inspections to identify any discernible patterns or randomness in the trend.
Incomplete (0): The student offers limited insights into the data characteristics, lacking consideration of relevant factors such as patterns or fluctuations over time. Additionally, their evaluation may lack depth or coherence. No plots are drawn to evaluate the trend.

Total Points 60