Time Series Homework: Chapter 1 Lesson 2

Please_put_your_name_here

Data

# Macroeconomic Data: unemployment rate
unemp_rate <- rio::import("https://byuistats.github.io/timeseries/data/unemp_rate.csv")

Questions

Question 1 - Context and Measurement (10 points)

The first part of any time series analysis is context. You cannot properly analyze data without knowing what the data is measuring. Without context, the most simple features of data can be obscure and inscrutable. This homework assignment will center around the US unemployment series.

Please research the time series. The subheadings below has a link to a source to help you get started. In the spaces below, give the data collection process, unit of analysis, and meaning of each observation for the US unemployment time series.

a) US unemployment

https://www.bls.gov/cps/cps_htgm.htm

Answer

Data collection process: The primary method for collecting US unemployment data is the Current Population Survey (CPS). This is a monthly survey conducted by the U.S. Census Bureau on behalf of the Bureau of Labor Statistics (BLS).

Sampling: The CPS uses a large rotating sample of approximately 60,000 eligible households (about 110,000 individuals) selected to represent the entire US population.

Unit of Analysis: The unit of analysis is an individual person within the sampled households. Each person aged 16 and older is classified into one of the categories: employed, unemployed, or not in the labor force.

Meaning of each Observation: A given observation represents the percentage of unemployed persons within the labor force during any given month. \[ \frac{Unemployed}{Employed+Unemployed}\]

Question 2 - Estimating the Trend: Annual Aggregation (10 points)

Please plot the US Unemployment time series and superimpose the annual mean of the series in the same graph. Use the appropriate axis labels, units, and captions.

Answer
###########################3 cut in half
# extracts the year out of the date
unemp_rate$Year <- year(unemp_rate$date)

# Calculate the mean value by year
annual_mean <- unemp_rate %>%
  group_by(Year) %>%
  summarise(Annual_Mean = mean(value))


# plots the data and the annual mean in ggplot
ggplot(data= unemp_rate %>% filter(date >= "1980-01-01" & date <= "2023-12-01" )) +
  geom_line(aes(x = date, y = value), color = "black", linetype = "solid", size = 0.5) +
    geom_line(data = annual_mean %>% filter(Year >=1980), aes(x = as.Date(paste0(Year, "-07-01")), y = Annual_Mean), 
            color = "blue", linetype = "solid", size = 0.7) +
  labs(title = "US Unemployment Rate Over Time with Annual Mean",
       subtitle = "1980-2023",
       x = "Date", y = "Unemployment Rate (%)") +
    theme_minimal() +
    labs(caption = "Blue Line: Annual Mean \nSolid black Line: Monthly Unemployment Rate") +
    scale_y_continuous(labels = scales::percent_format(scale = 1))
Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` instead.

Question 3 - Trend Analysis (30 points)

a) Describe the US Unemployment time series and its trend. Comment on the time series trend characteristics, seasonal variation, and cycle.
Answer

Trend Characteristics

This series exhibits significant stochastic patterns and elements. While the series fluctuates significantly, it generally fluctuates or trends towards the same value, approximately 5%, and does not exhibit a significant long-term upward or downward movement. This makes sense when we connect the context of our data, unemployment rates cannot realistically be 0% and rarely if ever surpass 20%, it is generally accepted that a natural unemployment rate or static level of roughly 5% exists.

Seasonal Variation

It is difficult to understand or visualize seasonality when we have a large number of seasons contained within our full timeframe.

By “zooming” into our data (filtering to a smaller timeframe) and looking at just a 6 year period we can clearly see annual seasonality with two seasonal periods per year (once around Jan and again around July). This process is likely tied to seasonal hiring in Sep-Nov for holiday work (lowering unemployment) & an influx of recent graduates in Apr/May (increasing unemployment).

Here are two examples of how you can visualize or present this:

Cycle

Referencing the full series displayed in Q2 significant and dynamic cyclical patterns can be observed, they were more frequent during the early 1900’s but have slowed down and become more gradual after the 1970’s.

b) What does the trend represent? What do you suspect is causing the patterns in the trend? Hint: Research the Natural Unemployment Rate
Answer

The natural unemployment rate represents the baseline level of unemployment in a healthy economy, it is estimated to be roughly 4.5% and consists of frictional & structural unemployment but not cyclical unemployment. When the unemployment rate is above the natural rate, we would expect it to trend downward as the economy adjusts, and similarly, if it is below the natural rate, it is likely to trend upward. The remaining fluctuations are likely influenced by economic cycles, labor market conditions, and policy decisions and other unpredictable events.

c) Please justify whether the trend is deterministic or stochastic.
Answer

The trend of this series is a horizontal line at roughly 5%, and is therefore deterministic, the series as a whole however is stochastic. This is due to the overwhelming impact of economic cycles and other random shifts in the series.

Rubric

Criteria Mastery (10) Incomplete (0)
Question 1: Context and Measurement The student demonstrates a clear understanding of the context for the data series (US unemployment). The explanation includes details about the data collection process, unit of analysis, and the meaning of each observation. The student fails to provide a clear understanding of the context for one or more data series.
Mastery (10) Incomplete (0)
Question 2: Data Visualization The student creates clear and informative plots to visualize the time series and trend. The plot is professional and at a minimum includes the appropriate axis labels, units, and captions The student does not include visualizations or the visualizations provided are unclear and do not effectively represent the data. The plot doesn’t include one or more of the required components.
Mastery (10) Incomplete (0)
Question 3a: Data description The student uses appropriate technical language to describe the main features of time series data including sampling interval, time series trend, seasonal variation and cycle. The student does not use technical language to describe the main features of time series data or does not define one or more specified terms.
Mastery (10) Incomplete (0)
Question 3b: Causes of US Unemployment trend Demonstrates understanding of potential factors influencing the patterns in the trend. Their understanding of potential factors shows understanding of the underlying data, it’s source, and required independent research. Fails to demonstrate any understanding of potential factors influencing the patterns in the trend. Shows a lack of awareness of the underlying data, its source, and the need for independent research.
Mastery (10) Incomplete (0)
Question 3c: Trend Classification Shows understanding of the relevant concepts. Accurately justifies the trend’s classification. Shows a good understanding of the definitions deterministic and stochastic trends. Fails to demonstrate any understanding of the relevant concepts or provides no accurate justification for the trend’s classification.
Mastery (10) Incomplete (0)
General: R Programming Usage The student effectively utilizes R to complete the assignment. Code snippets or outputs are appropriately included to support the analysis. Code is annotated and commented enough for a third party to understand and evaluate easily. One or more code snippets or outputs are missing. Code is minimally annotated and commented, making it challenging for a third party to understand and evaluate. The code doesn’t work or render.
Total Points 60