Time Series Homework: Chapter 2 Lesson 2

Please_put_your_name_here

Data

Code
manu_inv <- rio::import("https://byuistats.github.io/timeseries/data/manu_mat_invent.csv")

Questions

Question 1 - Context and Measurement (10 points)

The first part of any time series analysis is context. You cannot properly analyze data without knowing what the data is measuring. Without context, even the simplest features of data can be obscure and inscrutable. This homework assignment centers on the series below.

Please research the time series. In the spaces below, give the data collection process, unit of analysis, and meaning of each observation for the series.

Manufacturers’ Materials and Supplies Inventories

https://fred.stlouisfed.org/series/UMTMMI

Answer

Data Collection Process: The Manufacturers’ Shipments, Inventories, and Orders (M3) Survey is conducted by the U.S. Census Bureau. This survey collects data from manufacturers across a broad range of industries in the United States. The goal is to measure the value of shipments, inventories, and unfilled orders, along with new orders received. The data is gathered directly from manufacturers on a monthly basis and is reported in millions of dollars. It provides critical information regarding industrial activity and business conditions.

Unit of Analysis: The unit of analysis for this time series is millions of dollars. Each data point represents a specific measure of activity in the U.S. manufacturing sector, such as the total value of shipments, inventories, or orders. These measures are reported monthly at the end of the period, reflecting the economic conditions in the manufacturing industry during that month.

Meaning of Each Observation: Each observation in the time series represents the total value (in millions of dollars) of a specific manufacturing activity at the end of a particular month. The survey tracks several categories:

Shipments: The total dollar value of products shipped by manufacturers.

Inventories: The value of products held in inventory by manufacturers.

Orders: New orders received and unfilled orders for manufactured goods.

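The series used in this assignment, Manufacturers’ Materials and Supplies Inventories, belongs to the inventories category. This data helps to assess the current state and future trends in the U.S. manufacturing sector.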

Question 2 - Manufacturer’s Inventory: Autocorrelation and autocovariance (10 points)

a) Please calculate the list of autocorrelation and autocovariance values for the Manufacturer’s Inventory series.

Answer
Here is simple code for the autocovariances (as in the textbook):
# Autocovariances 
acf(manu_inv$manu_inv, plot=FALSE, type = "covariance")

Autocovariances of series 'manu_inv$manu_inv', by lag

       0        1        2        3        4        5        6        7 
2.39e+09 2.36e+09 2.33e+09 2.30e+09 2.27e+09 2.24e+09 2.20e+09 2.16e+09 
       8        9       10       11       12       13       14       15 
2.13e+09 2.09e+09 2.05e+09 2.01e+09 1.97e+09 1.93e+09 1.89e+09 1.85e+09 
      16       17       18       19       20       21       22       23 
1.81e+09 1.77e+09 1.73e+09 1.69e+09 1.65e+09 1.61e+09 1.58e+09 1.54e+09 
      24       25 
1.51e+09 1.48e+09 
Here is a different approach that outputs the autocorrelations in a tabular format:
### broom::tidy() converts the list output of acf() into a tidy data frame (similar to pander).
### knitr::kable() then displays and captions that data frame in markdown.

# Autocorrelations Table
knitr::kable(broom::tidy(acf(manu_inv$manu_inv, plot = FALSE, type = "correlation")),
             caption = "Autocorrelations", digits = 3)
Autocorrelations
lag acf
0 1.000
1 0.988
2 0.976
3 0.963
4 0.949
5 0.934
6 0.920
7 0.904
8 0.889
9 0.873
10 0.856
11 0.840
12 0.824
13 0.807
14 0.789
15 0.772
16 0.755
17 0.738
18 0.721
19 0.704
20 0.689
21 0.674
22 0.659
23 0.645
24 0.632
25 0.619
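
As a check on what acf() is doing, here is a minimal sketch that computes the sample autocovariance by hand, \(c_k = \frac{1}{n}\sum_{t=1}^{n-k}(x_t-\bar{x})(x_{t+k}-\bar{x})\), and compares it with the built-in output. The helper name sample_acvf and the simulated series x are assumptions made for illustration; they are not part of the assignment data.

```r
# Sample autocovariance at lag k, using the 1/n divisor that stats::acf() uses
sample_acvf <- function(x, k) {
  n <- length(x)
  x_bar <- mean(x)
  sum((x[1:(n - k)] - x_bar) * (x[(1 + k):n] - x_bar)) / n
}

# Simulated stand-in series (an assumption; not the inventory data)
set.seed(42)
x <- cumsum(rnorm(120))

lags <- 0:5
manual_acvf <- sapply(lags, function(k) sample_acvf(x, k))
manual_acf  <- manual_acvf / manual_acvf[1]  # r_k = c_k / c_0

# Built-in values for the same lags, flattened from acf()'s 3-d array
builtin_acvf <- drop(acf(x, lag.max = 5, type = "covariance", plot = FALSE)$acf)
builtin_acf  <- drop(acf(x, lag.max = 5, type = "correlation", plot = FALSE)$acf)

all.equal(manual_acvf, builtin_acvf)
all.equal(manual_acf, builtin_acf)
```

Because acf() divides by n while var() divides by n - 1, the lag-0 autocovariance is always slightly smaller than the value var() reports for the same series.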

b) If autocovariance and autocorrelation are trying to evaluate a similar linear relationship across time in our series, why do we get different values for autocorrelation and autocovariance at the same lag?

Answer

We get different values for autocorrelation and autocovariance at the same lag, even though both evaluate the linear relationship across time in a series, because of normalization.

The autocovariance depends on the size and scale of the data: if the time series has large numbers, the autocovariance will be correspondingly larger. It is expressed in the squared units of the series (here, millions of dollars squared).

The autocorrelation divides the autocovariance at each lag by the lag-0 autocovariance, \(r_k = c_k / c_0\), which produces a dimensionless number between -1 and 1. For example, at lag 1 the values above give 2.36e+09 / 2.39e+09, or about 0.99, which agrees with the tabulated 0.988 up to rounding. This puts the autocovariance on a scale that allows comparison across different datasets or lags. So, while both measure the linear relationship at a given lag, autocovariance reflects the absolute magnitude of that relationship, while autocorrelation reflects its relative strength on a standardized scale.
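
A short sketch of the scale dependence, using a simulated series rather than the inventory data (the series x and the factor of 1000 are assumptions chosen for illustration): re-expressing the same series in units 1000 times smaller multiplies every autocovariance by 1000 squared but leaves the autocorrelations unchanged.

```r
set.seed(7)
x <- cumsum(rnorm(150))  # simulated series in arbitrary units
x_scaled <- 1000 * x     # the same series expressed in units 1000 times smaller

# Autocovariances at lags 0 to 3 for both versions of the series
acvf_x      <- drop(acf(x, lag.max = 3, type = "covariance", plot = FALSE)$acf)
acvf_scaled <- drop(acf(x_scaled, lag.max = 3, type = "covariance", plot = FALSE)$acf)

# Autocorrelations at lags 0 to 3 for both versions of the series
acf_x      <- drop(acf(x, lag.max = 3, plot = FALSE)$acf)
acf_scaled <- drop(acf(x_scaled, lag.max = 3, plot = FALSE)$acf)

acvf_scaled / acvf_x          # each ratio is 1000^2, up to floating-point error
all.equal(acf_x, acf_scaled)  # the autocorrelations do not depend on the units
```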

Question 3 - Manufacturer’s Inventory: Stationary (20 points)

Weak stationarity is a form of stationarity important for the analysis of time series data. A time series is said to be weakly stationary if its statistical properties such as mean, variance, and autocovariance are constant over time. Here are the key components of weak stationarity:

Constant Mean: The mean of the time series remains constant over time. This doesn’t necessarily mean that the time series is centered around zero; it just implies that the average value remains the same throughout the observed period.

Constant Variance: The variance of the time series is uniform across all time points. Like the mean, this doesn’t imply that the variance must be zero, just that it doesn’t change systematically with time.

Constant Autocovariance: The autocovariance between any two observations of the time series depends only on the time lag between them and not on the absolute positions of the observations in time. This implies that the dependence structure of the time series remains constant over time.
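
In symbols, writing \(\gamma_k\) for the lag-\(k\) autocovariance (notation introduced here for illustration, not taken from the assignment), the three conditions can be summarized as:

\[
E[x_t] = \mu, \qquad E[(x_t-\mu)^2] = \sigma^2, \qquad E[(x_t-\mu)(x_{t+k}-\mu)] = \gamma_k \quad \text{for every } t \text{ and every lag } k.
\]

None of the right-hand sides depends on \(t\), and setting \(k=0\) in the third condition recovers the second.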

a) Please split the time series into two halves according to the date recorded: the earlier half of the data and the later half of the data. Calculate the mean, variance, and autocovariance for each half. Note: the split does not need to be precisely half; an approximate middle is sufficient.

Answer
Code
library(dplyr)  # provides %>% and filter()

# Split the series at the median date: earlier half vs. later half
median_date <- median(manu_inv$date)

df_early <- manu_inv %>%
  filter(date <= median_date)

df_late <- manu_inv %>%
  filter(date > median_date)

# Mean, variance, and autocovariances for the earlier half
mean_early <- mean(df_early$manu_inv)
variance_early <- var(df_early$manu_inv)
acf_early <- acf(df_early$manu_inv, type = "covariance", plot = FALSE)

# Mean, variance, and autocovariances for the later half
mean_late <- mean(df_late$manu_inv)
variance_late <- var(df_late$manu_inv)
acf_late <- acf(df_late$manu_inv, type = "covariance", plot = FALSE)

cat("Early Half:\n",
    "Mean:", mean_early, "\n",
    "Variance:", variance_early, "\n")
Early Half:
 Mean: 187241.2 
 Variance: 2356124713 
Code
acf_early

Autocovariances of series 'df_early$manu_inv', by lag

       0        1        2        3        4        5        6        7 
2.34e+09 2.28e+09 2.22e+09 2.15e+09 2.07e+09 2.00e+09 1.92e+09 1.84e+09 
       8        9       10       11       12       13       14       15 
1.76e+09 1.69e+09 1.61e+09 1.55e+09 1.49e+09 1.42e+09 1.36e+09 1.33e+09 
      16       17       18       19       20       21       22 
1.29e+09 1.26e+09 1.23e+09 1.21e+09 1.18e+09 1.16e+09 1.13e+09 
Code
cat("Late Half:\n",
    "Mean:", mean_late, "\n",
    "Variance:", variance_late, "\n")
Late Half:
 Mean: 188964.2 
 Variance: 2453863580 
Code
acf_late

Autocovariances of series 'df_late$manu_inv', by lag

       0        1        2        3        4        5        6        7 
2.44e+09 2.38e+09 2.31e+09 2.24e+09 2.17e+09 2.10e+09 2.02e+09 1.94e+09 
       8        9       10       11       12       13       14       15 
1.86e+09 1.77e+09 1.69e+09 1.61e+09 1.54e+09 1.49e+09 1.43e+09 1.39e+09 
      16       17       18       19       20       21       22 
1.34e+09 1.30e+09 1.27e+09 1.24e+09 1.22e+09 1.19e+09 1.17e+09 

b) Is there evidence to suggest that the Manufacturer’s Inventory series is weakly stationary?

Answer

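The mean, variance, and autocovariance values are relatively consistent across the two halves of the data: the means are 187241.2 and 188964.2, the variances are about 2.36e+09 and 2.45e+09, and the autocovariances decline at a similar rate with lag (from 2.34e+09 to 1.13e+09 in the early half and from 2.44e+09 to 1.17e+09 in the late half). On this evidence it is reasonable to state that the series exhibits weak stationarity. The differences that do appear are small relative to the scale of the series and do not clearly indicate non-stationarity.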
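
To get a feel for how the split-half check behaves, here is a hedged sketch that applies it to two simulated series whose properties are known in advance. The helper names half_stats and compare_halves, the series lengths, and the seed are all assumptions made for illustration; none of this uses the inventory data.

```r
# Summarize one stretch of a series: mean, variance, and lag-1 autocovariance
half_stats <- function(x) {
  c(mean = mean(x),
    variance = var(x),
    acvf_lag1 = acf(x, lag.max = 1, type = "covariance", plot = FALSE)$acf[2])
}

# Split a series at its midpoint and put the two halves side by side
compare_halves <- function(x) {
  mid <- floor(length(x) / 2)
  rbind(early = half_stats(x[1:mid]),
        late  = half_stats(x[(mid + 1):length(x)]))
}

set.seed(123)
white_noise <- rnorm(400)          # weakly stationary by construction
random_walk <- cumsum(rnorm(400))  # not stationary: its variance grows with t

wn_halves <- compare_halves(white_noise)
rw_halves <- compare_halves(random_walk)

wn_halves
rw_halves
```

For the white noise series the two rows should look alike, while the rows for the random walk can differ noticeably. A single split of a single realization is still limited evidence in either direction.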

c) The variance function for a time series, \(\sigma^2(t)=E[(x_t-\mu)^2]\), is defined for the entire ensemble. Why is determining whether a time series has constant variance so difficult using sample data?

Answer

The core difficulty in assessing constant variance lies in distinguishing between true changes in the variance function over time (heteroscedasticity) and random fluctuations inherent in a single realization, even if the underlying variance function is constant (homoscedasticity).

Here are two examples of how you could explain or present this in greater detail:
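
As one hedged illustration on simulated data (not the assignment series): the ensemble variance at time t averages over many possible realizations, but a sample gives only one observation at each t. Any variance estimate therefore has to pool neighboring time points, and those pooled estimates fluctuate even when the true variance is constant. The window length of 30 and the seed below are arbitrary choices made for this sketch.

```r
set.seed(2024)
x <- rnorm(300, mean = 0, sd = 1)  # true variance is 1 at every time point

window <- 30  # hypothetical window length chosen for illustration
starts <- seq_len(length(x) - window + 1)

# Sample variance within each sliding window of 30 consecutive observations
rolling_var <- sapply(starts, function(i) var(x[i:(i + window - 1)]))

range(rolling_var)  # the estimates wander although the true variance is constant
```

Plotting rolling_var against starts shows apparent rises and falls in variance that come entirely from sampling variability.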

Rubric

Question 1: Context and Measurement
Mastery (10): The student thoroughly researches the data collection process, unit of analysis, and meaning of each observation for the requested time series. Clear and comprehensive explanations are provided.
Incomplete (0): The student does not adequately research or provide information on the data collection process, unit of analysis, and meaning of each observation for the specified series.

Question 2a: Autocorrelation and Covariance
Mastery (5): The student correctly computes the autocorrelation and autocovariance values for the Manufacturer’s Inventory series using R. The R code is well-commented and structured, facilitating understanding of each step in the calculation process. Results are presented clearly.
Incomplete (0): The student attempts to compute autocorrelation and autocovariance values for the Manufacturer’s Inventory series, but significant errors are present in the computations. The R code lacks clear documentation, with unclear or missing comments that hinder comprehension of the calculation process. Presentation of results may be confusing or incomplete, making it challenging to interpret the autocorrelation and autocovariance values accurately.

Question 2b: Theoretical understanding
Mastery (5): The student provides a clear and accurate explanation of why different values are obtained for the same lag of the autocorrelation and autocovariance estimates. The explanation demonstrates a solid understanding of the underlying concepts.
Incomplete (0): The student attempts to explain why different values are obtained for the same lag of the autocorrelation and autocovariance estimates but does so with significant inaccuracies or lack of clarity. The explanation may lack coherence or fail to address key differences between autocorrelation and autocovariance adequately.

Question 3a: Stationarity Calculations
Mastery (5): The student accurately splits the dataset into two parts and calculates the mean, variance, and autocovariance for each part using R. The R code is well-commented, providing clear explanations of the steps taken to perform the analysis. The calculated statistics are presented clearly, aiding interpretation of the results, and the student shows a solid understanding of the concepts involved in analyzing time series data.
Incomplete (0): The student attempts to split the dataset into two parts and calculate the mean, variance, and autocovariance for each part using R, but does so with significant errors or inaccuracies. The R code lacks clear and sufficient commenting, making it difficult to understand the steps taken in the analysis. The calculated statistics may be presented poorly or inaccurately, indicating a limited understanding of the concepts involved in analyzing time series data.

Question 3b: Evaluation
Mastery (5): The student assesses whether there is evidence to suggest that the Manufacturer’s Inventory series is weakly stationary. The analysis is supported by clear and concise explanations, demonstrating a solid understanding of the concept of weak stationarity.
Incomplete (0): The student attempts to assess whether the Manufacturer’s Inventory series is weakly stationary but does so with significant errors or lacks clarity in their analysis. There may be inaccuracies in the methodology or misinterpretation of results, indicating a limited understanding of weak stationarity.

Question 3c: Evaluation
Mastery (10): The student understands the definition and application of a time series variance function to an ensemble.
Incomplete (0): The submission doesn’t provide enough evidence of understanding of the definition and application of the variance function.

Total Points: 40