c02 <- rio::import("https://byuistats.github.io/timeseries/data/co2_mm_mlo.csv")
Time Series Homework: Chapter 5 Lesson 3
Please_put_your_name_here
Data
Questions
Question 1 - Context and Measurement (5 points)
The first part of any time series analysis is context. You cannot properly analyze data without knowing what the data is measuring. Without context, even the simplest features of the data can be obscure and inscrutable. This homework assignment centers on the series below.
Please research the time series. In the spaces below, give the data collection process, unit of analysis, and meaning of each observation for the series.
a) Atmospheric Carbon Dioxide
Question 2 - Seasonal Pattern Exploration (20 points)
a) Plot the Atmospheric Carbon Dioxide series.
b) Create a box plot of the seasonal variation in atmospheric CO2 measurements.
c) Please explain three likely factors that drive CO2 seasonal patterns.
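As a starting point for parts a) and b), here is a minimal base R sketch of a time plot and a month-by-month box plot. It runs on a synthetic series, not on the assignment data: the data frame `demo` and its columns `date`, `month`, `value`, and `detrended` are hypothetical names, and the numbers are simulated. The straight-line detrending before the box plot is one possible choice, not the method the notebook prescribes.

```r
# Synthetic monthly series standing in for the CO2 data (hypothetical values,
# NOT the Mauna Loa measurements): linear trend + annual cycle + noise.
set.seed(1)
n_months <- 120
t_index  <- seq_len(n_months)
month_no <- (t_index - 1) %% 12 + 1

demo <- data.frame(
  date  = seq(as.Date("2000-01-01"), by = "month", length.out = n_months),
  month = factor(month.abb[month_no], levels = month.abb),
  value = 370 + 0.15 * t_index + 3 * sin(2 * pi * t_index / 12) +
    rnorm(n_months, sd = 0.3)
)

# Time plot with explicit axis labels and a title
plot(demo$date, demo$value, type = "l",
     xlab = "Month", ylab = "Value (synthetic units)",
     main = "Synthetic monthly series")

# Remove a straight-line trend so the boxes reflect seasonality, not growth
demo$detrended <- resid(lm(value ~ as.numeric(date), data = demo))

# One box per calendar month
boxplot(detrended ~ month, data = demo,
        xlab = "Month", ylab = "Detrended value",
        main = "Seasonal variation by month")
```

With the real series, the same two calls would apply once the date and measurement columns of the imported file have been identified. Without detrending, the upward trend widens every box and can hide the seasonal shape.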
Question 3 - Model Selection: Additive Harmonic Seasonal Variables (50 points)
a) Using the atmospheric CO2 time series, please estimate a linear model with a linear trend and harmonic seasonal variables. Include three models: one with the complete set of harmonic variables and two with reduced sets of harmonic components. Please present your results in the result table format from Time Series Notebook Ch5 Lesson 3, including a column that identifies whether each variable is significant.
b) Using the atmospheric CO2 time series, please estimate a linear model with a quadratic trend and harmonic seasonal variables. Include three models: one with the complete set of harmonic variables and two with reduced sets of harmonic components. Please present your results in the result table format from Time Series Notebook Ch5 Lesson 3, including a column that identifies whether each variable is significant.
c) Using the atmospheric CO2 time series, please estimate a linear model with an exponential trend and harmonic seasonal variables. Include three models: one with the complete set of harmonic variables and two with reduced sets of harmonic components. Please present your results in the result table format from Time Series Notebook Ch5 Lesson 3, including a column that identifies whether each variable is significant.
d) Please use AIC, AICc, and BIC to help you argue for the best model to fit the atmospheric CO2 data. Please include a table similar to the one found in the Model Comparison section of Time Series Notebook Ch5 Lesson 3. Make sure you take into account the discussion, in the Time Series Notebook section on model selection, of the dangers of relying only on algorithms for model selection.
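The workflow in parts a) through d) can be sketched in base R as follows. This is an illustration under stated assumptions, not the notebook's own code: the series is synthetic, and the names `harm`, `fit_model`, `criteria`, `full_terms`, and `reduced_terms` are hypothetical. Harmonic regressors are built as sin(2πit/12) and cos(2πit/12) for i = 1, ..., 6 with t counted in months; the sixth sine term is zero at every integer month, so it is left out. AICc is computed as AIC + 2k(k+1)/(n − k − 1), where k counts all estimated parameters including the error variance.

```r
# Synthetic monthly series (hypothetical values, NOT the Mauna Loa data)
set.seed(42)
n_obs     <- 240
month_idx <- seq_len(n_obs)
y <- 350 + 0.12 * month_idx + 0.0002 * month_idx^2 +
  2.8 * sin(2 * pi * month_idx / 12) +
  0.8 * cos(2 * pi * 2 * month_idx / 12) +
  rnorm(n_obs, sd = 0.4)

# Standardized time index keeps the linear and quadratic terms well scaled
harm <- data.frame(y = y, tc = (month_idx - mean(month_idx)) / sd(month_idx))

# Harmonic regressors; sin6 is omitted because it is zero at integer months
for (i in 1:6) {
  harm[[paste0("cos", i)]] <- cos(2 * pi * i * month_idx / 12)
  if (i < 6) harm[[paste0("sin", i)]] <- sin(2 * pi * i * month_idx / 12)
}
full_terms    <- setdiff(names(harm), c("y", "tc"))
reduced_terms <- c("sin1", "cos1", "sin2", "cos2")

# Fit lhs ~ trend + harmonic terms by ordinary least squares
fit_model <- function(lhs, trend, terms) {
  rhs <- paste(c(trend, terms), collapse = " + ")
  lm(as.formula(paste(lhs, "~", rhs)), data = harm)
}

models <- list(
  linear_full    = fit_model("y", "tc", full_terms),
  linear_reduced = fit_model("y", "tc", reduced_terms),
  quad_full      = fit_model("y", c("tc", "I(tc^2)"), full_terms),
  quad_reduced   = fit_model("y", c("tc", "I(tc^2)"), reduced_terms)
)

# Exponential trend: linear trend fitted to the log of the series
exp_full <- fit_model("log(y)", "tc", full_terms)

# AIC, AICc, and BIC for one fitted model
criteria <- function(m) {
  k   <- attr(logLik(m), "df")  # parameters, including the error variance
  n   <- nobs(m)
  aic <- AIC(m)
  c(AIC = aic, AICc = aic + 2 * k * (k + 1) / (n - k - 1), BIC = BIC(m))
}
comparison <- as.data.frame(t(sapply(models, criteria)))
comparison <- comparison[order(comparison$AIC), ]
print(comparison)

# Coefficient table with a column flagging significance at the 5% level
coef_table <- as.data.frame(coef(summary(models$quad_full)))
coef_table$significant <- coef_table[["Pr(>|t|)"]] < 0.05
print(coef_table)
```

The log-response model `exp_full` is kept out of the comparison table on purpose: information criteria computed on `log(y)` are on a different scale from those computed on `y`, so the two are not directly comparable without an adjustment. The 5% threshold is likewise an assumption; use whatever level the notebook's result table uses.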
Rubric
Criteria | Mastery | Incomplete
Question 1: Context and Measurement | (5 points) The student thoroughly researches the data collection process, unit of analysis, and meaning of each observation for the requested time series. Clear and comprehensive explanations are provided. | (0 points) The student does not adequately research or provide information on the data collection process, unit of analysis, and meaning of each observation for the specified series.
Question 2a: Time Series Plot | (5 points) Students plot the Atmospheric Carbon Dioxide series, ensuring high-quality visualization with clear labels and titles. | (0 points) Submissions have low-quality visualizations or unclear labeling.
Question 2b: Box Plot | (5 points) Students create a box plot of the seasonal variation in CO2 atmospheric measurements, providing clear interpretation and labeling. | (0 points) Submissions have low-quality visualizations or unclear labeling.
Question 2c: Seasonal Patterns | (10 points) Students provide a clear and accurate explanation of three likely factors that drive CO2 seasonal patterns, demonstrating an understanding of the underlying time series and relevant environmental and ecological processes. | (0 points) Students provide incomplete or inaccurate explanations of factors that drive CO2 seasonal patterns or fail to demonstrate an understanding of the data generation process and relevant environmental and ecological processes.
Question 3a: Linear Trend Harmonic Seasonal Variables | (10 points) Students accurately estimate three linear models with a linear trend and harmonic seasonal variables, providing clear presentation of results including a column identifying significant variables. | (0 points) Students fail to estimate one or more of the models requested. The presentation of the results is unclear or incomplete, or fails to identify significant variables appropriately.
Question 3b: Quadratic Trend Harmonic Seasonal Variables | (10 points) Students accurately estimate three linear models with a quadratic trend and harmonic seasonal variables, providing clear presentation of results including a column identifying significant variables. | (0 points) Students fail to estimate one or more of the models requested. The presentation of the results is unclear or incomplete, or fails to identify significant variables appropriately.
Question 3c: Exponential Trend Harmonic Seasonal Variables | (10 points) Students accurately estimate three linear models with an exponential trend and harmonic seasonal variables, providing clear presentation of results including a column identifying significant variables. | (0 points) Students fail to estimate one or more of the models requested. The presentation of the results is unclear or incomplete, or fails to identify significant variables appropriately.
Question 3d: Model Selection | (20 points) Students effectively use AIC, AICc, and BIC to compare and evaluate models, presenting results in a clear table format similar to the one found in the Model Comparison section of the Time Series Notebook Ch5 Lesson 3. Their discussion of the nuances of model selection shows that they understand the importance of considering the context and data generating process as part of model specification. | (0 points) Students struggle to effectively use AIC, AICc, and BIC to compare and evaluate models, resulting in unclear or incomplete presentation of results or failure to address the dangers of relying solely on algorithms for model selection.
Total Points | 75