A comprehensive exploration of deterministic processes in time series forecasting using statsmodels, featuring trend modeling, seasonality, Fourier terms, custom components, and integration with AutoReg and SARIMAX models.
This project provides an in-depth treatment of modeling deterministic components in time series data, including trends, seasonality, and custom deterministic patterns. The implementation uses the `statsmodels.tsa.deterministic` module to create reproducible forecasting workflows with both in-sample fitting and out-of-sample prediction.
- Flexible Deterministic Modeling: Create deterministic processes with constants, polynomial trends, seasonal dummies, and Fourier terms
- Date Index Support: Seamless integration with `DatetimeIndex` and `PeriodIndex` for time-aware forecasting
- Custom Component Development: Extend base classes to create specialized deterministic terms for domain-specific patterns
- Model Integration: Direct support for `AutoReg` models and manual integration with `SARIMAX` and other exogenous-aware models
- Advanced Patterns: Implement broken trends, polynomial expansions, and interaction terms
- Diagnostic Tools: Collinearity checks and visualization utilities for deterministic components
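As an illustration of the date index support listed above, here is a minimal sketch assuming monthly data on a `DatetimeIndex`. The start date, sample length, and variable names are illustrative choices, not taken from the notebook.

```python
import pandas as pd
from statsmodels.tsa.deterministic import DeterministicProcess

# Assumed example data: 36 month-start dates (2020-01 through 2022-12)
index = pd.date_range("2020-01-01", periods=36, freq="MS")

det_proc = DeterministicProcess(
    index,
    constant=True,
    order=1,        # linear trend
    seasonal=True,  # monthly dummies
    period=12,      # set explicitly rather than inferred from the frequency
)

in_sample = det_proc.in_sample()
# The forecast rows carry dates that continue the index past the sample end
forecast = det_proc.out_of_sample(steps=6)
print(forecast.index[[0, -1]])
print(forecast[["const", "trend"]])
```

Because the index has a frequency, `out_of_sample` can extend the dates on its own, and the trend column keeps counting from where the sample stopped.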
Deterministic terms represent predictable, non-random patterns in time series data:
- Constant: Baseline level
- Time Trend: Linear, quadratic, or higher-order polynomial trends
- Seasonality: Periodic patterns captured via dummy variables or Fourier series
- Custom Terms: User-defined deterministic structures
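The term types above can be mixed in a single process. The sketch below assumes a quarterly dummy pattern combined with a longer 12-period Fourier cycle; the periods and orders are illustrative. Seasonal dummies and Fourier terms cannot both be requested through the constructor keywords, so the Fourier term is passed via `additional_terms`.

```python
import pandas as pd
from statsmodels.tsa.deterministic import DeterministicProcess, Fourier

index = pd.RangeIndex(0, 60)

det_proc = DeterministicProcess(
    index,
    constant=True,   # baseline level
    order=2,         # linear and quadratic trend
    seasonal=True,   # seasonal dummies with the period below
    period=4,
    additional_terms=[Fourier(period=12, order=2)],  # two sin/cos pairs
)

terms = det_proc.in_sample()
print(list(terms.columns))
```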
A time series $y_t$ can be decomposed into a deterministic component $\mu_t$ and a stochastic component. The deterministic component can be expressed as:

$$ \mu_t = \beta_0 + \beta_1 t + \sum_{j=1}^m \left[ \alpha_j \sin\left(\frac{2\pi j t}{P}\right) + \gamma_j \cos\left(\frac{2\pi j t}{P}\right) \right] + \sum_{k=1}^K \delta_k S_k(t) $$
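To connect the formula to the library, the following sketch rebuilds the constant, trend, and Fourier columns by hand and compares them with what `DeterministicProcess` generates. The values of `P`, `m`, and the coefficient vector are hypothetical, chosen only for illustration. Note that statsmodels counts the trend from 1 while the Fourier terms are evaluated from 0.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.deterministic import DeterministicProcess

P, m, nobs = 12, 2, 48  # assumed period, number of harmonics, sample size
index = pd.RangeIndex(0, nobs)

det_proc = DeterministicProcess(index, constant=True, order=1, fourier=m, period=P)
X = det_proc.in_sample()

# Manual construction of the same design matrix from the formula
t_trend = np.arange(1, nobs + 1)  # trend index starts at 1
t_cycle = np.arange(nobs)         # Fourier index starts at 0
columns = [np.ones(nobs), t_trend]
for j in range(1, m + 1):
    columns.append(np.sin(2 * np.pi * j * t_cycle / P))
    columns.append(np.cos(2 * np.pi * j * t_cycle / P))
manual = np.column_stack(columns)

matches = np.allclose(X.to_numpy(), manual)
print("manual matrix matches statsmodels:", matches)

# Hypothetical coefficients (beta_0, beta_1, alpha_1, gamma_1, alpha_2, gamma_2)
coef = np.array([10.0, 0.5, 2.0, -1.0, 0.3, 0.1])
mu = X.to_numpy() @ coef  # deterministic component mu_t for each t
```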
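```bash
pip install statsmodels pandas numpy matplotlib
```

```python
import pandas as pd
from statsmodels.tsa.deterministic import DeterministicProcess

index = pd.RangeIndex(0, 100)
det_proc = DeterministicProcess(
    index,
    constant=True,
    order=1,  # linear trend
    seasonal=True,
    period=5,
)
in_sample = det_proc.in_sample()  # design matrix for observations 0-99
forecast = det_proc.out_of_sample(steps=20)  # rows for the 20 steps after the sample
```

```python
import numpy as np
from statsmodels.tsa.deterministic import DeterministicTerm


class BrokenTimeTrend(DeterministicTerm):
    def __init__(self, break_period: int):
        self._break_period = break_period

    def __str__(self):
        return "Broken Time Trend"

    @property
    def _eq_attr(self):
        return (self._break_period,)  # attributes used for equality and hashing

    def in_sample(self, index: pd.Index):
        nobs = index.shape[0]
        terms = np.zeros((nobs, 2))
        terms[self._break_period:, 0] = 1  # level shift after the break
        terms[self._break_period:, 1] = np.arange(self._break_period + 1, nobs + 1)
        return pd.DataFrame(terms, columns=["const_break", "trend_break"], index=index)

    def out_of_sample(self, steps: int, index: pd.Index, forecast_index=None):
        fcast_index = self._extend_index(index, steps, forecast_index)
        nobs = index.shape[0]
        terms = np.ones((steps, 2))  # assumes the break occurred in sample
        terms[:, 1] = np.arange(nobs + 1, nobs + steps + 1)
        return pd.DataFrame(terms, columns=["const_break", "trend_break"], index=fcast_index)
```

```python
from statsmodels.tsa.api import AutoReg

# y is the observed series; det_proc must be built on y.index (200 observations assumed here)
mod = AutoReg(y, lags=1, trend="n", deterministic=det_proc)
res = mod.fit()
# Observations 0-199 are in sample, so 200-211 is a 12-step out-of-sample forecast
forecast = res.predict(start=200, end=211)
```

- Multi-Component Processes: Combining trends, multiple seasonal patterns, and Fourier terms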
- Date-Based Indexing: Working with temporal indices and date ranges
- Custom Term Development: Extending `DeterministicTerm` for specialized patterns
- Model Integration Strategies: Both direct (`AutoReg`) and manual (`SARIMAX`) approaches
- Diagnostic Validation: Condition number analysis for multicollinearity detection
- Visualization: Component-wise plotting for interpretability
- Alternative Approaches: Polynomial trends, interaction terms, and exogenous data wrapping
- Basic Usage: Foundational examples of deterministic process creation
- Date Index Handling: Temporal indexing and forecasting techniques
- Advanced Construction: Complex deterministic patterns and custom components
- Model Integration: Statistical model fitting and forecasting workflows
- Alternative Approaches: Extended methodologies and custom implementations
- Validation Tools: Diagnostic and visualization utilities
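One possible form of the condition number check is sketched below. It assumes a cubic trend, whose raw columns differ by orders of magnitude, and compares the condition number before and after standardizing the non-constant columns. The process settings are illustrative, and what counts as "too large" is left to the reader.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.deterministic import DeterministicProcess

index = pd.RangeIndex(0, 100)
det_proc = DeterministicProcess(index, constant=True, order=3, seasonal=True, period=5)
X = det_proc.in_sample()

# Condition number of the raw design matrix
cond_raw = np.linalg.cond(X.to_numpy())

# Standardize every column that varies (the constant is left as is)
varying = X.columns[X.std() > 0]
X_scaled = X.copy()
X_scaled[varying] = (X[varying] - X[varying].mean()) / X[varying].std()
cond_scaled = np.linalg.cond(X_scaled.to_numpy())

print(f"raw: {cond_raw:.3g}, standardized: {cond_scaled:.3g}")
```

A large gap between the two values suggests that much of the ill-conditioning comes from column scaling rather than from genuinely redundant terms.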
- Economic Forecasting: Modeling business cycles, trends, and seasonal effects
- Climate Time Series: Capturing annual and multi-year periodic patterns
- Retail Analytics: Weekly, monthly, and holiday seasonality modeling
- Energy Demand Forecasting: Daily and seasonal load pattern decomposition
- Financial Time Series: Trend analysis and periodic component extraction
- statsmodels >= 0.13.0
- pandas >= 1.3.0
- numpy >= 1.20.0
- matplotlib >= 3.4.0
- StatsModels Documentation: Deterministic Terms in Time Series Models
- Hyndman, R.J., & Athanasopoulos, G. (2021). Forecasting: Principles and Practice
- Box, G.E.P., Jenkins, G.M., Reinsel, G.C., & Ljung, G.M. (2015). Time Series Analysis: Forecasting and Control
This project is licensed under the MIT License - see the LICENSE file for details.
This notebook is designed for educational and research purposes. Real-world applications may require additional considerations and validation.