11. Time Series Analysis
Time series analysis extracts temporal patterns from sequential data. pandas supports this work with datetime parsing, time-based indexing, resampling, and rolling-window calculations.
Parsing and Indexing Datetime
import pandas as pd
# Parse dates during load
df = pd.read_csv('sensor_readings.csv', parse_dates=['timestamp'], index_col='timestamp')
# Or convert existing column
df['datetime'] = pd.to_datetime(df['date_string'], format='%Y-%m-%d %H:%M:%S')
df.set_index('datetime', inplace=True)
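# Filter a date range with string labels; both endpoints are included.
# Sort the index first (df.sort_index()) if it is not already in time order.
march_to_june = df.loc['2024-03-01':'2024-06-30']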
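A small self-contained sketch of the same steps, using a hypothetical four-row table in place of a real file (the column names and values are made up for illustration). It shows that a date-only end label in a string slice covers that whole day.

```python
import pandas as pd

# Hypothetical readings; in practice these would come from a file.
raw = pd.DataFrame({
    "date_string": [
        "2024-02-28 23:00:00",
        "2024-03-01 00:00:00",
        "2024-06-30 18:30:00",
        "2024-07-01 00:00:00",
    ],
    "value": [1.0, 2.0, 3.0, 4.0],
})

# Convert the text column to datetimes and use it as a sorted index.
raw["datetime"] = pd.to_datetime(raw["date_string"], format="%Y-%m-%d %H:%M:%S")
indexed = raw.set_index("datetime").sort_index()

# String slices on a DatetimeIndex include both endpoints, and a
# date-only end label covers that entire day (18:30 on June 30 is kept).
subset = indexed.loc["2024-03-01":"2024-06-30"]
print(subset["value"].tolist())  # [2.0, 3.0]
```

Using .loc makes it explicit that the slice selects rows by label rather than by position.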
Resampling for Aggregation
# Aggregate to different frequencies
# Note: pandas 2.2+ prefers the aliases 'h' and 'ME'; older versions use 'H' and 'M'.
hourly_avg = df.resample('h').mean()  # Hourly average
daily_max = df.resample('D').max()    # Daily maximum
weekly_sum = df.resample('W').sum()   # Weekly total (weeks end on Sunday by default)
# Different aggregations per column
monthly_stats = df.resample('ME').agg({
    'temperature': ['mean', 'std', 'min', 'max'],
    'pressure': 'median'
})
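To see what a per-column aggregation returns, here is a minimal sketch on synthetic six-hourly readings (the values are invented for illustration). Passing a list of functions for one column produces a two-level column index in the result.

```python
import pandas as pd

# Synthetic readings every 6 hours over two days (hypothetical data).
idx = pd.date_range("2024-01-01", periods=8, freq=pd.Timedelta(hours=6))
df = pd.DataFrame(
    {
        "temperature": [10.0, 12.0, 14.0, 12.0, 11.0, 13.0, 15.0, 13.0],
        "pressure": [1000.0, 1002.0, 1001.0, 1003.0, 1005.0, 1004.0, 1006.0, 1007.0],
    },
    index=idx,
)

# One row per calendar day; columns become (column, function) pairs.
daily = df.resample("D").agg({"temperature": ["mean", "max"], "pressure": "median"})

print(daily[("temperature", "mean")].tolist())  # [12.0, 13.0]
print(daily[("pressure", "median")].tolist())   # [1001.5, 1005.5]
```

Columns of the result are addressed with tuples such as ("temperature", "mean").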
Rolling Windows for Smoothing
Rolling calculations reveal trends while filtering noise:
# 7-day rolling average
df['rolling_avg'] = df['value'].rolling(window='7D', min_periods=1).mean()
# Exponentially weighted moving average (EWMA): weights recent observations more heavily, so it reacts faster to changes
df['ewma'] = df['value'].ewm(span=7, adjust=False).mean()
# Rolling standard deviation for volatility
df['volatility'] = df['value'].rolling(window='30D', min_periods=15).std()
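The difference between the two smoothers is easiest to see on a tiny series. This sketch uses five hypothetical daily values; the 3-day window and span=3 are chosen only to keep the arithmetic easy to check by hand.

```python
import pandas as pd

# Five daily observations (hypothetical values).
idx = pd.date_range("2024-01-01", periods=5, freq="D")
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=idx)

# Time-based window: each row averages every observation in the
# preceding 3 days, including the current one.
rolling_avg = s.rolling(window="3D", min_periods=1).mean()

# EWMA with span=3 gives alpha = 2 / (span + 1) = 0.5. With adjust=False,
# y[0] = x[0] and y[t] = alpha * x[t] + (1 - alpha) * y[t-1].
ewma = s.ewm(span=3, adjust=False).mean()

print(rolling_avg.tolist())  # [1.0, 1.5, 2.0, 3.0, 4.0]
print(ewma.tolist())         # [1.0, 1.5, 2.25, 3.125, 4.0625]
```

On this rising series the EWMA ends closer to the latest value (4.0625 versus 4.0) because older observations are discounted rather than weighted equally.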
Decomposition
Decomposition separates a time series into trend, seasonal, and residual components:
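import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose
# Additive model: suits seasonal swings of roughly constant size.
# period is the number of observations per seasonal cycle; the series must not contain NaNs.
decomposition = seasonal_decompose(df['sales'], model='additive', period=30)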
fig, (ax1, ax2, ax3, ax4) = plt.subplots(4, 1, figsize=(12, 10))
decomposition.observed.plot(ax=ax1, title='Observed')
decomposition.trend.plot(ax=ax2, title='Trend')
decomposition.seasonal.plot(ax=ax3, title='Seasonal')
decomposition.resid.plot(ax=ax4, title='Residual')
plt.tight_layout()
plt.savefig('decomposition.png')
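To make the three components concrete, the sketch below performs a simplified classical additive decomposition by hand with pandas only. It is not the statsmodels implementation: it assumes an odd period, a noise-free synthetic series, and a weekly pattern invented for illustration. The idea is the same, though: estimate the trend with a centered moving average, average the detrended values at each position in the cycle, and treat whatever is left as residual.

```python
import numpy as np
import pandas as pd

period = 7
# Hypothetical weekly pattern; the effects sum to zero.
pattern = np.array([3.0, 1.0, 0.0, -1.0, -3.0, 0.0, 0.0])
t = np.arange(8 * period)
idx = pd.date_range("2024-01-01", periods=len(t), freq="D")
series = pd.Series(0.1 * t + pattern[t % period], index=idx)

# 1. Trend: centered moving average over one full period.
#    The first and last 3 values are NaN because the window is incomplete.
trend = series.rolling(window=period, center=True).mean()

# 2. Seasonal: mean of the detrended values at each position in the cycle,
#    re-centered so the seasonal effects sum to zero.
detrended = series - trend
position = pd.Series(t % period, index=idx)
seasonal_means = detrended.groupby(position).mean()
seasonal_means = seasonal_means - seasonal_means.mean()
seasonal = position.map(seasonal_means)

# 3. Residual: what trend and seasonality do not explain.
resid = series - trend - seasonal

print(seasonal_means.round(6).tolist())
print(resid.dropna().abs().max())
```

Because the synthetic series contains no noise, the recovered seasonal means match the pattern and the residual is zero up to floating-point error. On real data the residual shows how much variation the trend and seasonal terms leave unexplained.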
Autocorrelation Function (ACF)
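The ACF shows how strongly values correlate with their own past at each lag. The partial autocorrelation function (PACF), plotted alongside it, shows the correlation at each lag after removing the effect of shorter lags: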
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 4))
plot_acf(df['value'].dropna(), ax=ax1, lags=40)
plot_pacf(df['value'].dropna(), ax=ax2, lags=40)
plt.tight_layout()
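The numbers behind an ACF plot can be inspected directly. This sketch uses pandas Series.autocorr, which computes the Pearson correlation between the series and a lagged copy of itself. That estimator differs slightly from the one statsmodels plots, so values on real data will not match exactly. The sine wave with a 12-step cycle is synthetic.

```python
import numpy as np
import pandas as pd

# Synthetic series: a sine wave that repeats every 12 observations.
t = np.arange(120)
series = pd.Series(np.sin(2 * np.pi * t / 12))

# Correlation of the series with itself shifted by each lag.
lags = [1, 6, 12]
acf_values = {lag: series.autocorr(lag=lag) for lag in lags}

for lag, value in acf_values.items():
    print(f"lag {lag:2d}: {value:+.3f}")
```

The correlation is close to +1 at lag 12 (one full cycle) and close to -1 at lag 6 (half a cycle). Peaks at multiples of the cycle length are the signature of seasonality in an ACF plot.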
Local verification checkpoint
Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.
Load a timestamped dataset, decompose it using seasonal_decompose with period=24 (one daily cycle if the data is hourly), and interpret which components dominate the signal.