11. Time Series Analysis

Chapter 11 of 18 · 20 min

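Time series analysis extracts temporal patterns, such as trend, seasonality, and autocorrelation, from data ordered in time. Pandas handles datetime parsing, indexing, resampling, and rolling windows; statsmodels adds decomposition and autocorrelation plots.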

Parsing and Indexing Datetime

import pandas as pd

# Parse dates during load and use them as the index
df = pd.read_csv('sensor_readings.csv', parse_dates=['timestamp'], index_col='timestamp')

# Or convert existing column
df['datetime'] = pd.to_datetime(df['date_string'], format='%Y-%m-%d %H:%M:%S')
df.set_index('datetime', inplace=True)

# Select a date range by label (both endpoints are included; the index should be sorted)
date_range_data = df.loc['2024-03-01':'2024-06-30']
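
To make the parsing and slicing steps concrete, here is a minimal sketch that runs without any file on disk. The CSV text and its column names are hypothetical stand-ins for sensor_readings.csv, chosen only to show which rows a date-string slice keeps.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for sensor_readings.csv
csv_text = """timestamp,temperature
2024-02-28 12:00:00,18.5
2024-03-01 00:00:00,19.0
2024-04-15 06:30:00,21.2
2024-06-30 23:59:00,25.1
2024-07-01 00:00:00,25.4
"""

# parse_dates converts the column to datetimes; index_col makes it the index
df = pd.read_csv(io.StringIO(csv_text), parse_dates=['timestamp'], index_col='timestamp')

# Sorting keeps label-based slicing on the index well defined
df = df.sort_index()

# Date-string slicing includes both endpoints, and '2024-06-30'
# covers the whole day, so the 23:59 reading is kept
selected = df.loc['2024-03-01':'2024-06-30']
print(selected)
```

The February and July rows fall outside the slice, so three rows remain.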

Resampling for Aggregation

# Aggregate to different frequencies
hourly_avg = df.resample('h').mean()           # Hourly average ('H' is deprecated in pandas 2.2+)
daily_max = df.resample('D').max()             # Daily maximum
weekly_sum = df.resample('W').sum()            # Weekly total (weeks end on Sunday by default)

# Different aggregations per column
# 'ME' = month end (pandas 2.2+; use 'M' on older versions)
monthly_stats = df.resample('ME').agg({
    'temperature': ['mean', 'std', 'min', 'max'],
    'pressure': 'median'
})
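
The following sketch shows what resampling does to a small synthetic series, so the aggregated values can be checked by hand. The data and the `readings` name are assumptions made for illustration; they are not part of the chapter's dataset.

```python
import numpy as np
import pandas as pd

# Synthetic readings: one value every 15 minutes for two full days (hypothetical data)
index = pd.date_range('2024-03-01', periods=192, freq='15min')
readings = pd.DataFrame({'temperature': np.arange(192, dtype=float)}, index=index)

# Each day becomes one row; a list of functions yields one output column per function
daily = readings.resample('D').agg({'temperature': ['mean', 'min', 'max']})

# size() reports how many raw rows fell into each bin
counts = readings.resample('D').size()

print(daily)
print(counts)
```

Each day holds 96 readings, so the first day's mean is the mean of 0 through 95, which is 47.5. Checking bin counts this way is a quick guard against partially filled periods skewing an aggregate.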

Rolling Windows for Smoothing

Rolling calculations reveal trends while filtering noise:

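# 7-day rolling average (time-based windows require a sorted DatetimeIndex)
df['rolling_avg'] = df['value'].rolling(window='7D', min_periods=1).mean()

# Exponentially weighted moving average (EWMA): weights recent points more heavily,
# so it responds faster to changes. span=7 counts observations, not days.
df['ewma'] = df['value'].ewm(span=7, adjust=False).mean()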

# Rolling standard deviation for volatility
df['volatility'] = df['value'].rolling(window='30D', min_periods=15).std()
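
A time-based window such as '7D' and an integer window behave differently when the data has gaps. The sketch below uses a tiny hypothetical series with missing days to show the difference; the values are chosen so the results are easy to verify by hand.

```python
import pandas as pd

# Hypothetical daily values with a gap (no rows for Jan 4 through Jan 8)
index = pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-03',
                        '2024-01-09', '2024-01-10'])
s = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0], index=index)

# Time-based window: averages only the rows inside the last 3 calendar days
by_time = s.rolling(window='3D', min_periods=1).mean()

# Count-based window: averages the last 3 rows, however far apart they are
by_count = s.rolling(window=3, min_periods=1).mean()

# EWMA with adjust=False follows y[t] = (1 - a) * y[t-1] + a * x[t], a = 2 / (span + 1)
ewma = s.ewm(span=3, adjust=False).mean()

print(pd.DataFrame({'value': s, 'by_time': by_time,
                    'by_count': by_count, 'ewma': ewma}))
```

On Jan 9 the time-based average is 40 because no other row lies within the previous three days, while the count-based average is 30 because it reaches back across the gap to Jan 2 and Jan 3.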

Decomposition

Separate time series into trend, seasonal, and residual components:

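import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

# Additive model: suits seasonal swings that stay roughly constant in size.
# period=30 assumes daily data with a roughly monthly cycle.
# The series must be regularly spaced and contain no missing values.
decomposition = seasonal_decompose(df['sales'], model='additive', period=30)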

fig, (ax1, ax2, ax3, ax4) = plt.subplots(4, 1, figsize=(12, 10))
decomposition.observed.plot(ax=ax1, title='Observed')
decomposition.trend.plot(ax=ax2, title='Trend')
decomposition.seasonal.plot(ax=ax3, title='Seasonal')
decomposition.resid.plot(ax=ax4, title='Residual')
plt.tight_layout()
plt.savefig('decomposition.png')
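
To build intuition for what an additive decomposition computes, here is a simplified hand-rolled version using only pandas and NumPy. It is a sketch of the classical moving-average idea, not a reimplementation of seasonal_decompose, and its output may differ in detail from the statsmodels result. The synthetic series, the weekly period, and all variable names are assumptions.

```python
import numpy as np
import pandas as pd

period = 7                      # assumed weekly cycle in daily data
n = 10 * period                 # ten full cycles
index = pd.date_range('2024-01-01', periods=n, freq='D')

# Synthetic series: linear trend plus a repeating weekly pattern, no noise
pattern = np.array([3.0, 1.0, 0.0, -1.0, -3.0, -2.0, 2.0])  # sums to zero
series = pd.Series(0.5 * np.arange(n) + np.tile(pattern, n // period), index=index)

# 1. Trend: centered moving average over one full period
#    (an odd period keeps the window symmetric; edges become NaN)
trend = series.rolling(window=period, center=True).mean()

# 2. Seasonal: average the detrended values at each position in the cycle
detrended = series - trend
position = np.arange(n) % period
seasonal_means = detrended.groupby(position).mean()
seasonal_means -= seasonal_means.mean()   # center the pattern on zero
seasonal = pd.Series(seasonal_means.to_numpy()[position], index=index)

# 3. Residual: what trend and seasonal together do not explain
resid = series - trend - seasonal

print(seasonal_means.round(3))
print(resid.dropna().abs().max())
```

Because the synthetic series contains no noise, the recovered seasonal pattern matches the one that was built in and the residual is essentially zero. On real data, a large or structured residual suggests that the chosen period or the additive assumption does not fit well.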

Autocorrelation Function (ACF)

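The ACF shows how strongly values correlate with their own past at each lag. The partial autocorrelation function (PACF) shows the correlation at a given lag after the effect of shorter lags has been removed: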

import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 4))
plot_acf(df['value'].dropna(), ax=ax1, lags=40)
plot_pacf(df['value'].dropna(), ax=ax2, lags=40)
plt.tight_layout()
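
The numbers behind an ACF plot can be inspected directly with pandas. The sketch below uses Series.autocorr on a hypothetical signal with a known 12-step cycle. The exact values may differ slightly from those plot_acf draws, since the two use slightly different estimators.

```python
import numpy as np
import pandas as pd

# Synthetic signal: 12-step sine cycle plus a little noise (hypothetical data)
rng = np.random.default_rng(0)
t = np.arange(240)
values = pd.Series(np.sin(2 * np.pi * t / 12) + 0.1 * rng.standard_normal(240))

# autocorr(lag) is the Pearson correlation between the series
# and a copy of itself shifted by `lag` steps
acf_by_lag = {lag: values.autocorr(lag=lag) for lag in (1, 6, 12)}

for lag, r in acf_by_lag.items():
    print(f"lag {lag:2d}: {r:+.2f}")
```

The correlation is strongly positive at lag 12 (one full cycle) and strongly negative at lag 6 (half a cycle). Peaks at multiples of a lag in an ACF plot are a common hint for choosing the period argument of a decomposition.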

Local verification checkpoint

Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.

EXERCISE

Load a timestamped dataset, decompose it using seasonal_decompose with period=24 (one daily cycle if the data is hourly), and interpret which of the trend, seasonal, and residual components dominates the signal.