Lesson 4 - Reading Cyclepath's ACF and PACF

Welcome to Reading Cyclepath’s ACF and PACF

Every tool this module built is about to get used at once. Lesson 1 defined the ACF and PACF. Lesson 2 gave you the AR and MA signatures on data where the answer was already known. Lesson 3 gave you the significance band and, more importantly, the discipline to distrust an isolated spike. Now: Cyclepath’s actual, seasonally-differenced series from Module 3, with none of the answers known in advance.

By the end of this lesson, you will be able to:

  • Compute and read the full ACF/PACF table for a real stationary series
  • Identify which lags are genuinely significant using the band from Lesson 3
  • Apply the AR/MA signatures from Lesson 2 to distinguish real structure from noise
  • Separate a lag with independent corroboration from one that’s likely spurious

Let’s read the whole table.


The Full Table

import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import acf, pacf

def cyclepath():
    # Synthetic monthly trips: linear trend + fixed 12-month sine + Gaussian noise
    idx = pd.date_range("2016-01-01", periods=96, freq="MS")
    t = np.arange(96)
    rng = np.random.default_rng(42)
    trend = 9000 + 90 * t
    seasonal = 3200 * np.sin(2 * np.pi * (t - 3) / 12)
    noise = rng.normal(0, 350, 96)
    return pd.Series(np.round(trend + seasonal + noise).astype(int), index=idx, name="trips")

y = cyclepath()
D1 = y.diff(12).dropna()   # Module 3's stationary series, n=84

a = acf(D1, nlags=15)
p = pacf(D1, nlags=15)
band = 1.96 / np.sqrt(len(D1))
print(round(band, 4))   # 0.2139

for lag in range(1, 16):
    flag = "  <-- significant" if abs(a[lag]) > band or abs(p[lag]) > band else ""
    print(f"{lag:2d}  acf={a[lag]:7.3f}  pacf={p[lag]:7.3f}{flag}")
 1  acf= -0.028  pacf= -0.028
 2  acf= -0.078  pacf= -0.081
 3  acf= -0.087  pacf= -0.095
 4  acf= -0.131  pacf= -0.152
 5  acf=  0.088  pacf=  0.068
 6  acf= -0.058  pacf= -0.094
 7  acf= -0.220  pacf= -0.273  <-- significant
 8  acf=  0.042  pacf=  0.001
 9  acf=  0.056  pacf=  0.030
10  acf=  0.065  pacf= -0.006
11  acf=  0.095  pacf=  0.066
12  acf= -0.417  pacf= -0.483  <-- significant
13  acf=  0.016  pacf= -0.041
14  acf=  0.140  pacf=  0.101
15  acf= -0.028  pacf= -0.147

Out of fifteen lags, exactly two cross the ±0.2139 band on both ACF and PACF: lag 7 and lag 12. Everywhere else — lags 1 through 6, 8 through 11, and 13 through 15 — both functions sit comfortably inside the band. That’s not a boring result; it’s the single most important thing this plot says.
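The "both functions" reading can be checked mechanically. The sketch below hard-codes the fifteen ACF and PACF values printed above (copied from the table, not recomputed) and lists the lags where each function crosses the band. The names `acf_vals`, `pacf_vals`, and `crossing` are illustrative and not part of the lesson's code.

```python
import math

# Values copied from the printed table above (lags 1..15), not recomputed.
acf_vals = [-0.028, -0.078, -0.087, -0.131, 0.088, -0.058, -0.220, 0.042,
            0.056, 0.065, 0.095, -0.417, 0.016, 0.140, -0.028]
pacf_vals = [-0.028, -0.081, -0.095, -0.152, 0.068, -0.094, -0.273, 0.001,
             0.030, -0.006, 0.066, -0.483, -0.041, 0.101, -0.147]

n = 84                        # length of the seasonally differenced series
band = 1.96 / math.sqrt(n)    # approximate 95% band for white noise

def crossing(values, band):
    """Return the 1-based lags whose absolute value exceeds the band."""
    return [lag for lag, v in enumerate(values, start=1) if abs(v) > band]

acf_sig = crossing(acf_vals, band)
pacf_sig = crossing(pacf_vals, band)
both_sig = sorted(set(acf_sig) & set(pacf_sig))

print(round(band, 4))   # 0.2139
print(acf_sig)          # [7, 12]
print(pacf_sig)         # [7, 12]
print(both_sig)         # [7, 12]
```

Note how narrow the lag-7 ACF margin is: 0.220 against a band of 0.2139, while lag 12 clears the band by roughly a factor of two.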

[Figure: two stacked bar-chart panels sharing a lag axis from 1 to 15, each with a shaded significance band around zero. Top panel (ACF): nearly all bars are small and inside the band, except a bar at lag 7 dipping just below the band and a much taller bar at lag 12 dipping well below it. Bottom panel (PACF): the same pattern, small bars inside the band everywhere except lag 7 and a deep bar at lag 12, with lag 12 the tallest bar in either panel.]
Caption: Cyclepath’s stationary series shows flat, insignificant bars almost everywhere, with exactly two exceptions at lag 7 and lag 12: one has an obvious explanation, and one doesn’t.

What “Mostly Flat” Means

Recall Lesson 2’s signatures: an AR process shows an ACF that tails off gradually while its PACF cuts off; an MA process shows the reverse, a PACF that tails off while its ACF cuts off. Outside of lags 7 and 12, this table shows neither: there’s no smooth decay anywhere, just small values bouncing near zero. That absence of a decay pattern is itself informative. It means the non-seasonal part of this series (everything at short lags, 1 through 6) looks essentially like white noise. There’s no evidence here for a meaningful non-seasonal AR or MA term; the regular (p, q) for this series is likely (0, 0), or very close to it.

That’s not a disappointing result; it’s the expected one. Seasonal differencing (Module 3) was built to remove the repeating 12-month pattern, and in a series like this one it also takes the steady upward trend with it, because a linear trend contributes the same amount to every 12-month difference. Either component, left in place, would dominate the table with large autocorrelations that die out slowly. Finding near-nothing at the short lags confirms that seasonal differencing left no obvious non-seasonal structure behind.
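For the synthetic generator defined in this lesson, it helps to see what the lag-12 difference does to the deterministic part on its own. The sketch below rebuilds only the trend and seasonal terms from `cyclepath()` (no noise, and ignoring the integer rounding) and differences them at lag 12. The helper name `deterministic_part` is illustrative.

```python
import math

def deterministic_part(t):
    """Trend plus seasonal term of the lesson's cyclepath() generator, without noise."""
    return 9000 + 90 * t + 3200 * math.sin(2 * math.pi * (t - 3) / 12)

# Seasonal difference at lag 12 for t = 12..95 (84 values, matching D1's length).
diffs = [deterministic_part(t) - deterministic_part(t - 12) for t in range(12, 96)]

print(len(diffs))                                   # 84
print(round(min(diffs), 6), round(max(diffs), 6))   # 1080.0 1080.0
```

The sine term cancels exactly and the trend collapses to a constant 90 × 12 = 1080, so everything that varies in the differenced series comes from the noise term.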


What’s Left: Lag 7 and Lag 12

Two lags survived. Applying Lesson 3’s discipline to each:

Lag 12 has an obvious, independent story: it’s the seasonal period. Module 3 already found leftover seasonal autocorrelation at exactly this lag (-0.417 in the ACF, computed independently back in that module before this lesson’s PACF was even in the picture) — this lesson’s numbers (-0.417 ACF, -0.483 PACF) simply confirm it again from a slightly different angle. A 12-month bike-share series having something left at the 12-month lag, even after seasonal differencing once, is not surprising; it’s a well-known real-world pattern (the strength of the seasonal effect itself can drift slightly year to year, which a single round of seasonal differencing doesn’t fully erase).

Lag 7 has no such story. Nothing about monthly bike-share ridership gives lag 7 (seven months) any calendar or structural meaning the way lag 12 obviously has. And Lesson 3’s simulation is directly relevant here: testing 15 lags on pure white noise produced at least one spurious “significant” bar 42% of the time. A single isolated spike at an unexplained lag, with fourteen other lags all flat, is exactly the pattern that simulation predicts chance alone will produce fairly often. The responsible read is to treat lag 7 as likely spurious — not to build a model term around it — unless independent evidence (a longer series, a domain reason, consistency across related series) shows up to back it.
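The multiple-testing point can be re-checked with a small simulation. This is a sketch under assumed settings (series length 84 to match D1, 15 lags, 2000 replications, an arbitrary seed); it is not Lesson 3's code, and the share it prints depends on those choices, so it need not equal 42% exactly. `sample_acf` is a hand-rolled helper using the standard sample autocorrelation.

```python
import numpy as np

def sample_acf(x, nlags):
    """Sample autocorrelations r_1..r_nlags of x (mean removed, standard biased estimator)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-k], x[k:]) / denom for k in range(1, nlags + 1)])

# Assumed settings for illustration only.
n, nlags, reps = 84, 15, 2000
band = 1.96 / np.sqrt(n)
rng = np.random.default_rng(0)   # arbitrary seed

hits = 0
for _ in range(reps):
    noise = rng.normal(size=n)   # pure white noise: no real structure at any lag
    if np.any(np.abs(sample_acf(noise, nlags)) > band):
        hits += 1                # at least one bar crossed the band by chance

share = hits / reps
print(f"share of white-noise runs with at least one false alarm: {share:.2f}")
```

A share anywhere in this range makes the same point: one isolated crossing among fifteen lags is weak evidence on its own.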

Ambiguous at lag 12: AR or MA?

Notice that lag 12 is significant in both the ACF and the PACF, at similar magnitude (-0.417 vs. -0.483) — this doesn’t cleanly match either the “ACF tails off, PACF cuts off” AR signature or the “ACF cuts off, PACF tails off” MA signature from Lesson 2. A single strong spike showing up in both functions at the same lag, with nothing decaying around it, is genuinely ambiguous between a seasonal AR(1) term and a seasonal MA(1) term at lag 12 — both are plausible from this plot alone. That’s not a failure of the method; it’s an honest limit of what ACF/PACF reading alone can resolve, and it’s exactly the kind of tie that Module 6’s SARIMA work will break by fitting both candidates and comparing them with a formal criterion rather than guessing from the plot.
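To see why a 15-lag table cannot settle this, compare the theoretical ACFs of the two candidates at the seasonal lags. The sketch below uses the textbook formulas for a pure seasonal AR(1), y_t = Phi·y_{t-12} + e_t, and a pure seasonal MA(1), y_t = e_t + Theta·e_{t-12}. The sign convention for Theta and the function names are assumptions made for this illustration, and each model is simply matched to the observed lag-12 ACF rather than fitted.

```python
import math

def seasonal_ar1_acf(Phi, k):
    """Theoretical ACF at lag 12*k for y_t = Phi * y_{t-12} + e_t."""
    return Phi ** k

def seasonal_ma1_acf(Theta, k):
    """Theoretical ACF at lag 12*k for y_t = e_t + Theta * e_{t-12}."""
    if k == 0:
        return 1.0
    if k == 1:
        return Theta / (1 + Theta ** 2)
    return 0.0   # an MA(1) has no autocorrelation beyond its first (seasonal) lag

r12 = -0.417   # observed lag-12 ACF from the table

# Seasonal AR(1): the lag-12 autocorrelation equals Phi.
Phi = r12
# Seasonal MA(1): solve Theta / (1 + Theta^2) = r12, keeping the invertible root (|Theta| < 1).
Theta = (1 - math.sqrt(1 - 4 * r12 ** 2)) / (2 * r12)

for k in (1, 2, 3):
    print(f"lag {12 * k}: AR={seasonal_ar1_acf(Phi, k):.3f}  MA={seasonal_ma1_acf(Theta, k):.3f}")
```

Both candidates reproduce -0.417 at lag 12. They only part ways at lag 24 and beyond (about 0.174 versus 0 at lag 24), which the table above does not reach and which an 84-point series would estimate noisily anyway.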


Practice Exercises

Exercise 1: What would a clean AR(2) look like here instead?

If Cyclepath’s non-seasonal structure had actually been AR(2) instead of near-white-noise, what would you expect to see differently in lags 1–6 of this table?

Hint

You’d expect the PACF to show two clear significant spikes (at lags 1 and 2) followed by a drop to near-zero, while the ACF would show a smooth, gradual decay across several lags rather than staying flat near zero throughout. That’s the AR signature from Lesson 2. What the real table shows instead — both functions flat and insignificant across lags 1–6 — rules this out; there’s no gradual ACF decay and no PACF spike pattern to find.
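For a concrete picture of that signature, the sketch below computes the theoretical ACF and PACF of one AR(2) process. The coefficients 0.5 and 0.3 are arbitrary illustrative choices (they satisfy the stationarity conditions), and the helper names are hypothetical. The ACF comes from the Yule-Walker recursion and the PACF from the Durbin-Levinson recursion.

```python
def ar2_acf(phi1, phi2, nlags):
    """Theoretical ACF rho_0..rho_nlags of a stationary AR(2) via the Yule-Walker recursion."""
    rho = [1.0, phi1 / (1 - phi2)]
    for k in range(2, nlags + 1):
        rho.append(phi1 * rho[k - 1] + phi2 * rho[k - 2])
    return rho[: nlags + 1]

def pacf_from_acf(rho, nlags):
    """Partial autocorrelations for lags 1..nlags via the Durbin-Levinson recursion."""
    pacf_vals = []
    prev = []   # prev[j] holds phi_{k-1, j+1} from the previous step
    for k in range(1, nlags + 1):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf_vals.append(phi_kk)
    return pacf_vals

rho = ar2_acf(0.5, 0.3, 6)      # illustrative coefficients, not estimated from Cyclepath
pac = pacf_from_acf(rho, 6)

for lag in range(1, 7):
    print(f"{lag}  acf={rho[lag]:.3f}  pacf={abs(pac[lag - 1]):.3f}")
```

The ACF declines steadily across all six lags, while the PACF is nonzero at lags 1 and 2 and zero (up to floating-point error) from lag 3 onward. The loop prints the PACF magnitude so that rounding noise does not show up as a negative zero.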

Exercise 2: Why does lag 12 survive even after seasonal differencing?

Module 3’s seasonal differencing was specifically designed to remove the 12-month seasonal pattern. Why does a signal at lag 12 still show up here?

Hint

Seasonal differencing removes a constant seasonal pattern: the assumption that this July looks like every other July in exactly the same way, every year. Two things can leave a lag-12 signal behind. First, in real seasonal series the strength or shape of the seasonal effect often drifts slightly from year to year, and a single round of seasonal differencing won’t erase that completely. Second, the differencing itself links points 12 months apart: each month’s noise term enters one difference with a plus sign and the difference 12 months later with a minus sign, which pushes the lag-12 autocorrelation negative. That second effect is present even in Cyclepath’s seeded synthetic series, where the seasonal formula is fixed and only the noise varies. Either way, this is a completely normal outcome and precisely the kind of structure a seasonal AR or MA term (SARIMA, Module 6) is built to capture directly, rather than trying to remove it entirely through differencing alone.
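For the synthetic generator in this lesson, the noise term alone is enough to produce a negative lag-12 autocorrelation after differencing. The sketch below differences a long run of independent noise at lag 12 and measures the result. The series length, seed, and helper name are assumptions for illustration; the noise scale of 350 is taken from `cyclepath()`. In theory, the lag-12 autocorrelation of e_t − e_{t−12} for independent e_t is −0.5, because consecutive seasonal differences share one noise term with opposite signs.

```python
import numpy as np

def sample_acf_at(x, k):
    """Sample autocorrelation of x at a single lag k (mean removed)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return np.dot(x[:-k], x[k:]) / np.dot(x, x)

rng = np.random.default_rng(1)           # arbitrary seed
e = rng.normal(0, 350, size=20_000)      # long run of independent noise
d = e[12:] - e[:-12]                     # lag-12 difference of pure noise

r12 = sample_acf_at(d, 12)
r1 = sample_acf_at(d, 1)
print(f"lag 12: {r12:.3f}   lag 1: {r1:.3f}")
```

The lag-12 value lands close to −0.5 and the lag-1 value close to zero. The −0.417 observed on Cyclepath's 84 points is in the same neighborhood, which is consistent with this mechanism but does not by itself rule out other sources in real data.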

Exercise 3: What would change your mind about lag 7?

What kind of additional evidence would make you take a lag-7 spike seriously instead of dismissing it as likely spurious?

Hint

A few things could change the read: if lag 7 showed up consistently across multiple independent bike-share datasets or multiple non-overlapping stretches of a much longer series (reducing the odds it’s a one-off sampling fluke); if there were a substantive reason for a roughly 7-unit cycle in this specific domain (there usually isn’t for monthly data, but there can be for other series: in daily data, for instance, lag 7 is the weekly cycle); or if a much larger sample kept showing the same lag-7 magnitude while the significance band (which shrinks with more data) got tighter around it. Absent any of that, a single lag crossing the line in an 84-point series, with no calendar story, stays in the “probably noise” category.
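The "band shrinks with more data" point is simple arithmetic. The sketch below tabulates the ±1.96/√n band for a few hypothetical sample sizes and shows how far outside it a lag-7 magnitude of 0.220 would sit if that magnitude persisted. The sample sizes other than 84 are invented for illustration.

```python
import math

def band(n):
    """Approximate 95% white-noise band for a series of length n."""
    return 1.96 / math.sqrt(n)

r7 = 0.220   # magnitude of the lag-7 ACF from the table

for n in (84, 168, 336, 840):            # 84 is D1's length; the rest are hypothetical
    z = r7 * math.sqrt(n)                # the same magnitude in standard-error units
    print(f"n={n:4d}  band={band(n):.4f}  z={z:.2f}")
# n=  84  band=0.2139  z=2.02
# n= 168  band=0.1512  z=2.85
# n= 336  band=0.1069  z=4.03
# n= 840  band=0.0676  z=6.38
```

At n = 84 the spike barely clears the band. A spike of the same size in a series ten times longer would be far harder to attribute to chance, whereas a spurious spike would be expected to shrink toward zero as data accumulates.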


Summary

Cyclepath’s stationary series shows a table dominated by near-zero, insignificant ACF and PACF values. There is no gradual tailing-off pattern anywhere in lags 1 through 6, which gives no evidence for meaningful non-seasonal AR or MA structure and suggests a regular order close to (p, q) = (0, 0). Exactly two lags cross the ±0.2139 significance band: lag 7 (ACF -0.220, PACF -0.273) and lag 12 (ACF -0.417, PACF -0.483). Lag 12 has independent corroboration: it’s the seasonal period, and Module 3 already flagged leftover structure there. Lag 7 has no such backing and, per Lesson 3’s simulation, is the kind of isolated spike that chance alone produces: about 42% of pure white-noise runs showed at least one across 15 tested lags. Lag 12 is significant in both functions at similar strength, which is genuinely ambiguous between a seasonal AR(1) and a seasonal MA(1) interpretation, a tie that Module 6’s SARIMA fitting will break.

Key Concepts

  • Flat non-seasonal lags — no ACF/PACF structure at short lags confirms seasonal differencing already handled the non-seasonal signal.
  • Corroborated vs. isolated significance — lag 12 (seasonal, independently confirmed) versus lag 7 (isolated, no story, likely spurious).
  • Ambiguity at lag 12 — a spike significant in both ACF and PACF at similar magnitude doesn’t cleanly match either pure signature, leaving AR vs. MA genuinely undetermined from the plot alone.
  • Reading an unknown series vs. a known one: unlike Lesson 2’s clean AR(1)/MA(1) examples, where the generating process was known in advance, a series read cold rarely hands you a textbook-clean signature; the discipline is in separating real signal from noise, not in finding a perfect match.

Why This Matters

This is what ACF/PACF reading actually looks like in practice: mostly flat, with a small number of lags worth paying attention to, and real judgment required to separate the ones with a story from the ones that are probably noise. The conclusion here — essentially no regular structure, a genuine seasonal signal at lag 12, ambiguous between AR and MA — isn’t a dead end; it’s a precise, evidence-backed starting point. The guided project next turns this reading into an actual shortlist of candidate orders, ready to hand to Module 5’s ARIMA fitting.


Next Steps

Continue to Lesson 5 - Guided Project: Choosing ARIMA Orders for Cyclepath

Turn this lesson's reading into a concrete shortlist of candidate orders to test in Module 5.


Continue Building Your Skills

You’ve read Cyclepath’s ACF/PACF table, separated a corroborated seasonal signal (lag 12) from a likely-spurious one (lag 7), and found essentially no non-seasonal structure to model. Next, the guided project turns that reading into a concrete shortlist of candidate ARIMA orders, ready to hand to Module 5.