Lesson 4 - STL Decomposition
Welcome to STL Decomposition
Classical decomposition (Lesson 2) is transparent, but it has two real weaknesses. The centered moving average leaves gaps at both edges of the series (you saw six months lost on each side of Cyclepath), and a plain average has no defense against an outlier: one wild data point pulls the trend estimate off for months on either side of it. STL (Seasonal-Trend decomposition using LOESS, a locally weighted regression smoother) addresses both, the second through its robust mode. It is a common default choice for decomposition today.
By the end of this lesson, you will be able to:
- Explain what LOESS smoothing does differently from a simple moving average
- Run STL with statsmodels and read its trend, seasonal, and residual components
- Explain what robust=True does and when it matters
- Demonstrate STL’s outlier resistance against classical decomposition, with real numbers
Let’s see it handle Cyclepath.
Running STL
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL

def cyclepath():
    # Eight years of monthly trips: linear trend + annual sine cycle + seeded noise
    idx = pd.date_range("2016-01-01", periods=96, freq="MS")
    t = np.arange(96)
    rng = np.random.default_rng(42)
    trend = 9000 + 90*t
    seasonal = 3200*np.sin(2*np.pi*(t-3)/12)
    noise = rng.normal(0, 350, 96)
    return pd.Series(np.round(trend+seasonal+noise).astype(int), index=idx, name="trips")

y = cyclepath()
stl = STL(y, period=12, robust=True).fit()
print(stl.trend.iloc[11:14].round(1).tolist()) # [9960.8, 10057.4, 10154.1]
print(stl.seasonal.iloc[:12].round(1).tolist()) # peak/trough shape, similar to classical
print(round(stl.resid.std(), 2)) # 215.25
print(stl.trend.notna().sum()) # 96

Two immediate differences from Lesson 2’s classical decomposition jump out. First, stl.trend has all 96 values with no gaps: LOESS can estimate a trend all the way to both edges of the series, unlike the centered moving average that left the first and last six months as NaN. Second, the residual is slightly smaller, with STL’s std of 215.25 versus classical’s 229.20. Both numbers are below the 350 standard deviation of the noise that cyclepath() injects, which suggests each method absorbs some of the noise into its trend and seasonal estimates, so read the smaller residual as a difference in how the variation is split rather than as proof of a better fit. The seasonal shape itself (peaking mid-year, troughing in winter) is recognizably the same pattern as the classical decomposition, just estimated with a different smoothing technique.
What LOESS actually does
LOESS (locally estimated scatterplot smoothing) fits a small weighted regression around each point using only its nearby neighbors, weighting closer points more heavily, instead of taking one fixed-width, equally weighted average at every position. That locality is what lets STL cover every point (there is always a local neighborhood to fit, even at the edges, where it simply becomes one-sided), and it is what lets robust mode down-weight points that don’t fit their local neighborhood well, rather than letting them corrupt an unweighted average.
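To make the idea of a local weighted fit concrete, here is a deliberately simplified sketch in plain Python. It is not the statsmodels implementation: the function names (tricube, loess_point, centered_mean), the neighborhood size k=7, and the toy straight-line data are all assumptions chosen for illustration. It fits a weighted straight line to the nearest points around each position and contrasts that with a plain centered average, which has nothing to return near the edges.

```python
def tricube(u):
    """Tricube kernel: weight 1 at distance 0, falling to 0 at distance 1."""
    u = abs(u)
    return (1 - u**3) ** 3 if u < 1 else 0.0


def loess_point(xs, ys, x0, k):
    """Estimate y at x0 from a weighted straight-line fit to the k nearest points."""
    # Pick the k nearest neighbours; at an edge they all lie on one side.
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    # Scale distances so the farthest neighbour keeps a small positive weight.
    span = max(abs(xs[i] - x0) for i in nearest) or 1.0
    w = [tricube((xs[i] - x0) / (span * 1.001)) for i in nearest]
    x = [xs[i] for i in nearest]
    y = [ys[i] for i in nearest]
    # Closed-form weighted least squares for y = a + b * x.
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx if sxx else 0.0
    return my + b * (x0 - mx)


def centered_mean(ys, i, half):
    """Plain centered average; undefined (None) without `half` points on each side."""
    if i - half < 0 or i + half >= len(ys):
        return None
    window = ys[i - half : i + half + 1]
    return sum(window) / len(window)


xs = list(range(24))
ys = [100 + 5 * x for x in xs]  # a noise-free straight line, for easy checking

loess_fit = [loess_point(xs, ys, x0, k=7) for x0 in xs]
mean_fit = [centered_mean(ys, i, half=3) for i in range(len(ys))]

print(round(loess_fit[0], 6), round(loess_fit[-1], 6))  # 100.0 215.0
print(mean_fit[0], mean_fit[-1])                        # None None
```

The local line recovers the first and last values of the series because it only needs some neighbors, not a symmetric set of them. STL layers more on top of this (separate seasonal and trend smoothers, plus the robustness weights), so treat the sketch as a picture of the core smoothing step only.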
The Real Test: An Outlier
The clearest demonstration of STL’s advantage is what happens when one data point goes haywire. Inject a single artificial spike into Cyclepath — May 2019 (originally 14,460 trips) gets 8,000 trips added, as if a data-entry error or a one-off event inflated a single month — and compare how much each method’s trend estimate shifts near the spike, versus its own trend on the clean series:
from statsmodels.tsa.seasonal import seasonal_decompose
y_out = y.copy()
y_out.iloc[40] = y_out.iloc[40] + 8000 # 2019-05: 14,460 -> 22,460
add_base = seasonal_decompose(y, model="additive", period=12)
add_out = seasonal_decompose(y_out, model="additive", period=12)
stl_base = STL(y, period=12, robust=True).fit()
stl_out = STL(y_out, period=12, robust=True).fit()
dist_classical = (add_out.trend - add_base.trend).abs().max()
dist_stl = (stl_out.trend - stl_base.trend).abs().max()
print(round(dist_classical, 1)) # 666.7
print(round(dist_stl, 1)) # 17.8

One bad month drags the classical trend estimate off by as much as 666.7 trips at nearby points. The moving average has no way to tell “real signal” from “one outlier” and just averages the spike in, shifting the trend for six months on either side of it (by the full 666.7 within five months, and by half that at the sixth). The robust STL trend, by contrast, shifts by at most 17.8, about 37 times less, because its robust weighting recognizes May 2019 as inconsistent with its neighbors and down-weights it almost entirely rather than blending it into the trend.
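The 666.7 figure can be checked by hand. A moving average is linear, so the change in the trend equals the moving average of the change in the data, which here is a single 8,000-trip spike. The sketch below assumes the classical trend is the usual centered 2x12 moving average (13 weights, with half weight at each end); the variable names are illustrative.

```python
PERIOD = 12
SPIKE = 8000

# Weights of the centered 2x12 moving average: 13 points, half weight at each end.
weights = [0.5] + [1.0] * (PERIOD - 1) + [0.5]
weights = [w / PERIOD for w in weights]

# The change in the data: zero everywhere except the spike at position 40.
n, spike_at = 96, 40
delta = [0.0] * n
delta[spike_at] = SPIKE

half = PERIOD // 2
shift = {}
for i in range(half, n - half):
    window = delta[i - half : i + half + 1]
    value = sum(w * v for w, v in zip(weights, window))
    if value:
        shift[i] = round(value, 1)  # trend change at position i

print(max(shift.values()))     # 666.7
print(min(shift), max(shift))  # 34 46
print(shift[34], shift[46])    # 333.3 333.3
```

The maximum is simply 8000 / 12: every full-weight position hands the spike one twelfth of the average, no matter how implausible the value is.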
This is not a small-sample fluke — it’s the direct, structural consequence of how each method smooths. A moving average treats every point in its window equally; one huge value pulls the average toward it. LOESS in robust mode iterates: fit, measure how far each point falls from the fit, then re-fit with badly-fitting points down-weighted. An outlier gets identified and muted instead of blindly averaged in.
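The fit, measure, re-fit loop can be sketched in a few lines. This is an illustration under stated assumptions, not the statsmodels code: it uses bisquare robustness weights with a cutoff of six times the median absolute residual (the scheme described in the original LOWESS and STL papers), and it stands in a weighted mean for the local regression to keep the example short. The data and the helper names (bisquare_weights, weighted_mean) are made up.

```python
from statistics import median


def bisquare_weights(residuals):
    """Robustness weights: near 1 for small residuals, 0 beyond 6x the median size."""
    h = 6 * median(abs(r) for r in residuals)
    weights = []
    for r in residuals:
        u = abs(r) / h if h else 0.0
        weights.append((1 - u**2) ** 2 if u < 1 else 0.0)
    return weights


def weighted_mean(values, weights):
    """Stand-in for the local fit: a weighted average of the values."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Eleven ordinary months around 100, plus one month inflated by a spike.
values = [98, 101, 99, 102, 100, 180, 97, 103, 100, 99, 101, 100]

weights = [1.0] * len(values)            # first pass: every point counts equally
plain = weighted_mean(values, weights)
print(round(plain, 2))                   # 106.67, pulled up by the spike

estimate = plain
for _ in range(3):                       # measure residuals, down-weight, re-fit
    residuals = [v - estimate for v in values]
    weights = bisquare_weights(residuals)
    estimate = weighted_mean(values, weights)

print(round(estimate, 2))                # close to 100 once the spike is muted
print(weights[5])                        # 0.0, the weight given to the spike
```

After the first pass the spike's residual is far beyond the cutoff, so its weight drops to zero and later passes are computed almost entirely from the ordinary months.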
robust=True vs robust=False
That resistance is specifically the robust=True behavior — turn it off and STL loses most of its advantage on this exact test:
stl_out_plain = STL(y_out, period=12, robust=False).fit()
dist_stl_plain = (stl_out_plain.trend - stl_base.trend).abs().max()
print(round(dist_stl_plain, 1)) # 604.5

Without robust=True, STL’s trend distortion from the same outlier jumps to 604.5, nearly back to the classical method’s 666.7. (The comparison here is against stl_base, the robust fit on the clean series, so the figure also includes any difference between robust and non-robust trends on clean data.) Robust mode is what does the outlier-resistance work; STL’s edge-coverage advantage (all 96 points, no NaN gaps) holds either way, but the outlier resistance is opt-in. Unless you have a specific reason to disable it (it costs a bit more computation, since it iterates), robust=True is the safer default for real data, which rarely arrives outlier-free.
Practice Exercises
Exercise 1: Why no gaps?
Classical decomposition loses six months at each edge of Cyclepath; STL loses none. Why does a local smoother like LOESS not have this problem?
Hint
The classical moving average needs a full, symmetric window of neighbors on both sides to compute a centered average. In the first and last six months there aren’t enough neighbors on one side, so the calculation is undefined (NaN). LOESS fits a local regression using whatever neighbors are available near each point, even when the window is lopsided, so it can still produce an estimate at the very first and last observations, just with less local information on one side than points in the middle of the series have.
Exercise 2: When would robust=False be fine?
Robust mode costs more computation because it iterates. When might it be safe to skip it?
Hint
If you’ve already inspected the series and you’re confident it has no outliers or data errors — for instance, a carefully validated internal metric with a known, clean pipeline — the iterative down-weighting has nothing to correct for, so robust=False saves computation without changing the result much. But for any series you haven’t personally vetted (which is most real-world data), the cost of leaving robust mode on is small and the protection it buys, as this lesson just measured, is large. When in doubt, leave it on.
Exercise 3: STL versus the swing-to-level test
Lesson 3 tested additive vs multiplicative using the classical decomposition’s building blocks (swing and level by year). Does switching to STL change that conclusion for Cyclepath?
Hint
No — the swing-to-level test in Lesson 3 was computed directly from the raw series (y.groupby(y.index.year)), not from either decomposition’s internals, so it doesn’t depend on which smoothing method you use downstream. STL and classical decomposition are two different ways to extract trend and seasonal components once you’ve already decided additive is the right combination rule; STL doesn’t have its own separate additive/multiplicative test, so that decision from Lesson 3 carries forward unchanged regardless of which decomposition method you use next.
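A small sketch shows why the test is independent of the decomposition method: it only groups raw values by year. Lesson 3's exact definitions are not reproduced in this lesson, so the sketch assumes "swing" means the yearly maximum minus the yearly minimum and "level" means the yearly mean. It also uses a noise-free stand-in for Cyclepath (same trend and seasonal formula, no noise term) in plain Python rather than the pandas groupby.

```python
import math
from collections import defaultdict

# Noise-free stand-in for Cyclepath: same trend and seasonal formula, no noise.
series = [9000 + 90 * t + 3200 * math.sin(2 * math.pi * (t - 3) / 12)
          for t in range(96)]
years = [2016 + t // 12 for t in range(96)]

# Group the raw values by calendar year; no decomposition is involved.
by_year = defaultdict(list)
for year, value in zip(years, series):
    by_year[year].append(value)

summary = {}
for year, vals in by_year.items():
    swing = max(vals) - min(vals)   # assumed definition of "swing"
    level = sum(vals) / len(vals)   # assumed definition of "level"
    summary[year] = (level, swing)
    print(year, round(level), round(swing), round(swing / level, 3))
```

On this stand-in the swing is the same every year while the level keeps rising, which is the additive pattern. Nothing in the calculation touches STL or seasonal_decompose, so the conclusion cannot change when you swap one for the other.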
Summary
STL (Seasonal-Trend decomposition using LOESS) improves on classical decomposition in two concrete ways, both demonstrated with real numbers on Cyclepath. It covers every point — stl.trend has all 96 values with no NaN edges, unlike classical decomposition’s six-month gap on each side. And in robust mode, it resists outliers: injecting a single 8,000-trip spike distorted the classical trend by up to 666.7 nearby, but STL’s robust trend by only 17.8 — a 37x difference — because LOESS’s local, iteratively-reweighted fitting recognizes and down-weights points that don’t match their neighborhood, instead of blending them into a simple average. Turning off robust=True gives up most of that protection (distortion jumps to 604.5, nearly matching classical). STL doesn’t replace Lesson 3’s additive/multiplicative test — that’s decided from the raw series either way — it’s simply a better tool for extracting the trend and seasonal components once that choice is made.
Key Concepts
- LOESS — local weighted regression smoothing; estimates each point from its nearby neighbors rather than one global window.
- No edge gaps — STL produces a trend estimate for every point in the series, unlike a centered moving average.
- robust=True — iteratively down-weights points that don’t fit their local neighborhood, resisting outlier distortion.
- STL vs. classical — a better extraction method for the same trend/seasonal/residual components; doesn’t change the additive/multiplicative decision.
Why This Matters
Real data is rarely as clean as a seeded synthetic series — sensor glitches, one-off events, and data-entry errors happen, and a decomposition that lets one bad point drag the trend around for months will feed that distortion straight into anything built on top of it: a misjudged trend, a shifted stationarity test, a SARIMA fit anchored to a corrupted baseline. STL’s robust mode is cheap insurance against exactly that failure mode, which is why it’s the default reach for most practitioners rather than a specialty tool. With trend, seasonality, residual, the additive/multiplicative choice, and now a robust extraction method all in hand, the module capstone puts everything together on the full Cyclepath series.
Next Steps
Continue to Lesson 5 - Guided Project: Decomposing Cyclepath
Apply classical decomposition, the additive/multiplicative test, and STL together on the full series, and interpret every component.
Continue Building Your Skills
You’ve now built a decomposition by hand, tested additive versus multiplicative with real evidence, and seen STL’s robust mode shrug off an outlier that badly distorted the classical trend. The module capstone brings all three together: decompose the full Cyclepath series, interpret what each component reveals, and set up the stationarity question Module 3 tackles next.