The Data
Slope Estimates
What's Happening
◆ THE BLUEPRINT
Simpson's Paradox

A trend that appears in aggregated data can reverse when the data is split into groups. This is not rare. It happens whenever a confounding variable creates different subpopulations.

Why It Matters

If your data has group structure and you ignore it, your model can give you the wrong sign on a coefficient. The overall slope and the within-group slopes can point in opposite directions.

Overall: \(\hat{\beta}_1 < 0\), but within every group: \(\hat{\beta}_{1j} > 0\)
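The reversal is easy to reproduce with a few made-up numbers. The sketch below (Python, purely illustrative data) gives two groups with the same positive within-group slope but very different intercepts; the slope fit to the pooled data comes out negative.

```python
# Illustrative sketch of Simpson's paradox with made-up data:
# within-group slopes are positive, the pooled slope is negative.

def ols_slope(x, y):
    """Least-squares slope: cov(x, y) / var(x)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var = sum((xi - mx) ** 2 for xi in x)
    return cov / var

# Two groups: same positive slope, very different intercepts.
xa, ya = [1, 2, 3], [5, 6, 7]   # group A: high intercept
xb, yb = [6, 7, 8], [1, 2, 3]   # group B: low intercept

print(ols_slope(xa, ya))            # within A: 1.0
print(ols_slope(xb, yb))            # within B: 1.0
print(ols_slope(xa + xb, ya + yb))  # pooled: negative
```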
The Fix

Account for the group structure in your model. The next tabs explore three ways to do this: ignore groups entirely (complete pooling), treat each group independently (no pooling), or model groups as drawn from a distribution (partial pooling).

Explore More
Model Comparison
Complete Pooling
No Pooling (Fixed Effects)
◆ THE BLUEPRINT
Complete Pooling

Fit one model to all data, ignoring groups entirely.

\(y_i = \beta_0 + \beta_1 x_i + \varepsilon_i\)

One intercept, one slope. If groups differ, this misses that variation entirely. The model acts as if all observations come from the same population.

No Pooling (Fixed Effects for Group)

Fit a separate model per group, or include group as a factor with dummy variables.

\(y_{ij} = \beta_{0j} + \beta_1 x_{ij} + \varepsilon_{ij}\)

Each group gets its own intercept. But groups with few observations yield estimates that rest on very little data, so those estimates are noisy and unreliable.

The Tradeoff

Complete pooling has too much bias (ignores group differences). No pooling has too much variance (small-group estimates are wild). This is the bias-variance tradeoff applied to group estimation.

Fit Lines
Shrinkage Map
◆ THE BLUEPRINT
The Mixed Model

A mixed model treats group intercepts as random draws from a shared distribution rather than as fixed parameters.

\(y_{ij} = (\beta_0 + b_{0j}) + \beta_1 x_{ij} + \varepsilon_{ij}\)
\(b_{0j} \sim N(0, \sigma^2_{b_0}), \quad \varepsilon_{ij} \sim N(0, \sigma^2_e)\)
Shrinkage

The group-level estimate is a weighted average of the group's own data and the overall mean. The weight depends on group size and variance components.

\(\hat{b}_{0j} \approx \frac{n_j / \sigma^2_e} {n_j / \sigma^2_e + 1 / \sigma^2_{b_0}} \cdot (\bar{y}_j - \hat{\beta}_0)\)

When \(n_j\) is small, the group estimate shrinks heavily toward the grand mean. When \(n_j\) is large, the group keeps its own estimate. This is partial pooling.
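The shrinkage weight in the formula above is a ratio of precisions. The sketch below plugs in assumed variance components (\(\sigma^2_e = 4\), \(\sigma^2_{b_0} = 1\)) to show how the weight on a group's own data grows with \(n_j\).

```python
# Shrinkage weight from the formula above, with assumed variance components.

sigma_e2 = 4.0  # residual variance sigma^2_e (assumed)
sigma_b2 = 1.0  # random-intercept variance sigma^2_{b0} (assumed)

def shrinkage_weight(n_j):
    precision_data = n_j / sigma_e2   # information in the group's own data
    precision_prior = 1.0 / sigma_b2  # information in the shared distribution
    return precision_data / (precision_data + precision_prior)

for n_j in (2, 10, 100):
    w = shrinkage_weight(n_j)
    # b_hat_j = w * (ybar_j - beta0_hat):
    # small groups keep little of their own deviation from the grand mean
    print(f"n_j={n_j:3d}: weight on own data = {w:.3f}")
```

With these numbers a group of 2 keeps only a third of its observed deviation, while a group of 100 keeps over 96% of it.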

Why It Works

Shrinkage reduces variance without adding much bias. The net effect: lower MSE, especially for small groups. The model borrows strength from the other groups.

Spaghetti Plot
Random Effects Scatter
Variance Components
◆ THE BLUEPRINT
Random Intercepts + Random Slopes
\(y_{ij} = (\beta_0 + b_{0j}) + (\beta_1 + b_{1j}) x_{ij} + \varepsilon_{ij}\)
\(\begin{pmatrix} b_{0j} \\ b_{1j} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma^2_{b_0} & \rho \sigma_{b_0} \sigma_{b_1} \\ \rho \sigma_{b_0} \sigma_{b_1} & \sigma^2_{b_1} \end{pmatrix} \right)\)
Why Correlated Random Effects?

Groups that start high might change more slowly (negative correlation) or more quickly (positive correlation). The covariance structure captures this pattern.

Model Selection in R
Random intercepts: lmer(y ~ x + (1 | group))
Random slopes: lmer(y ~ x + (0 + x | group))
Both: lmer(y ~ x + (1 + x | group))
When Do You Need Random Slopes?

If the relationship between x and y truly varies across groups, a random-intercepts-only model underfits. Check whether the no-pooling slopes look different across groups. If they do, add random slopes.
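That check can be sketched as follows (Python, with made-up data in which the x-y relationship genuinely differs across groups): fit a separate least-squares slope per group and look at the spread.

```python
# Quick diagnostic sketch: fit a separate OLS slope per group
# and look at the spread. Data are made up for illustration.

def ols_slope(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return cov / sum((xi - mx) ** 2 for xi in x)

groups = {
    "g1": ([0, 1, 2], [0.0, 0.5, 1.0]),  # slope 0.5
    "g2": ([0, 1, 2], [0.0, 1.0, 2.0]),  # slope 1.0
    "g3": ([0, 1, 2], [0.0, 2.0, 4.0]),  # slope 2.0
}

slopes = [ols_slope(x, y) for x, y in groups.values()]
spread = max(slopes) - min(slopes)
print(slopes, spread)  # a wide spread suggests adding random slopes
```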

The Sandbox
Coefficient Summary
Shrinkage Map
◆ THE BLUEPRINT
GLMMs: Mixed Effects + Link Functions

A Generalized Linear Mixed Model (GLMM) combines the GLM framework (link functions, exponential families) with random effects.

\(g(\mu_{ij}) = (\beta_0 + b_{0j}) + (\beta_1 + b_{1j}) x_{ij}\)
Gaussian
Identity link: \(\mu_{ij} = \eta_{ij}\)
Fitted with lmer()
Binomial
Logit link: \(\log\frac{p_{ij}}{1-p_{ij}} = \eta_{ij}\)
Fitted with glmer(..., family=binomial)
Poisson
Log link: \(\log(\mu_{ij}) = \eta_{ij}\)
Fitted with glmer(..., family=poisson)
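The three links above differ only in how the linear predictor \(\eta_{ij}\) is mapped back to the mean \(\mu_{ij}\). A minimal sketch of the inverse links (illustrative only; glmer applies these internally):

```python
import math

# Inverse link functions mapping the linear predictor eta to the mean mu.

def identity_inv(eta):  # Gaussian: mu = eta
    return eta

def logit_inv(eta):     # Binomial: p = 1 / (1 + exp(-eta))
    return 1.0 / (1.0 + math.exp(-eta))

def log_inv(eta):       # Poisson: mu = exp(eta)
    return math.exp(eta)

eta = 0.0
print(identity_inv(eta), logit_inv(eta), log_inv(eta))  # 0.0 0.5 1.0
```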
Explore More