◆ THE BLUEPRINT
What is a Critical Value?

The critical value is the number of standard errors from the center of the sampling distribution needed to capture the desired confidence level. It is the multiplier \(M\) in the formula:

$$\text{CI} = \text{Estimate} \pm \underbrace{M}_{\text{critical value}} \times SE$$
Why \(\alpha/2\)?

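A two-sided confidence interval splits the remaining probability \(\alpha\) equally between the two tails. If the confidence level is 95%, then \(\alpha = 0.05\) lies outside the interval: 2.5% in each tail. The upper critical value therefore has cumulative probability \(1 - \alpha/2\), which is the quantile passed to R's qnorm() and qt() (0.975 for 95% confidence).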
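
As a minimal sketch of that quantile lookup, the snippet below computes \(z^*\) for a 95% interval. The variable names (conf_level, alpha, z_star) are illustrative choices, not names taken from the app.

```r
# Confidence level as a proportion (0.95 means 95%)
conf_level <- 0.95

# Total probability left outside the interval, split across two tails
alpha <- 1 - conf_level

# Upper critical value: the point with cumulative probability 1 - alpha/2
z_star <- qnorm(1 - alpha / 2)
print(round(z_star, 3))  # about 1.96

# By symmetry, the lower cutoff is the negative of the upper cutoff
print(all.equal(qnorm(alpha / 2), -z_star))
```

Changing conf_level to 0.90 or 0.99 gives the other common critical values in the same way.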

Z vs. t

Use \(z^*\) for proportions, where the SE formula does not involve \(\sigma\). Use \(t^*\) when estimating a mean with an unknown \(\sigma\). The t-distribution has heavier tails, so \(t^*\) is larger than \(z^*\) at the same confidence level for every finite df.

When does t approach z?

As degrees of freedom increase, the t-distribution approaches the standard normal. At 95% confidence, \(t^*\) is about 2.042 at df = 30 and about 1.980 at df = 120, compared with \(z^* = 1.960\). The common Z critical values (1.645, 1.960, 2.576 for 90%, 95%, and 99% confidence) are the limits of \(t^*\) as df approaches infinity.
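
The convergence can be checked directly. This is a small sketch at the 95% level; the particular df values are arbitrary choices for illustration.

```r
# A few degrees-of-freedom values, chosen only for illustration
df <- c(5, 10, 30, 120, 1000)

# 95% two-sided critical values
t_star <- qt(0.975, df)
z_star <- qnorm(0.975)

# How far each t* sits above z*
print(data.frame(
  df     = df,
  t_star = round(t_star, 3),
  excess = round(t_star - z_star, 3)
))
```

The excess column shrinks toward zero as df grows but stays positive.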

One-Proportion Z Interval

Estimates a population proportion \(\pi\) using the observed sample proportion \(\hat{p}\).

Formula
$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
Why use \(\hat{p}\) in the SE?

Unlike hypothesis testing (which uses \(\pi_0\) in the standard error), confidence intervals have no null value. We estimate the SE using the observed \(\hat{p}\).

Validity Conditions

The normal approximation requires both \(n\hat{p} \geq 10\) and \(n(1-\hat{p}) \geq 10\).
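
A rough sketch of the formula and the validity check together is shown below. The function name prop_z_ci and the counts (60 successes out of 100) are hypothetical, made up for illustration.

```r
# Hypothetical helper: z-interval for one proportion from k successes in n trials
prop_z_ci <- function(k, n, conf_level = 0.95) {
  p_hat <- k / n

  # Validity check from the text: at least 10 successes and 10 failures
  if (n * p_hat < 10 || n * (1 - p_hat) < 10) {
    warning("fewer than 10 successes or failures; normal approximation may be unreliable")
  }

  z_star <- qnorm(1 - (1 - conf_level) / 2)
  se <- sqrt(p_hat * (1 - p_hat) / n)  # SE uses the observed p_hat
  c(lower = p_hat - z_star * se, upper = p_hat + z_star * se)
}

# Made-up data: 60 successes in 100 trials
ci <- prop_z_ci(k = 60, n = 100)
print(round(ci, 3))
```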

Interpreting the Interval

We are \(C\%\) confident that the true population proportion lies within the interval. This means that if we repeated the sampling process many times, about \(C\%\) of the resulting intervals would contain the true parameter.

Coverage Simulation

Each simulated sample draws \(k^* \sim \text{Binomial}(n, \hat{p})\) and computes a new CI. The coverage rate is the fraction of simulated intervals that contain the true value (set to the observed \(\hat{p}\)).
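
The procedure described above might be sketched as follows. This is not the app's own code; the sample size, the stand-in true proportion, the number of repetitions, and the seed are all assumed values.

```r
set.seed(1)  # arbitrary seed so the run is reproducible

n      <- 200    # assumed sample size
p_true <- 0.4    # stands in for the observed p-hat, treated as the truth
reps   <- 5000   # assumed number of simulated samples
z_star <- qnorm(0.975)

# Simulate success counts, then build one interval per simulated sample
k_sim  <- rbinom(reps, size = n, prob = p_true)
p_sim  <- k_sim / n
se_sim <- sqrt(p_sim * (1 - p_sim) / n)

# An interval "covers" when the true value lies between its endpoints
covered  <- (p_sim - z_star * se_sim <= p_true) & (p_true <= p_sim + z_star * se_sim)
coverage <- mean(covered)
print(coverage)
```

The coverage rate should land near the nominal 95%, though this interval can fall somewhat short when n is small or the proportion is close to 0 or 1.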

Duality with Hypothesis Testing

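A \(C\%\) confidence interval contains the values of \(\pi_0\) that a two-sided test would not reject at significance level \(\alpha = 1 - C/100\). For proportions the match is approximate rather than exact, because the test computes its standard error from \(\pi_0\) while the interval uses \(\hat{p}\).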

One-Sample t Interval

Estimates a population mean \(\mu\) when the population standard deviation is unknown.

Formula
$$\bar{x} \pm t^* \frac{s}{\sqrt{n}} \qquad df = n - 1$$
Why t instead of z?

When \(\sigma\) is unknown and estimated by \(s\), the extra uncertainty is captured by the t-distribution. The t-distribution has heavier tails than the normal, so the interval is wider. As \(n\) grows, \(t^*\) converges to \(z^*\).
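
To make the formula concrete, this sketch builds the interval by hand and compares it with R's t.test(). The data vector is made up for illustration.

```r
# Made-up sample of eight measurements
x <- c(12.1, 9.8, 11.4, 10.3, 13.0, 9.5, 10.9, 11.7)

n      <- length(x)
t_star <- qt(0.975, df = n - 1)   # 95% critical value with n - 1 df
se     <- sd(x) / sqrt(n)         # s / sqrt(n)

# Interval from the formula: x-bar plus or minus t* times SE
manual <- mean(x) + c(-1, 1) * t_star * se

# Same interval from t.test(), with attributes stripped for comparison
builtin <- as.numeric(t.test(x, conf.level = 0.95)$conf.int)

print(manual)
print(all.equal(manual, builtin))
```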

Validity Conditions

The t-interval requires approximate normality or a large enough sample. Tintle et al. use \(n \geq 20\). The traditional CLT threshold is \(n \geq 30\).

Coverage Simulation

Each simulated sample draws \(n\) observations from N(\(\bar{x}\), \(s\)), computes the sample mean, sample SD, and constructs a new t-interval. The coverage rate is the fraction of intervals that contain the true mean (set to the observed \(\bar{x}\)).
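
A sketch of such a simulation is given below, with a z-based interval alongside for comparison. The population values, sample size, repetition count, and seed are assumptions, not values from the app.

```r
set.seed(3)  # arbitrary seed

n <- 15; mu <- 50; sigma <- 8   # assumed sample size and "true" population
reps   <- 5000
t_star <- qt(0.975, df = n - 1)
z_star <- qnorm(0.975)

# For each simulated sample, record whether each interval covers mu
sims <- replicate(reps, {
  x   <- rnorm(n, mean = mu, sd = sigma)
  se  <- sd(x) / sqrt(n)
  err <- abs(mean(x) - mu)
  c(t = err <= t_star * se, z = err <= z_star * se)
})

# Fraction of intervals containing the true mean, by critical value used
cov_rates <- rowMeans(sims)
print(cov_rates)
```

Because \(z^* < t^*\), every sample covered by the z-based interval is also covered by the t-based one, so the z rate can never exceed the t rate.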

Duality with Hypothesis Testing

A \(C\%\) confidence interval contains exactly those values of \(\mu_0\) that would not be rejected by a two-sided t-test at the \(\alpha = 1 - C/100\) level.
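
One way to see the duality is to test a grid of candidate \(\mu_0\) values and compare the non-rejected ones with the interval. The sketch below uses simulated data; the grid and the seed are arbitrary choices.

```r
set.seed(2)  # arbitrary seed
x <- rnorm(25, mean = 10, sd = 3)   # made-up sample

# 95% t-interval
ci <- as.numeric(t.test(x, conf.level = 0.95)$conf.int)

# Candidate null values on an arbitrary grid
mu0_grid <- seq(7, 13, by = 0.05)

# Two-sided t-test at alpha = 0.05 for each candidate
not_rejected <- sapply(mu0_grid, function(m) t.test(x, mu = m)$p.value > 0.05)

# Which candidates fall inside the interval
inside <- mu0_grid >= ci[1] & mu0_grid <= ci[2]

print(all(not_rejected == inside))
```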

Two-Proportion Z Interval

Estimates the difference \(\pi_1 - \pi_2\) between two population proportions.

Formula
$$(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$
No Pooling for CI

Unlike the two-proportion hypothesis test (which pools under H\(_0\): \(\pi_1 = \pi_2\)), the confidence interval uses each group's own \(\hat{p}\) in the standard error. There is no null hypothesis to assume equal proportions.

Validity Conditions

All four of the following must be \(\geq 10\): \(n_1\hat{p}_1\), \(n_1(1-\hat{p}_1)\), \(n_2\hat{p}_2\), \(n_2(1-\hat{p}_2)\).
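
The unpooled formula and the four-count check could be sketched like this. The function name two_prop_z_ci and the counts are hypothetical.

```r
# Hypothetical helper: unpooled z-interval for a difference in proportions
two_prop_z_ci <- function(k1, n1, k2, n2, conf_level = 0.95) {
  p1 <- k1 / n1
  p2 <- k2 / n2

  # n * p-hat and n * (1 - p-hat) are the success and failure counts
  counts <- c(k1, n1 - k1, k2, n2 - k2)
  if (any(counts < 10)) {
    warning("a success or failure count is below 10")
  }

  z_star <- qnorm(1 - (1 - conf_level) / 2)
  # Each group contributes its own p-hat to the SE (no pooling)
  se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
  (p1 - p2) + c(lower = -1, upper = 1) * z_star * se
}

# Made-up data: 45/100 in group 1, 30/100 in group 2
ci2 <- two_prop_z_ci(45, 100, 30, 100)
print(round(ci2, 3))
```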

Contains Zero?

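If the interval contains 0, the data show no significant difference between the two proportions at the \(\alpha = 1 - C/100\) level. This closely matches failing to reject H\(_0\): \(\pi_1 = \pi_2\) in a two-sided test, though the two can disagree in borderline cases because the test pools the proportions in its standard error and the interval does not.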

Coverage Simulation

Each simulation draws \(k_1^* \sim \text{Binomial}(n_1, \hat{p}_1)\) and \(k_2^* \sim \text{Binomial}(n_2, \hat{p}_2)\), then constructs a CI for the difference. The coverage rate is the fraction of intervals that contain the true difference.
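
A vectorized sketch of this simulation follows. The group sizes, stand-in true proportions, repetition count, and seed are all assumed for illustration.

```r
set.seed(4)  # arbitrary seed

n1 <- 150; n2 <- 120        # assumed group sizes
p1 <- 0.55; p2 <- 0.40      # stand-ins for the observed proportions
reps   <- 5000
z_star <- qnorm(0.975)

# Simulated sample proportions for each group
ph1 <- rbinom(reps, n1, p1) / n1
ph2 <- rbinom(reps, n2, p2) / n2

# Unpooled SE for each simulated pair
se <- sqrt(ph1 * (1 - ph1) / n1 + ph2 * (1 - ph2) / n2)

# Covered when the true difference is within z* SEs of the estimate
covered2  <- abs((ph1 - ph2) - (p1 - p2)) <= z_star * se
coverage2 <- mean(covered2)
print(coverage2)
```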

Welch Two-Sample t Interval

Estimates the difference \(\mu_1 - \mu_2\) between two population means. Welch's version does not assume equal variances.

Formula
$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
Welch Degrees of Freedom
$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$
Welch vs. Pooled

The pooled t-interval assumes \(\sigma_1 = \sigma_2\). Welch's interval relaxes this assumption and is the safer default. R's t.test() uses Welch by default (var.equal = FALSE); pass var.equal = TRUE to get the pooled interval.
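
As a check on the degrees-of-freedom formula, this sketch computes the Welch interval by hand and compares it with t.test(). Both data vectors are made up.

```r
# Made-up samples with different sizes and spreads
g1 <- c(5.1, 6.3, 4.8, 7.2, 5.9, 6.6, 5.4)
g2 <- c(8.0, 6.1, 9.4, 7.7, 10.2, 5.8, 8.9, 7.1, 9.9)
n1 <- length(g1); n2 <- length(g2)

# Per-group variance of the mean: s^2 / n
v1 <- var(g1) / n1
v2 <- var(g2) / n2

# Welch degrees of freedom (usually not a whole number)
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

t_star <- qt(0.975, df = df_welch)
manual <- (mean(g1) - mean(g2)) + c(-1, 1) * t_star * sqrt(v1 + v2)

# t.test() defaults to Welch (var.equal = FALSE)
fit <- t.test(g1, g2)
print(all.equal(df_welch, unname(fit$parameter)))
print(all.equal(manual, as.numeric(fit$conf.int)))
```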

Validity Conditions

Each group needs approximate normality or a large enough sample. Tintle et al. use \(n \geq 20\) per group. The traditional CLT threshold is \(n \geq 30\) per group.

Coverage Simulation

Each simulation draws two independent samples from N(\(\bar{x}_1\), \(s_1\)) and N(\(\bar{x}_2\), \(s_2\)). For each pair, we compute the Welch t-interval with its own df. The coverage rate is the fraction of intervals that contain the true difference.
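
The sketch below runs such a simulation and adds a pooled interval for comparison. It deliberately uses an assumed scenario where the smaller group has the larger spread, which is where pooling tends to do worst. All numeric settings and the seed are assumptions.

```r
set.seed(5)  # arbitrary seed

n1 <- 10; n2 <- 40     # assumed group sizes
s1 <- 10; s2 <- 2      # assumed SDs; the smaller group is more variable
reps <- 2000           # true difference in means is 0 in this setup

cover <- replicate(reps, {
  a <- rnorm(n1, mean = 0, sd = s1)
  b <- rnorm(n2, mean = 0, sd = s2)
  w <- t.test(a, b)$conf.int                     # Welch, with its own df
  p <- t.test(a, b, var.equal = TRUE)$conf.int   # pooled, for comparison
  c(welch  = w[1] <= 0 && 0 <= w[2],
    pooled = p[1] <= 0 && 0 <= p[2])
})

# Fraction of intervals containing the true difference (0)
cov_two <- rowMeans(cover)
print(cov_two)
```

In this setup the Welch rate should sit near 95% while the pooled rate falls well below it, which illustrates why Welch is described as the safer default.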

Contains Zero?

If the interval contains 0, there is no significant difference between the means at the given confidence level.

Paired t Interval

Estimates the mean difference \(\mu_d\) between paired observations. This reduces to a one-sample t-interval on the differences.

Formula
$$\bar{d} \pm t^* \frac{s_d}{\sqrt{n}} \qquad df = n - 1$$

where \(\bar{d}\) is the mean of the differences and \(s_d\) is the standard deviation of the differences.

Why Pairing Helps

Pairing removes between-subject variability. Instead of comparing two independent groups, we analyze within-subject differences. This often reduces the standard error and produces a narrower interval.
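
A small made-up example illustrates both points: the paired interval equals a one-sample t-interval on the differences, and it can be much narrower than treating the groups as independent. The before/after values are invented for illustration.

```r
# Made-up before/after measurements on the same eight subjects
before <- c(142, 155, 138, 160, 149, 171, 133, 158)
after  <- c(138, 150, 135, 152, 147, 163, 131, 151)
d <- before - after   # within-subject differences

# Paired interval, and the one-sample interval on the differences
paired     <- as.numeric(t.test(before, after, paired = TRUE)$conf.int)
one_sample <- as.numeric(t.test(d)$conf.int)

# Welch interval that ignores the pairing
welch <- as.numeric(t.test(before, after)$conf.int)

print(all.equal(paired, one_sample))
print(c(paired_width = diff(paired), welch_width = diff(welch)))
```

Here the subjects differ a lot from one another but change by similar amounts, so the differences have a small standard deviation and the paired interval is narrower.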

When to Use Paired vs. Two-Sample

Use the paired interval when observations come in natural pairs: before/after measurements on the same subject, matched subjects, or repeated measures. Use the two-sample interval when the groups are independent.

Validity Conditions

The differences need approximate normality or a large enough sample. Tintle et al. use \(n \geq 20\) pairs. The traditional CLT threshold is \(n \geq 30\).

Contains Zero?

If the interval contains 0, there is no significant mean difference at the given confidence level. This is equivalent to failing to reject H\(_0\): \(\mu_d = 0\) in a two-sided paired t-test.

Coverage Simulation

Each simulated sample draws \(n\) differences from N(\(\bar{d}\), \(s_d\)) and constructs a new t-interval. The coverage rate is the fraction of intervals that contain the true mean difference.