Hypothesis Testing Step-by-Step: From H0 to p-value

A practical guide to hypothesis testing — defining H0 and H1, picking the right test, computing the test statistic, and interpreting the p-value without misuse.

Author: AI-Math Editorial Team

Published on 2026-05-01

Hypothesis testing is the workhorse of statistical inference, used everywhere from clinical trials to A/B tests on websites. Yet it is also among the most misunderstood topics in statistics. This guide walks through the full pipeline once, clearly, so you understand what a p-value really means.

The five steps

  1. State $H_0$ and $H_1$: the null hypothesis (status quo) and the alternative (the claim you want to support).
  2. Pick a significance level $\alpha$: usually 0.05 or 0.01.
  3. Compute the test statistic from your data ($z$, $t$, $\chi^2$, etc.).
  4. Find the p-value: the probability of seeing data at least this extreme if $H_0$ were true.
  5. Decide: if $p < \alpha$, reject $H_0$; otherwise fail to reject.

Note: "fail to reject" ≠ "accept $H_0$". You merely don't have enough evidence against it.
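
Steps 3 to 5 can be sketched in a few lines of Python. This is a minimal illustration, assuming the test statistic follows a standard normal distribution under $H_0$ (that is, a z-test). The names `p_value_from_z` and `decide` are hypothetical helpers introduced here, not part of any library, and the input z = 1.5 is a made-up value.

```python
from statistics import NormalDist

def p_value_from_z(z, tail="two-sided"):
    """p-value for a z statistic, assuming Z ~ N(0, 1) under H0."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two-sided":
        return 2 * std_normal.cdf(-abs(z))
    if tail == "less":      # H1: parameter is smaller than the null value
        return std_normal.cdf(z)
    if tail == "greater":   # H1: parameter is larger than the null value
        return 1 - std_normal.cdf(z)
    raise ValueError(f"unknown tail: {tail!r}")

def decide(p_value, alpha=0.05):
    """Step 5: reject H0 only when p < alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

# Hypothetical input: z = 1.5 gives a two-sided p of about 0.13.
print(decide(p_value_from_z(1.5)))  # fail to reject H0
```

The choice of tail has to come from $H_1$ as stated in step 1, before looking at the data.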

One-sample z-test (worked example)

A factory claims its bulbs last 1000 hours on average ($\sigma = 50$). You test 25 bulbs and measure $\bar x = 980$. Is the claim refuted at $\alpha = 0.05$?

  1. $H_0: \mu = 1000$, $H_1: \mu \ne 1000$.
  2. $\alpha = 0.05$, two-tailed.
  3. Test statistic: $z = \frac{\bar x - \mu_0}{\sigma / \sqrt{n}} = \frac{980 - 1000}{50/\sqrt{25}} = \frac{-20}{10} = -2$.
  4. p-value: $2 \cdot P(Z < -2) \approx 2 \cdot 0.0228 = 0.0456$.
  5. Since $0.0456 < 0.05$, reject $H_0$. The mean lifetime is significantly different from 1000 hours.
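
The arithmetic in this example can be checked with Python's standard library. This is a minimal sketch that uses only the numbers given in the example; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, xbar, alpha = 1000, 50, 25, 980, 0.05

standard_error = sigma / sqrt(n)          # 50 / 5 = 10
z = (xbar - mu0) / standard_error         # -2.0
p_value = 2 * NormalDist().cdf(-abs(z))   # two-tailed p-value

print(f"z = {z:.2f}, p = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The unrounded p-value is about 0.0455. The 0.0456 in step 4 comes from rounding $P(Z < -2)$ to 0.0228 before doubling; the decision is the same either way.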

Picking the right test

| Situation | Test |
|---|---|
| One mean, $\sigma$ known | one-sample z-test |
| One mean, $\sigma$ unknown, $n$ small | one-sample t-test |
| Two means, independent samples | two-sample t-test |
| Two paired means | paired t-test |
| Proportion(s) | z-test for proportion |
| Goodness of fit / contingency | chi-square |

Type I vs Type II error

  • Type I: rejecting a true $H_0$. Probability = $\alpha$.
  • Type II: failing to reject a false $H_0$. Probability = $\beta$.
  • Power = $1 - \beta$: probability of correctly detecting a real effect.

These three move together: for a fixed sample size, shrinking $\alpha$ raises $\beta$; increasing the sample size lowers $\beta$ at a fixed $\alpha$, or lets you lower both.
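
To make the trade-off concrete, here is a rough sketch that computes $\beta$ and power for a two-sided one-sample z-test. It reuses $\mu_0 = 1000$ and $\sigma = 50$ from the bulb example and assumes, purely for illustration, that the true mean is 980. `z_test_power` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(mu0, mu_true, sigma, n, alpha):
    """Power of a two-sided one-sample z-test when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)     # rejection cutoff for |z|
    shift = (mu_true - mu0) / (sigma / sqrt(n))    # mean of z under mu_true
    # H0 is rejected when z < -z_crit or z > z_crit, with z ~ N(shift, 1).
    return std_normal.cdf(-z_crit - shift) + 1 - std_normal.cdf(z_crit - shift)

for n, alpha in [(25, 0.05), (25, 0.01), (100, 0.05), (100, 0.01)]:
    power = z_test_power(1000, 980, 50, n, alpha)
    print(f"n={n:3d} alpha={alpha:.2f} beta={1 - power:.3f} power={power:.3f}")
```

Under these assumptions, power is roughly 0.52 at n = 25 and $\alpha = 0.05$, falls to about 0.28 when $\alpha$ is tightened to 0.01, and rises to about 0.98 when n grows to 100 at $\alpha = 0.05$.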

Common mistakes

  • "p-value = probability $H_0$ is true": false. The p-value is $P(\text{data at least this extreme} \mid H_0)$, not $P(H_0 \mid \text{data})$.
  • Multiple comparisons: running 20 tests at $\alpha = 0.05$ when every null hypothesis is true produces about 1 false positive on average ($20 \times 0.05$), and roughly a 64% chance of at least one if the tests are independent. Use a correction such as Bonferroni.
  • Conflating significance with importance: a tiny effect with huge $n$ can be highly significant yet practically irrelevant.
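
The last two pitfalls are easy to see numerically. The sketch below assumes independent tests with every null hypothesis true for the first part, and uses made-up numbers (a 0.5-hour shift measured on one million bulbs) for the second. Bonferroni is one common correction, shown here only as an example.

```python
from math import erfc, sqrt

alpha, m = 0.05, 20

# Multiple comparisons: m independent tests, all null hypotheses true.
expected_false_positives = m * alpha        # 20 * 0.05
family_wise_error = 1 - (1 - alpha) ** m    # P(at least one false positive)
bonferroni_alpha = alpha / m                # per-test threshold after correction
print(f"expected false positives: {expected_false_positives:.1f}")
print(f"P(at least one): {family_wise_error:.3f}")
print(f"Bonferroni per-test alpha: {bonferroni_alpha:.4f}")

# Significance vs importance: hypothetical 0.5-hour shift with n = 1,000,000.
mu0, sigma, n, xbar = 1000, 50, 1_000_000, 999.5
z = (xbar - mu0) / (sigma / sqrt(n))        # -0.5 / 0.05
# Two-sided p-value 2 * P(Z < -|z|), written with erfc so that very small
# tail probabilities do not round to zero.
p_value = erfc(abs(z) / sqrt(2))
print(f"z = {z:.1f}, p = {p_value:.1e}")
```

In the second part the p-value is astronomically small even though the mean moved by only 0.05% of the claimed lifetime, which is why an effect size should be reported alongside the p-value.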

Try with the AI Hypothesis Test Solver

Use the Hypothesis Test Solver to plug in your data and get the test statistic, p-value, and decision.

AI-Math Editorial Team

A small team of engineers, mathematicians, and educators behind AI-Math, focused on making step-by-step math help accessible to every student.