The chi-square ($\chi^2$) test is the standard tool for categorical data. The test statistic is

$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$

where $O_i$ are the observed counts and $E_i$ are the counts expected under $H_0$.
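As a minimal sketch of the statistic in its usual Pearson form (squared differences between observed and expected counts, each divided by the expected count, summed over cells), the snippet below uses only the standard library. The function name `chi_square_statistic` and the die-roll counts are made up for illustration.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up counts from 60 rolls of a die; a fair die expects 10 per face.
observed = [8, 12, 9, 11, 10, 10]
expected = [sum(observed) / len(observed)] * len(observed)

statistic = chi_square_statistic(observed, expected)
print(round(statistic, 6))  # 1.0
```

A value this small relative to the number of categories gives no evidence against fairness; turning it into a p-value requires the chi-square distribution with the appropriate degrees of freedom.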
Three common variants:
- Goodness-of-fit: does the observed distribution match a theoretical one? (Is a die fair?) $df = k - 1$ for $k$ categories.
- Independence: are two categorical variables independent? (Is gender independent of voting preference?) $df = (r-1)(c-1)$ for $r \times c$ contingency tables.
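- Variance test: does a normal population have a hypothesized variance $\sigma_0^2$? Uses $(n-1)s^2/\sigma_0^2$ with $df = n - 1$; less common.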
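For the independence variant, the expected count in each cell under independence is (row total × column total) / grand total, and the statistic then has the same Pearson form as in goodness-of-fit. The sketch below is one way to compute it by hand; the helper names and the 2 × 2 table are hypothetical, and no continuity correction is applied.

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def independence_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    expected = expected_counts(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical 2 x 2 table: rows are groups, columns are preferences.
table = [[30, 20],
         [20, 30]]

statistic, df = independence_statistic(table)
print(statistic, df)  # 4.0 1
```

With one degree of freedom the 5% critical value is about 3.84, so a statistic of 4.0 would reject independence at that level. Note that `scipy.stats.chi2_contingency` applies Yates' continuity correction to 2 × 2 tables by default, so it would report a smaller statistic than this uncorrected one.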
Assumption: expected counts must be sufficiently large (typically at least 5 in each cell). For small samples, use Fisher's exact test instead.
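To illustrate the small-sample fallback, here is a rough standard-library sketch of Fisher's exact test for a 2 × 2 table. It assumes one common two-sided convention (sum the probabilities of every table that is no more likely than the observed one, with margins held fixed); other conventions exist. The function name and the table are invented for the example.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher exact p-value for a 2 x 2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def weight(k):
        # Number of tables with k in the top-left cell, given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    observed_weight = weight(a)
    # Sum every table that is no more likely than the observed one.
    extreme = sum(weight(k) for k in range(lo, hi + 1) if weight(k) <= observed_weight)
    return extreme / comb(n, col1)


# Hypothetical small-sample table; every expected count is 2, well below 5.
table = [[3, 1],
         [1, 3]]
p_value = fisher_exact_two_sided(table)
print(round(p_value, 4))  # 0.4857
```

Comparing integer table counts rather than floating-point probabilities avoids tie-breaking problems when two tables are equally likely.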
The chi-square distribution with $k$ degrees of freedom is the distribution of a sum of $k$ squared independent standard normal variables; its upper quantiles supply the critical values for these tests.
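The link between squared standard normals and critical values can be checked with a small simulation. The sketch below draws sums of $k$ squared standard normals and, for the special case $k = 1$, compares the simulated 95th percentile with the exact value, which is the square of the two-sided 5% normal quantile. The function name, seed, and sample size are arbitrary choices; in practice a library routine such as `scipy.stats.chi2.ppf` would normally be used for critical values.

```python
import random
from statistics import NormalDist, mean


def simulate_chi_square(k, n_draws, rng):
    """Draw n_draws values, each a sum of k squared standard normals."""
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k)) for _ in range(n_draws)]


rng = random.Random(0)  # fixed seed so the run is repeatable
k = 1
draws = sorted(simulate_chi_square(k, 20_000, rng))

# Empirical 95th percentile of the simulated draws.
simulated_critical = draws[int(0.95 * len(draws))]

# For k = 1 the exact value is the square of the two-sided normal quantile.
exact_critical = NormalDist().inv_cdf(0.975) ** 2

print(round(exact_critical, 3))  # 3.841
print(simulated_critical, mean(draws))  # should land near 3.84 and near k
```

The sample mean should also sit close to $k$, since a chi-square variable with $k$ degrees of freedom has mean $k$.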