Linear regression finds the straight line that best fits a set of data points. "Best" is defined by the least squares criterion: minimising the sum of squared vertical distances between the line and the points.
The slope and intercept have closed-form solutions:
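For data points $(x_i, y_i)$, $i = 1, \dots, n$, with sample means $\bar{x}$ and $\bar{y}$, the least squares estimates (writing $\hat{\beta}$ for the slope and $\hat{\alpha}$ for the intercept) are:

```latex
\hat{\beta} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x}
```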
The coefficient of determination, R², measures fit quality: it ranges from 0 to 1, and values closer to 1 indicate a better fit.
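A minimal sketch of the full computation, using NumPy and a small made-up data set (the numbers here are illustrative, not from the text):

```python
import numpy as np

# Hypothetical sample data, roughly following y = 2x
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Closed-form least squares estimates
x_mean, y_mean = x.mean(), y.mean()
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

# Coefficient of determination: 1 - (residual sum of squares / total sum of squares)
y_hat = intercept + slope * x
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y_mean) ** 2)
r2 = 1.0 - ss_res / ss_tot
```

For this data the slope comes out close to 2 and R² close to 1, as expected for nearly-linear points.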
Linear regression is the simplest predictive model and the foundation of more sophisticated methods:
- Multiple regression uses several inputs.
- Logistic regression adapts the idea for binary outcomes.
- Ridge / Lasso add regularisation.
- Modern machine learning's "linear models" are direct descendants.
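To make the ridge bullet concrete, here is a sketch of the ridge closed form $(X^\top X + \lambda I)^{-1} X^\top y$ with NumPy. The design matrix, targets, and λ are invented for illustration, and for simplicity the intercept column is penalised along with the other coefficients (in practice it is usually left unpenalised):

```python
import numpy as np

# Toy design matrix: intercept column plus two inputs (hypothetical data)
X = np.array([[1.0, 0.5, 1.2],
              [1.0, 1.5, 0.7],
              [1.0, 2.5, 2.2],
              [1.0, 3.5, 1.9]])
y = np.array([1.1, 2.0, 3.4, 4.2])

lam = 0.1  # regularisation strength (assumed value)

# Ridge solution: solve (X^T X + lam * I) beta = X^T y
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# Ordinary least squares, for comparison
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
```

The penalty shrinks the coefficient vector: the ridge solution always has norm no larger than the OLS solution's.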
Despite its simplicity, linear regression remains heavily used in finance (CAPM), epidemiology, and economics, and as a baseline against which more complex models must justify their added complexity.