The gradient of a function $f(x_1, \dots, x_n)$ is the vector of all partial derivatives: $\nabla f = \left( \frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n} \right)$.
Geometric interpretation: at any point, $\nabla f$ points in the direction of steepest ascent, and its magnitude $\|\nabla f\|$ equals the rate of change of $f$ in that direction.
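A quick numerical sanity check of the definition, using a hypothetical function $f(x, y) = x^2 + 3y^2$ (the function and the evaluation point are illustrative assumptions, not from the text): each component of the analytic gradient should match a finite-difference estimate of the corresponding partial derivative.

```python
def f(x, y):
    # Hypothetical example function (an assumption): f(x, y) = x**2 + 3*y**2
    return x**2 + 3 * y**2

def grad_f(x, y):
    # Analytic gradient: (df/dx, df/dy) = (2x, 6y)
    return (2 * x, 6 * y)

# Central-difference check of each partial derivative at the point (1, 2)
h = 1e-6
x, y = 1.0, 2.0
num_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
num_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
print(grad_f(x, y))    # analytic: (2.0, 12.0)
print(num_dx, num_dy)  # numeric: close to 2.0 and 12.0
```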
To find local maxima/minima, set $\nabla f = 0$ and check second-order conditions. To minimise a function (e.g. an ML loss), repeatedly step in the direction $-\nabla f$: the update $x_{t+1} = x_t - \eta \nabla f(x_t)$, with learning rate $\eta > 0$, is gradient descent, the backbone of modern machine learning. Variants (momentum, Adam, RMSprop) all build on this idea.
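The descent loop above can be sketched in a few lines. This is a minimal illustration on an assumed quadratic $f(x, y) = x^2 + 3y^2$; the function, starting point, learning rate, and step count are all hypothetical choices, not from the text.

```python
def grad_f(x, y):
    # Gradient of the assumed function f(x, y) = x**2 + 3*y**2
    return (2 * x, 6 * y)

def gradient_descent(x, y, eta=0.05, steps=500):
    # Repeatedly step opposite the gradient: x_{t+1} = x_t - eta * grad_f(x_t)
    for _ in range(steps):
        gx, gy = grad_f(x, y)
        x, y = x - eta * gx, y - eta * gy
    return x, y

x_min, y_min = gradient_descent(3.0, -2.0)
print(x_min, y_min)  # both coordinates approach 0, the unique minimiser
```

For this quadratic, each step shrinks $x$ by a factor $(1 - 2\eta)$ and $y$ by $(1 - 6\eta)$, so the iterates converge geometrically to the origin as long as $\eta$ is small enough.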
The gradient is perpendicular to the level curves of the function. The directional derivative in the direction of a unit vector $\mathbf{u}$ is $D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u}$.
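The dot-product formula can be verified directly: the slope of $f$ along a unit vector $\mathbf{u}$, measured by a finite difference, should equal $\nabla f \cdot \mathbf{u}$. The function, point, and direction below are illustrative assumptions.

```python
import math

def f(x, y):
    # Hypothetical example function (an assumption): f(x, y) = x**2 + 3*y**2
    return x**2 + 3 * y**2

x, y = 1.0, 2.0
grad = (2 * x, 6 * y)                   # analytic gradient at (1, 2)
theta = 0.7                             # arbitrary direction angle
u = (math.cos(theta), math.sin(theta))  # unit vector in that direction

analytic = grad[0] * u[0] + grad[1] * u[1]  # D_u f = grad_f . u

# Slope of f along u via a central difference
h = 1e-6
numeric = (f(x + h * u[0], y + h * u[1]) - f(x - h * u[0], y - h * u[1])) / (2 * h)
print(analytic, numeric)  # the two values agree closely
```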