A partial derivative measures how a multivariable function changes with respect to one variable while holding the others fixed. For f(x, y), the partial derivative with respect to x is
∂f/∂x = lim (h → 0) [ f(x + h, y) − f(x, y) ] / h,
with y held fixed; ∂f/∂y is defined the same way with x held fixed.
The notation ∂ (curly d) distinguishes partial derivatives ∂f/∂x from ordinary derivatives df/dx. Equivalent notations include ∂f/∂x, f_x, and D_x f.
Geometric meaning: ∂f/∂x(a, b) is the slope of the surface z = f(x, y) at the point (a, b, f(a, b)) in the x-direction; the tangent line lies in the plane y = b.
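The limit definition can be checked numerically. The sketch below is illustrative only: the function f(x, y) = x²y is an assumption chosen for simplicity, not from this page. It approximates ∂f/∂x and ∂f/∂y with small finite differences and compares them to the analytic partials.

```python
# Illustrative sketch: finite-difference approximations of partial derivatives.
# The function f(x, y) = x**2 * y is an arbitrary example, not from the page.

def f(x, y):
    return x**2 * y

def partial_x(f, x, y, h=1e-6):
    # Hold y fixed and nudge x: (f(x + h, y) - f(x, y)) / h
    return (f(x + h, y) - f(x, y)) / h

def partial_y(f, x, y, h=1e-6):
    # Hold x fixed and nudge y
    return (f(x, y + h) - f(x, y)) / h

x0, y0 = 1.5, 2.0
print(partial_x(f, x0, y0), 2 * x0 * y0)   # numeric vs analytic 2xy
print(partial_y(f, x0, y0), x0**2)         # numeric vs analytic x^2
```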
Why this matters: gradient descent, optimization, error propagation, and most of vector calculus rest on partial derivatives. The gradient points in the direction of steepest ascent.
To find ∂f/∂x, treat y (and any other variables) as constants and differentiate f as a single-variable function of x.
Example:
For f(x, y) = x²y + sin(xy):
∂f/∂x = 2xy + y cos(xy) and ∂f/∂y = x² + x cos(xy)
The y inside the parenthesis of sin(xy) is treated as a constant coefficient of x when differentiating with respect to x.
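The partials of this illustrative example can be verified symbolically, assuming SymPy is available:

```python
# Verify the example's partial derivatives symbolically with SymPy.
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 * y + sp.sin(x * y)

print(sp.diff(f, x))   # 2*x*y + y*cos(x*y)
print(sp.diff(f, y))   # x**2 + x*cos(x*y)
```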
Clairaut's theorem (mixed partials): if f has continuous second partials, then f_xy = f_yx, i.e. ∂²f/∂y∂x = ∂²f/∂x∂y. Order of differentiation doesn't matter.
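As a quick check on the illustrative example above: starting from ∂f/∂x = 2xy + y cos(xy), differentiating in y gives f_xy = 2x + cos(xy) − xy sin(xy); starting from ∂f/∂y = x² + x cos(xy), differentiating in x gives f_yx = 2x + cos(xy) − xy sin(xy). Both orders agree, as the theorem predicts.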
The gradient ∇f is the vector of all first partials: ∇f = (∂f/∂x, ∂f/∂y) for f(x, y), and (∂f/∂x₁, …, ∂f/∂xₙ) in general.
The directional derivative of f in the direction of a unit vector u is: D_u f = ∇f · u.
It is maximized when u points along ∇f; this is the steepest-ascent direction, and the maximum value is |∇f|.
For example, if ∇f(a, b) = (3, 4) and u = (3/5, 4/5), the unit vector along ∇f, then D_u f = 3·(3/5) + 4·(4/5) = 5 = |∇f(a, b)|.
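The same point can be illustrated numerically; the sketch below assumes NumPy and reuses the example gradient (3, 4):

```python
# Directional derivatives D_u f = grad_f . u for a few unit vectors u.
import numpy as np

g = np.array([3.0, 4.0])                 # example gradient from above

for u in [np.array([1.0, 0.0]),          # along the x-axis
          np.array([0.0, 1.0]),          # along the y-axis
          g / np.linalg.norm(g)]:        # along the gradient (steepest ascent)
    print(u, g @ u)

print(np.linalg.norm(g))                 # 5.0: the largest directional derivative
```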
An ordinary derivative df/dx applies to single-variable functions. A partial derivative ∂f/∂x applies to multivariable functions and measures the rate of change with respect to one variable while holding the others fixed.
If a function f(x,y) has continuous second-order partial derivatives, then the mixed partials are equal: f_xy = f_yx. The order of differentiation doesn't matter in that case.
The gradient is a vector pointing in the direction of the steepest ascent of f at a point. Its magnitude is the maximum rate of change at that point. It's also perpendicular to level curves and level surfaces of f.
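An illustrative check of the perpendicularity claim (the function f(x, y) = x² + y² is an assumption chosen for simplicity): its level curves are the circles x² + y² = c. At a point (x, y) on such a circle the tangent direction is (−y, x), while ∇f = (2x, 2y), and (−y)(2x) + (x)(2y) = 0, so the gradient is perpendicular to the level curve.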
Gradient descent uses the gradient (vector of partials) of the loss function with respect to model parameters. The algorithm updates parameters in the negative gradient direction to minimize loss.
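A minimal gradient-descent sketch under assumed choices; the quadratic loss, learning rate, and step count are illustrative and not from this page:

```python
# Gradient descent on a simple quadratic loss L(w) = (w1 - 3)^2 + (w2 + 1)^2.
# The loss, learning rate, and step count are illustrative assumptions.
import numpy as np

def grad_L(w):
    # Vector of partial derivatives: (dL/dw1, dL/dw2)
    return np.array([2 * (w[0] - 3.0), 2 * (w[1] + 1.0)])

w = np.zeros(2)                # initial parameters
lr = 0.1                       # learning rate

for _ in range(100):
    w = w - lr * grad_L(w)     # step opposite the gradient to reduce the loss

print(w)                       # approaches the minimizer (3, -1)
```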