Chain rule
In calculus, the chain rule is a formula for the derivative of the composite of two functions.
In intuitive terms, if a variable, y, depends on a second variable, u, which in turn depends on a third variable, x, then the rate of change of y with respect to x can be computed as the rate of change of y with respect to u multiplied by the rate of change of u with respect to x.
Definition
The chain rule states that
:<math> (f \circ g)'(x) = (f(g(x)))' = f'(g(x)) g'(x),\,</math>
which in short form is written as <math> (f \circ g)' = (f'\circ g)\cdot g'</math>.
Alternatively, in the Leibniz notation, the chain rule is
:<math>\frac {df}{dx} = \frac {df} {dg} \cdot \frac {dg}{dx}.</math>
In integration, the counterpart to the chain rule is the substitution rule.
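The formula can be checked numerically. The following is a minimal sketch, not a definitive implementation: the helper names `derivative` and `chain_rule`, the step size, and the sample functions f(u) = exp(u) and g(x) = 3x + 1 are all assumptions made for illustration.

```python
import math

def derivative(func, x, h=1e-6):
    """Central-difference approximation of func'(x) (step h is an assumed default)."""
    return (func(x + h) - func(x - h)) / (2 * h)

def chain_rule(f_prime, g, g_prime, x):
    """Evaluate f'(g(x)) * g'(x), the right-hand side of the chain rule."""
    return f_prime(g(x)) * g_prime(x)

# Illustrative choice of functions: f(u) = exp(u), g(x) = 3x + 1
f = math.exp
f_prime = math.exp            # the exponential is its own derivative
g = lambda x: 3 * x + 1
g_prime = lambda x: 3.0

x0 = 0.2
numeric = derivative(lambda x: f(g(x)), x0)   # differentiate the composite directly
exact = chain_rule(f_prime, g, g_prime, x0)   # apply the chain rule
print(numeric, exact)
```

The two printed values should agree up to the error of the finite-difference approximation.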
Examples
Example I
Suppose that one is climbing a mountain at a rate of 0.5 kilometres per hour. The temperature is lower at higher elevations; suppose it decreases at a rate of 6 °C per kilometre. Multiplying 6 °C per kilometre by 0.5 kilometres per hour gives 3 °C per hour, so the climber experiences a temperature drop of 3 °C per hour. This calculation is a typical application of the chain rule.
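The arithmetic above can be sketched in a few lines. The starting elevation of 0 km and the base temperature of 20 °C are assumptions introduced only so that the functions can be evaluated; they do not affect the rate.

```python
def elevation_km(t_hours):
    # Climbing at 0.5 km per hour (starting elevation of 0 km is assumed)
    return 0.5 * t_hours

def temperature_c(h_km, base_temp=20.0):
    # Temperature falls by 6 degrees C per km (base_temp is an assumed value)
    return base_temp - 6.0 * h_km

dT_dh = -6.0             # degrees C per km; negative because temperature decreases
dh_dt = 0.5              # km per hour
dT_dt = dT_dh * dh_dt    # chain rule: degrees C per hour

# Cross-check: actual temperature change over the first hour of climbing
observed = temperature_c(elevation_km(1.0)) - temperature_c(elevation_km(0.0))
print(dT_dt, observed)
```

Because both functions are linear, the change over one hour matches the product of the two rates exactly.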
Example II
Consider <math>f(x) = (x^2 + 1)^3</math>. We have <math>f(x)=h(g(x))</math> where <math>g(x) = x^2 + 1</math> and <math>h(x) = x^3</math>. Since <math>h'(x) = 3x^2</math> and <math>g'(x) = 2x</math>, the chain rule gives
:<math>f'(x) = 3(x^2 + 1)^2(2x) = 6x(x^2 + 1)^2. \,</math>
To differentiate the trigonometric function
:<math>f(x) = \sin(x^2),\,</math>
one can write <math>f(x) = h(g(x))</math> with <math>h(x) = \sin x</math> and <math>g(x) = x^2</math>. The chain rule then yields
:<math>f'(x) = 2x \cos(x^2), \,</math>
since <math>h'(g(x)) = \cos (x^2)</math> and <math>g'(x) = 2x</math>.
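Both results can be compared against a finite-difference approximation. This is a rough sketch; the helper `derivative`, the step size, and the evaluation point 1.3 are illustrative assumptions.

```python
import math

def derivative(func, x, h=1e-6):
    """Central-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Each pair is (function, derivative obtained from the chain rule)
examples = [
    (lambda x: (x**2 + 1)**3, lambda x: 6 * x * (x**2 + 1)**2),
    (lambda x: math.sin(x**2), lambda x: 2 * x * math.cos(x**2)),
]

x0 = 1.3
errors = [abs(derivative(func, x0) - func_prime(x0)) for func, func_prime in examples]
print(errors)
```

Both entries of `errors` should be small compared with the size of the derivatives themselves.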
Example III
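Differentiate <math>\arctan(\sin x)</math>. The derivative of the arctangent is
:<math>\frac{d}{dx}\arctan\,x\,=\,\frac{1}{1+x^2},</math>
so for any differentiable function f the chain rule gives
:<math>\frac{d}{dx}\arctan\,f(x)\,=\,\frac{f'(x)}{1+f(x)^2}.</math>
Taking <math>f(x) = \sin x</math> yields
:<math>\frac{d}{dx}\arctan(\sin x)\,=\,\frac{\cos\,x}{1+\sin^2\,x}.</math>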
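The same computation can be carried out mechanically by propagating a value together with its derivative through each function, applying the chain rule at every step. The sketch below is a toy version of this idea; the `Dual` class and the wrapper functions `sin` and `arctan` are hypothetical names defined here and do not belong to any library.

```python
import math
from dataclasses import dataclass

@dataclass
class Dual:
    val: float   # value of the expression at the point of interest
    der: float   # derivative of the expression with respect to x

def sin(u):
    # Chain rule: (sin u)' = cos(u) * u'
    return Dual(math.sin(u.val), math.cos(u.val) * u.der)

def arctan(u):
    # Chain rule: (arctan u)' = u' / (1 + u^2)
    return Dual(math.atan(u.val), u.der / (1.0 + u.val ** 2))

x0 = 0.7
result = arctan(sin(Dual(x0, 1.0)))   # seed with dx/dx = 1
closed_form = math.cos(x0) / (1.0 + math.sin(x0) ** 2)
print(result.der, closed_form)
```

Here `result.der` is assembled one chain-rule step at a time and should match the closed-form derivative obtained above.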
Chain rule for several variables
The chain rule also works for functions of more than one variable. Consider the function <math>z = f(x,y)</math> where <math>x = g(t)</math> and <math>y = h(t)</math>, with f, g and h differentiable. Then z depends on t alone, and
:<math>{dz \over dt}={\partial f \over \partial x}{dx \over dt}+{\partial f \over \partial y}{dy \over dt}.</math>
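As an illustration, the sketch below evaluates this formula along a curve and compares it with a finite-difference derivative. The particular choices f(x, y) = x²y, g(t) = cos t and h(t) = sin t, as well as the helper names, are assumptions made for the example.

```python
import math

def f(x, y):
    return x**2 * y

def df_dx(x, y):
    return 2 * x * y

def df_dy(x, y):
    return x**2

def g(t):            # x = g(t)
    return math.cos(t)

def h(t):            # y = h(t)
    return math.sin(t)

def dg_dt(t):
    return -math.sin(t)

def dh_dt(t):
    return math.cos(t)

def dz_dt(t):
    """Sum of the contributions through x and through y."""
    x, y = g(t), h(t)
    return df_dx(x, y) * dg_dt(t) + df_dy(x, y) * dh_dt(t)

t0 = 0.4
step = 1e-6
# Central difference of the composite t -> f(g(t), h(t))
numeric = (f(g(t0 + step), h(t0 + step)) - f(g(t0 - step), h(t0 - step))) / (2 * step)
print(dz_dt(t0), numeric)
```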
Now suppose that <math>z = f(u,v)</math>, where <math>u = h(x,y)</math> and <math>v = g(x,y)</math> are themselves functions of two variables, and that all of these functions are differentiable. Then the chain rule reads
:<math>{\partial z \over \partial x}={\partial z \over \partial u}{\partial u \over \partial x}+{\partial z \over \partial v}{\partial v \over \partial x}</math>
:<math>{\partial z \over \partial y}={\partial z \over \partial u}{\partial u \over \partial y}+{\partial z \over \partial v}{\partial v \over \partial y}</math>
If we regard <math>\vec r = (u,v)</math> as a vector-valued function of x and y, the first of these equations can be written as the dot product of the gradient of f and the partial derivative of <math>\vec r</math>, and similarly for y:
:<math>\frac{\partial f}{\partial x}=\vec \nabla f \cdot \frac{\partial \vec r}{\partial x}</math>
More generally, for functions of vectors to vectors, the chain rule says that the Jacobian matrix of a composite function is the product of the Jacobian matrices of the two functions:
:<math>\frac{\partial(z_1,\ldots,z_m)}{\partial(x_1,\ldots,x_p)} = \frac{\partial(z_1,\ldots,z_m)}{\partial(y_1,\ldots,y_n)} \frac{\partial(y_1,\ldots,y_n)}{\partial(x_1,\ldots,x_p)}</math>
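The Jacobian form can be illustrated with two small maps from the plane to itself. This is a sketch under stated assumptions: the maps `inner` and `outer`, their hand-written Jacobians, and the helpers `matmul` and `numeric_jac` are all hypothetical and chosen only for the example.

```python
import math

def inner(x):
    # y = (x1 * x2, x1 + x2^2)
    x1, x2 = x
    return [x1 * x2, x1 + x2 ** 2]

def inner_jac(x):
    x1, x2 = x
    return [[x2, x1],
            [1.0, 2 * x2]]

def outer(y):
    # z = (y1^2 + y2, sin(y1) * y2)
    y1, y2 = y
    return [y1 ** 2 + y2, math.sin(y1) * y2]

def outer_jac(y):
    y1, y2 = y
    return [[2 * y1, 1.0],
            [math.cos(y1) * y2, math.sin(y1)]]

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def numeric_jac(func, x, h=1e-6):
    """Central-difference Jacobian; entry [i][j] approximates d z_i / d x_j."""
    cols = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = func(xp), func(xm)
        cols.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    return [list(row) for row in zip(*cols)]   # transpose columns into rows

x0 = [0.5, -1.2]
chain = matmul(outer_jac(inner(x0)), inner_jac(x0))     # product of Jacobians
numeric = numeric_jac(lambda x: outer(inner(x)), x0)    # Jacobian of the composite
max_err = max(abs(chain[i][j] - numeric[i][j]) for i in range(2) for j in range(2))
print(max_err)
```

Note that the outer Jacobian is evaluated at `inner(x0)`, not at `x0`, mirroring the factor f'(g(x)) in the one-variable rule.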
Proof of the chain rule
Let f and g be functions and let x be a number such that f is differentiable at g(x) and g is differentiable at x. Then by the definition of differentiability,
:<math> g(x+\delta)-g(x)= \delta g'(x) + \epsilon(\delta)\delta \,</math>
where <math> \epsilon(\delta) \to 0 \,</math> as <math>\delta\to 0.</math> Similarly,
:<math> f(g(x)+\alpha) - f(g(x)) = \alpha f'(g(x)) + \eta(\alpha)\alpha \,</math>
where <math>\eta(\alpha) \to 0 \,</math> as <math>\alpha\to 0</math>; we may set <math>\eta(0) = 0</math>, so that <math>\eta</math> is continuous at 0. Now
:<math> f(g(x+\delta))-f(g(x)) = f(g(x) + \delta g'(x)+\epsilon(\delta)\delta) - f(g(x)) = \alpha_\delta f'(g(x)) + \eta(\alpha_\delta)\alpha_\delta, \,</math>
where <math>\alpha_\delta = \delta g'(x) + \epsilon(\delta)\delta \,</math>. Observe that as <math>\delta\to 0,</math> <math>\frac{\alpha_\delta}{\delta}\to g'(x)</math> and <math>\alpha_\delta \to 0</math>, thus <math>\eta(\alpha_\delta)\to 0</math>. Dividing by <math>\delta</math> therefore gives
:<math> \frac{f(g(x+\delta))-f(g(x))}{\delta} = \frac{\alpha_\delta}{\delta} f'(g(x)) + \eta(\alpha_\delta)\frac{\alpha_\delta}{\delta} \to g'(x)f'(g(x))\mbox{ as } \delta \to 0.</math>
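The limits used in the proof can be observed numerically. The sketch below tabulates the two ratios for shrinking δ; the choices f = sin, g(x) = x² and the point x = 1 are assumptions made for illustration, and the variable `alpha` plays the role of α_δ = g(x+δ) − g(x).

```python
import math

f = math.sin                # outer function
g = lambda x: x ** 2        # inner function
x0 = 1.0
limit = math.cos(g(x0)) * 2 * x0    # f'(g(x)) * g'(x)

rows = []
for k in range(1, 6):
    delta = 10.0 ** (-k)
    alpha = g(x0 + delta) - g(x0)                       # alpha_delta in the proof
    quotient = (f(g(x0 + delta)) - f(g(x0))) / delta    # difference quotient of the composite
    rows.append((delta, alpha / delta, quotient))
    print(delta, alpha / delta, quotient)

print("limit:", limit)
```

As δ shrinks, the middle column should approach g'(1) = 2 and the last column should approach the value printed as `limit`.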
The fundamental chain rule
The chain rule is a fundamental property of all definitions of derivative and is therefore valid in much more general contexts. For instance, if E, F and G are Banach spaces (which include Euclidean spaces) and f : E → F and g : F → G are functions, and if x is an element of E such that f is differentiable at x and g is differentiable at f(x), then the derivative (the Fréchet derivative) of the composition g ∘ f at the point x is given by
:<math>\mbox{D}_x\left(g \circ f\right) = \mbox{D}_{f\left(x\right)}\left(g\right) \circ \mbox{D}_x\left(f\right).</math>
Note that the derivatives here are linear maps and not numbers. If the linear maps are represented as matrices (namely Jacobians), the composition on the right-hand side turns into a matrix multiplication.
A particularly clear formulation of the chain rule can be achieved in the most general setting: let M, N and P be <math>C^k</math> manifolds (or even Banach manifolds) and let
:f : M → N and g : N → P
be differentiable maps. The derivative of f, denoted by df, is then a map from the tangent bundle of M to the tangent bundle of N, and we may write
:<math>\mbox{d}\left(g \circ f\right) = \mbox{d}g \circ \mbox{d}f.</math>
In this way, the formation of derivatives and tangent bundles is seen as a functor on the category of <math>C^\infty</math> manifolds with <math>C^\infty</math> maps as morphisms.
Tensors and the chain rule
See tensor field for an advanced explanation of the fundamental role the chain rule plays in the geometric nature of tensors.
Higher derivatives
Faà di Bruno's formula generalizes the chain rule to higher derivatives. The first few derivatives are
:<math>\frac{df}{dx} = \frac{df}{dg}\frac{dg}{dx}</math>
:<math>\frac{d^2 f}{d x^2} = \frac{d^2 f}{d g^2}\left(\frac{dg}{dx}\right)^2 + \frac{df}{dg}\frac{d^2 g}{dx^2}</math>
:<math>\frac{d^3 f}{d x^3} = \frac{d^3 f}{d g^3} \left(\frac{dg}{dx}\right)^3 + 3 \frac{d^2 f}{d g^2} \frac{dg}{dx} \frac{d^2 g}{d x^2} + \frac{df}{dg} \frac{d^3 g}{d x^3}</math>
:<math>\frac{d^4 f}{d x^4} = \frac{d^4 f}{dg^4} \left(\frac{dg}{dx}\right)^4 + 6 \frac{d^3 f}{d g^3} \left(\frac{dg}{dx}\right)^2 \frac{d^2 g}{d x^2} + \frac{d^2 f}{d g^2} \left\{ 4 \frac{dg}{dx} \frac{d^3 g}{dx^3} + 3\left(\frac{d^2 g}{dx^2}\right)^2\right\} + \frac{df}{dg}\frac{d^4 g}{dx^4}</math>
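The second-derivative formula can be spot-checked numerically. The following is a minimal sketch; the outer function exp, the inner function sin, the step size and the evaluation point are illustrative assumptions.

```python
import math

# Outer function f(u) = exp(u): f' = f'' = exp
f = math.exp
f1 = math.exp
f2 = math.exp

# Inner function g(x) = sin(x): g' = cos, g'' = -sin
g = math.sin
g1 = math.cos
g2 = lambda x: -math.sin(x)

def second_derivative_formula(x):
    """f''(g) * (g')^2 + f'(g) * g'', the second-order case listed above."""
    return f2(g(x)) * g1(x) ** 2 + f1(g(x)) * g2(x)

def second_difference(func, x, h=1e-4):
    """Central second-difference approximation of func''(x)."""
    return (func(x + h) - 2 * func(x) + func(x - h)) / h ** 2

x0 = 0.9
formula = second_derivative_formula(x0)
numeric = second_difference(lambda x: f(g(x)), x0)
print(formula, numeric)
```

A larger step is used here than for first derivatives, since the second difference divides by h² and is more sensitive to rounding error.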

