Chain Rule
Introduction
Our goal is to find the derivatives of compositions of functions such as \(\cos\left(e^{x^2}+3x\sin x\right)\) and \(\cos(x^2+9)\).
Composition of functions
We will first review the composition of functions. For more background, see Functions in Mathematics.
Assume that \(u(x)=x^3+4x\) and \(v(x)=\cos x\). Then \(v\) is a function that maps a real number \(x\) to \(\cos x\). The function \(u\) maps any real number \(x\) to \(x^3+4x\). In particular \(u(7)=7^3+4\cdot 7\), \(u(\star)=\star^3+4\cdot \star\), and \(u(v(x))=(v(x))^3+4\cdot v(x)= \cos^3x+4\cos x\). The composition \(u(v(x))\) is another function and it maps \(x\) to \(\cos^3x+4\cos x\). If we call \(f(x)=\cos^3x+4\cos x\) we can talk about the derivative of \(f\).
Example
Assume that \(f(x)=2^x\) and \(g(x)=5\cdot x-6\). Let \(h\) be the function defined as \(h(x)=f(g(x))\) and let \(m\) be the function defined as \(m(x)=g(f(x))\). Determine \(h(3)\) and \(m(3)\).
We have that \(h(x)=f(g(x))\), hence \begin{eqnarray*}
h(x)&=& f(g(x))= 2^{g(x)}=2^{5x-6}.
\end{eqnarray*}
Therefore \[h(3)=2^{5\cdot 3-6}=2^{9}=512.\]
On the other hand, \[m(x)=g(f(x))=5\cdot f(x)-6=5\cdot 2^x-6,\] hence
\[m(3)=5\cdot 2^3-6=5\cdot 8-6=34.\]
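The computation above can be replayed in a few lines of Python. This is only an illustrative sketch; the function names mirror the ones in the example, and the point is that the order of composition matters.

```python
def f(x):
    # f(x) = 2^x
    return 2 ** x


def g(x):
    # g(x) = 5x - 6
    return 5 * x - 6


def h(x):
    # h = f composed with g: g is applied first, then f
    return f(g(x))


def m(x):
    # m = g composed with f: f is applied first, then g
    return g(f(x))


print(h(3))  # 512
print(m(3))  # 34
```

The two results differ, which shows that \(f(g(x))\) and \(g(f(x))\) are in general different functions.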
Derivative of a composition of functions
We will now derive a formula for the derivative of a composition of functions \(f(x)=u(v(x))\) that involves the derivatives of \(u\) and \(v\). This formula is known as the chain rule.
Theorem (Chain rule)
Assume that \(u\) and \(v\) are two functions such that \(v\) is differentiable at a point \(a\), and \(u\) is differentiable at the point \(v(a)\). Then the function \(f(x)=u(v(x))\) is differentiable at \(a\) and its derivative satisfies: \[f^{\prime}(a)=u^{\prime}(v(a))\cdot v^{\prime}(a).\]
We will prove that the limit \(\lim_{h\to 0}\frac{f(a+h)-f(a)}{h}\) exists and is equal to \(u^{\prime}(v(a))\cdot v^{\prime}(a)\).
Our first step is to rewrite the limit in the following form:
\[\lim_{h\to 0}\frac{f(a+h)-f(a)}{h}=\lim_{h\to0}\frac{u(v(a+h))-u(v(a))}{h}=\lim_{h\to 0}\frac{u(v(a+h))-u(v(a))}{v(a+h)-v(a)}\cdot\frac{v(a+h)-v(a)}{h}.\]
The previous step is problematic if \(v(a+h)-v(a)=0\) for values of \(h\) arbitrarily close to \(0\), because then we would be dividing by zero. This case can be handled with some care, but here we will assume that it does not occur. We evaluate the previous limit by calculating the following two limits and taking their product:
\[\lim_{h\to 0}\frac{u(v(a+h))-u(v(a))}{v(a+h)-v(a)}\;\;\;\;\;\mbox{and}\;\;\;\;\;\lim_{h\to 0}\frac{v(a+h)-v(a)}h.\]
The second limit is \(v^{\prime}(a)\), by the definition of the derivative. For the first one we notice that \(v\) is continuous at \(a\) because it is differentiable there, hence \(v(a+h)-v(a)\) converges to \(0\) as \(h\to 0\). We may substitute \(\varepsilon =v(a+h)-v(a)\). Then the limit becomes:
\[\lim_{h\to 0}\frac{u(v(a+h))-u(v(a))}{v(a+h)-v(a)}=\lim_{\varepsilon \to 0}\frac{u(v(a)+\varepsilon)-u(v(a))}{\varepsilon}=u^{\prime}(v(a)),\]
by the definition of the derivative of \(u\) at \(v(a)\). Therefore the function \(f\) is differentiable at \(a\) and \(f^{\prime}(a)=u^{\prime}(v(a))\cdot v^{\prime}(a)\).
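For readers who want to see how the case set aside in the proof can be treated, here is a sketch of one standard argument. The auxiliary function \(\varphi\) is our own notation and does not appear elsewhere in this text; the idea is to avoid dividing by \(v(a+h)-v(a)\) altogether.

```latex
% Auxiliary function: the difference quotient of u at v(a), extended to k = 0.
\[
\varphi(k)=\begin{cases}
\dfrac{u(v(a)+k)-u(v(a))}{k}, & k\neq 0,\\[1ex]
u^{\prime}(v(a)), & k=0.
\end{cases}
\]
% Since u is differentiable at v(a), \varphi is continuous at 0.
% Moreover, for every k (including k = 0) we have:
\[
u(v(a)+k)-u(v(a))=\varphi(k)\cdot k.
\]
% Substitute k = v(a+h)-v(a) and divide by h \neq 0:
\[
\frac{f(a+h)-f(a)}{h}
=\varphi\bigl(v(a+h)-v(a)\bigr)\cdot\frac{v(a+h)-v(a)}{h}
\;\longrightarrow\;\varphi(0)\cdot v^{\prime}(a)
=u^{\prime}(v(a))\cdot v^{\prime}(a)
\quad\mbox{as } h\to 0.
\]
```

The limit of the first factor uses the continuity of \(v\) at \(a\) together with the continuity of \(\varphi\) at \(0\). No assumption about the zeros of \(v(a+h)-v(a)\) is needed.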
Let us consider the following example.
Example
Prove that the function \(f(x)=\cos(3x+x^2)\) is differentiable and find its derivative \(f^{\prime}(x)\) for \(x\in\mathbb R\).
We first want to express \(f\) as a composition of two functions and then apply the chain rule to the composition. If our expression for \(f\) is to be of the form \(f(x)=u(v(x))\) then \(v\) is the function that "attacks \(x\) first." We may choose \(v(x)=3x+x^2\). If we choose \(u(y)=\cos y\) the miracle happens: the composition \(u(v(x))\) is precisely our function \(f\). Both \(u\) and \(v\) are differentiable on \(\mathbb R\), so the chain rule implies that \(f\) is differentiable at every \(x\in\mathbb R\).
Therefore:
\[f^{\prime}(x)=u^{\prime}(v(x))\cdot v^{\prime}(x)=-\sin(v(x))\cdot v^{\prime}(x)=-\sin(3x+x^2)\cdot (3+2x).\]
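A quick numerical sanity check of this result can be done as follows. This is a sketch, not a proof: it compares the chain-rule formula with a central difference quotient, and the helper name `central_difference`, the step size, and the sample points are our own choices.

```python
import math


def v(x):
    # inner function v(x) = 3x + x^2
    return 3 * x + x ** 2


def f(x):
    # f(x) = u(v(x)) with u = cos
    return math.cos(v(x))


def f_prime(x):
    # chain rule: u'(v(x)) * v'(x) = -sin(3x + x^2) * (3 + 2x)
    return -math.sin(v(x)) * (3 + 2 * x)


def central_difference(func, x, step=1e-6):
    # symmetric difference quotient approximating func'(x)
    return (func(x + step) - func(x - step)) / (2 * step)


for x in (-1.0, 0.0, 0.5, 2.0):
    exact = f_prime(x)
    approx = central_difference(f, x)
    print(f"x={x:5.2f}  chain rule={exact: .8f}  numeric={approx: .8f}")
```

The two columns should agree to several decimal places; a noticeable mismatch would indicate a mistake in applying the chain rule.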
Derivative of the inverse function
We will derive a formula that will help us in finding derivatives of \(\ln\) and inverse trigonometric functions.
Theorem (Derivative of inverse function)
Let \(f:\mathbb R\to\mathbb R\) be an invertible function that is continuously differentiable in a neighborhood of \(a\in\mathbb R\), with \(f^{\prime}(a)\neq 0\). Then its inverse \(g(z)=f^{-1}(z)\) is differentiable at \(b=f(a)\) and its derivative satisfies \[g^{\prime}(b)=\frac1{f^{\prime}\left(f^{-1}(b)\right)}.\]
We only derive the formula and take the differentiability of \(g\) for granted. We have that \(x=f(g(x))\) and using the chain rule we obtain \[1=x^{\prime}=\left(f(g(x))\right)^{\prime}=f^{\prime}(g(x))\cdot g^{\prime}(x).\]
This implies that \[g^{\prime}(x)=\frac1{f^{\prime}(g(x))}=\frac1{f^{\prime}\left(f^{-1}(x)\right)}.\]
We can now find the derivatives of \(\ln\), \(\arcsin\), \(\arccos\), and \(\arctan\). We carry out the computation for \(\ln\) and \(\arcsin\).
Let \(f(x)=e^x\). Then \(f^{-1}(x)=\ln x\) and according to the previous theorem we have \[\left(\ln x\right)^{\prime}=\frac1{f^{\prime}(\ln x)}=\frac1{e^{\ln x}}=\frac1x,\;\; \;\mbox{for }x>0.\]
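Let \(f(x)=\sin x\) for \(x\in\left[-\frac{\pi}2,\frac{\pi}2\right]\). Then \(f^{-1}(x)=\arcsin x\), hence \[\left(\arcsin x\right)^{\prime}=\frac1{f^{\prime}(\arcsin x)}=\frac1{\cos(\arcsin x)}=\frac1{\sqrt{1-x^2}},\;\; \;\mbox{for }x\in(-1,1).\] In the last step we used \(\cos^2y=1-\sin^2y\) with \(y=\arcsin x\), together with the fact that \(\cos y>0\) because \(y\in\left(-\frac{\pi}2,\frac{\pi}2\right)\).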
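The formula \(g^{\prime}(x)=\frac1{f^{\prime}(g(x))}\) can also be checked numerically for these two inverses. The sketch below uses only Python's standard `math` module; the helper names `inverse_derivative` and `central_difference` and the sample points are our own, chosen for illustration.

```python
import math


def central_difference(func, x, step=1e-6):
    # symmetric difference quotient approximating func'(x)
    return (func(x + step) - func(x - step)) / (2 * step)


def inverse_derivative(f_prime, g, x):
    # derivative of the inverse g of f at x, computed as 1 / f'(g(x))
    return 1.0 / f_prime(g(x))


# ln is the inverse of exp, and exp' = exp
x = 2.5
ln_formula = inverse_derivative(math.exp, math.log, x)
ln_numeric = central_difference(math.log, x)
print(ln_formula, ln_numeric, 1 / x)

# arcsin is the inverse of sin on [-pi/2, pi/2], and sin' = cos
y = 0.3
arcsin_formula = inverse_derivative(math.cos, math.asin, y)
arcsin_numeric = central_difference(math.asin, y)
print(arcsin_formula, arcsin_numeric, 1 / math.sqrt(1 - y ** 2))
```

In each printed line the three numbers should agree up to small numerical error.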
Practice problems
Problem 1. If \(u(x)=\sin x\) and \(v(x)=x^2\), determine \(u(v(x))\).
The composition of the functions \(u\) and \(v\) is: \(u(v(x))=u(x^2)=\sin(x^2)\).
Problem 2. If \(u(x)= x^3\), \(v(x)=x+4\), and \(w(x)=\cos (x+1)\), determine \(u(v(w(x)))\).
The composition of these functions is: \[ u(v(w(x)))=u(v(\cos(x+1)))=u(\cos(x+1)+4)=\left(\cos(x+1)+4\right)^3.\]
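Compositions of several functions can be built mechanically. The following sketch defines a small helper, `compose`, which is our own hypothetical name and not part of any library, and uses it to confirm the formula above at a sample point.

```python
import math
from functools import reduce


def compose(*functions):
    # compose(u, v, w)(x) equals u(v(w(x))): the last function is applied first
    def composed(x):
        return reduce(lambda value, func: func(value), reversed(functions), x)
    return composed


def u(x):
    return x ** 3


def v(x):
    return x + 4


def w(x):
    return math.cos(x + 1)


f = compose(u, v, w)

x = 0.7  # arbitrary sample point
print(f(x))
print((math.cos(x + 1) + 4) ** 3)
```

Both printed values come from the same expression \(\left(\cos(x+1)+4\right)^3\), so they should coincide.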
Problem 3. Find the derivative of the function \(f(x)=\cos(x^3)\).
We first write \(f(x)=u(v(x))\) where \(v(x)=x^3\) and \(u(y)=\cos y\). Then we have \(f^{\prime}(x)=u^{\prime}(v(x))\cdot v^{\prime}(x)=-\sin(v(x))\cdot v^{\prime}(x)=-\sin(x^3)\cdot 3x^2=-3x^2\sin(x^3)\).
Problem 4. Find the derivative of the function \(f(x)=e^{-\frac{x^2}2}\).
We express \(f\) as a composition of functions \(f(x)=u(v(x))\) where \(v(x)=-\frac{x^2}2\) and \(u(x)=e^x\).
Then we have \[f^{\prime}(x)=u^{\prime}(v(x))\cdot v^{\prime}(x)=e^{v(x)}\cdot \left(-\frac{x^2}2\right)^{\prime}=e^{v(x)}\cdot (-x)=-xe^{-\frac{x^2}2}.\]
Problem 5. Assume that \(u\) and \(v\) are the functions that satisfy: \(u(0)=1\), \(u(1)=2\), \(u(2)=3\), \(v(0)=2\), \(v(1)=3\), \(v(2)=4\), \(u^{\prime}(0)=4\), \(u^{\prime}(1)=5\), \(u^{\prime}(2)=6\), \(v^{\prime}(0)=7\), \(v^{\prime}(1)=8\), \(v^{\prime}(2)=9\). Let \(f(x)=u(v(x))\). Find \(f^{\prime}(0)\).
Using the chain rule we obtain \(f^{\prime}(x)=u^{\prime}(v(x))\cdot v^{\prime}(x)\), hence \[f^{\prime}(0)=u^{\prime}(v(0))\cdot v^{\prime}(0)=u^{\prime}(2)\cdot 7=6\cdot 7=42.\]
Remark. This problem gives us more data than we need. Do not let that confuse you.
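The lookup in this problem can be mirrored with small tables. In this sketch the dictionaries simply store the values given in the problem statement; representing the functions as dictionaries is our own device for illustration.

```python
# Values given in Problem 5, stored as lookup tables
u = {0: 1, 1: 2, 2: 3}
v = {0: 2, 1: 3, 2: 4}
u_prime = {0: 4, 1: 5, 2: 6}
v_prime = {0: 7, 1: 8, 2: 9}


def f_prime(x):
    # chain rule: f'(x) = u'(v(x)) * v'(x)
    # note that u' is evaluated at v(x), not at x
    return u_prime[v[x]] * v_prime[x]


print(f_prime(0))  # 42
```

Only \(v(0)\), \(u^{\prime}(2)\), and \(v^{\prime}(0)\) are actually used. Calling `f_prime(1)` would raise a `KeyError`, because \(v(1)=3\) and the problem does not provide \(u^{\prime}(3)\).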