Derivatives of vector-valued functions (computing Jacobians)

Main message

Derivatives of a scalar with respect to a scalar might be relatively straightforward. Derivatives of vector-valued functions are not impossibly difficult. You can use intelligent matrix and array operations to facilitate the process.

from IPython.display import YouTubeVideo

YouTubeVideo('S1Z8_B4tdTQ', width=1024, height=576)

Video transcript available on YouTube and here.

What are vector-valued functions?

A vector-valued function is any function that returns multiple values (outputs). These are quite common in engineering models; rarely do we only care about scalar inputs and scalar outputs to a model. Here are some examples of vector-valued functions in the wild:

  • structural stresses in different members in an aircraft wing

  • thickness of the tower for a wind turbine from base to top

  • thrust outputs of an engine at different operating points

  • altitude of an aircraft along different points in its trajectory

Intuitively, each of these quantities should be considered together, so it makes sense to contain them within a vector or array. You could consider each of them as scalar values, but it would be unreasonably difficult to track them through your model and perform computations efficiently.

In a general sense, think of a “vector” as including arbitrarily dimensioned arrays and tensors. Strictly speaking, the term “vector-valued function” refers only to functions that produce multidimensional outputs, but this lesson is relevant even in the case of multidimensional inputs and a single scalar output. I’ll be covering the general case of arbitrarily sized inputs and outputs.

Here are three examples of engineering functions that are not scalar-to-scalar.

Angle between two 3D vectors

Despite having 6 inputs, this is technically not a vector-valued function because the output is a scalar. It’s still quite a relevant engineering function. Don’t read too much into the proper definition of “vector-valued function” and what’s included or not. Just know that in this lesson we’re talking about anything that’s not a scalar-to-scalar function.

\[\begin{split} \begin{align*} {\bf a} &= \begin{bmatrix} a_{x} \\ a_{y} \\ a_{z} \\ \end{bmatrix} \end{align*} \end{split}\]
\[\begin{split} \begin{align*} {\bf b} &= \begin{bmatrix} b_{x} \\ b_{y} \\ b_{z} \\ \end{bmatrix} \end{align*} \end{split}\]
\[ \theta = \cos^{-1}\biggl(\frac{{\bf a} \cdot {\bf b}}{|{\bf a}| |{\bf b}|}\biggr) \]
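As a rough illustration, the angle formula above could be written with NumPy as follows. The function name `angle_between` and the test vectors are my own choices for this sketch, and the clipping step is an added safeguard against round-off rather than part of the formula itself.

```python
import numpy as np


def angle_between(a, b):
    """Return the angle in radians between two 3D vectors a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Round-off can push the ratio slightly outside [-1, 1], so clip before arccos
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))


# Six scalar inputs (two 3D vectors) go in, one scalar comes out
theta = angle_between([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

The two perpendicular unit vectors used here should give an angle of pi/2.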

Exit velocity of a tennis ball in 2D

I’ve got a backyard tennis ball launcher where you can choose the launch angle and speed. Let’s ignore the 3-dimensional aspect of the world right now and think in 2D. This results in a two-dimensional input (launch angle \(\theta\) and speed \(v\)) and a two-dimensional output for the velocity (x and y components). This is a vector-valued function.

\[\begin{split} \begin{align*} {\bf v} &= \begin{bmatrix} v_{x} \\ v_{y} \\ \end{bmatrix} = \begin{bmatrix} v \cos(\theta) \\ v \sin(\theta) \\ \end{bmatrix} \end{align*} \end{split}\]
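Here is a minimal sketch of that function in NumPy. The name `launch_velocity` and the numbers passed to it are hypothetical, and I am assuming the angle is given in radians.

```python
import numpy as np


def launch_velocity(speed, angle):
    """Map (speed, launch angle in radians) to the velocity components [v_x, v_y]."""
    return np.array([speed * np.cos(angle), speed * np.sin(angle)])


# Hypothetical launch settings: 10 m/s at 30 degrees
v = launch_velocity(10.0, np.deg2rad(30.0))
```

Two inputs go in and an array with two entries comes out, which is what makes this a vector-valued function.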

Force-displacement history for an automotive shock absorber

Let’s say you’re a suspension engineer for a car company. You are trying to model the displacement of the suspension system across a car’s simulated route. Simplifying a lot of physics, we can obtain the displacement (\(x\)) of the shock absorber by knowing the force (\(F\)) acting on it and the spring constant (\(k\)): \(x = F/k\). Given a recording of force measurements from 1000 timepoints in a lab test, we can compute the corresponding shock displacement at each point.

This results in 1000 inputs (the forces) and 1000 outputs (the displacements), so this is a vector-valued function.
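A quick sketch of this computation with NumPy is shown here. The spring constant and the random force values are placeholders I made up to stand in for real lab data.

```python
import numpy as np

k = 25000.0  # spring constant in N/m (hypothetical value)

# Stand-in for 1000 recorded force measurements in N (random placeholder data)
rng = np.random.default_rng(0)
force = rng.uniform(-500.0, 500.0, size=1000)

# One array operation computes all 1000 displacements at once
displacement = force / k
```

Treating the forces as a single array avoids writing a loop over 1000 separate scalar computations.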

Brief math theory of derivative arrays (Jacobians)

I’ll now detail some basic theory behind the derivatives of vector-valued functions. Sec. 6.1 in Engineering Design Optimization also presents this information.

In the case of a function \(f(x)\) where the input and output are both scalars, we get:

\[ \begin{equation} \underset{1\times 1}{\frac{\partial f}{\partial x}} \end{equation} \]

In the general case we have an array called the Jacobian, which contains the derivative information for vector-valued functions. Each row holds the gradient of one output with respect to all of the inputs. Its size is based on the number of inputs \(n_x\) and the number of outputs \(n_f\) as follows:

\[\begin{split} J_{f}=\frac{\partial f}{\partial x}=\left[\begin{array}{c} \nabla f_{1}^\intercal \\ \vdots \\ \nabla f_{n_{f}}^\intercal \end{array}\right]=\underbrace{\left[\begin{array}{ccc} \frac{\partial f_{1}}{\partial x_{1}} & \cdots & \frac{\partial f_{1}}{\partial x_{n_{x}}} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_{n_{f}}}{\partial x_{1}} & \cdots & \frac{\partial f_{n_{f}}}{\partial x_{n_{x}}} \end{array}\right]}_{\left(n_{f} \times n_{x}\right)} \end{split}\]
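To make the \(n_f \times n_x\) layout concrete, here is a small sketch that approximates a Jacobian one column at a time with forward finite differences. Finite differencing is only a convenient stand-in to show the shape, not a recommended way to get accurate derivatives. The helper `fd_jacobian`, the step size, and the example function `g` are all hypothetical.

```python
import numpy as np


def fd_jacobian(func, x, step=1e-6):
    """Approximate the (n_f x n_x) Jacobian of func at x with forward differences."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    f0 = np.atleast_1d(func(x))
    jac = np.zeros((f0.size, x.size))  # rows are outputs, columns are inputs
    for j in range(x.size):
        x_pert = x.copy()
        x_pert[j] += step
        # Perturbing input j fills in column j
        jac[:, j] = (np.atleast_1d(func(x_pert)) - f0) / step
    return jac


def g(x):
    """Hypothetical function with 3 inputs and 2 outputs."""
    return np.array([x[0] * x[1], x[1] + x[2] ** 2])


jac = fd_jacobian(g, [1.0, 2.0, 3.0])
```

With 3 inputs and 2 outputs, `jac` has 2 rows and 3 columns.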

Here is a reproduction of Ex. 6.1 from Engineering Design Optimization to show how to obtain the derivatives of a simple vector-valued function.

\[\begin{split} f(x)=\left[\begin{array}{l} f_{1}\left(x_{1}, x_{2}\right) \\ f_{2}\left(x_{1}, x_{2}\right) \end{array}\right]=\left[\begin{array}{c} x_{1} x_{2}+\sin x_{1} \\ x_{1} x_{2}+x_{2}^{2} \end{array}\right] \end{split}\]

Differentiating symbolically, we get:

\[\begin{split} \frac{\partial f}{\partial x}=\left[\begin{array}{cc} x_{2}+\cos x_{1} & x_{1} \\ x_{2} & x_{1}+2 x_{2} \end{array}\right] \end{split}\]

Evaluating this at \(x=(\pi/4, 2)\) yields

\[\begin{split} \frac{\partial f}{\partial x}=\left[\begin{array}{ll} 2.707 & 0.785 \\ 2.000 & 4.785 \end{array}\right] \end{split}\]
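If you want to check these numbers yourself, a short NumPy sketch like this one should work. The function names `f` and `jacobian` are my own labels, and the central-difference comparison is an extra sanity check that is not part of the textbook example.

```python
import numpy as np


def f(x):
    """The two-output function from the example above."""
    x1, x2 = x
    return np.array([x1 * x2 + np.sin(x1), x1 * x2 + x2**2])


def jacobian(x):
    """Symbolic Jacobian of f, typed in by hand."""
    x1, x2 = x
    return np.array([[x2 + np.cos(x1), x1],
                     [x2, x1 + 2.0 * x2]])


x0 = np.array([np.pi / 4, 2.0])
J = jacobian(x0)

# Central-difference check: each unit vector e perturbs one input and gives one column
h = 1e-6
J_fd = np.column_stack([(f(x0 + h * e) - f(x0 - h * e)) / (2.0 * h) for e in np.eye(2)])

print(np.round(J, 3))
```

Rounded to three decimals, `J` should match the matrix above, and `J_fd` should agree with it to within the finite-difference error.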

You can think of computing the Jacobian entries as individually computing derivatives of scalars with respect to scalars and putting them in an array. But I would not suggest thinking that way, especially in more complex cases. It’s helpful to know that deep down all vector-valued functions are just scalar functions smashed together. Often, though, there will be patterns in the derivatives that you should harness when computing Jacobians for vector-valued functions.
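The shock absorber example from earlier shows one such pattern. Because each displacement \(x_i = F_i / k\) depends only on its own force, \(\partial x_i / \partial F_j\) is \(1/k\) when \(i = j\) and zero otherwise, so the Jacobian is diagonal. Here is a minimal sketch of that idea. The spring constant is a made-up value and the variable names are my own.

```python
import numpy as np

n = 1000     # number of timepoints
k = 25000.0  # spring constant in N/m (hypothetical value)

# Dense form: the full 1000 x 1000 Jacobian built in one array operation
jac_dense = np.eye(n) / k

# Compact form: keep only the 1000 diagonal entries instead of a million values
jac_diag = np.full(n, 1.0 / k)
```

Recognizing the diagonal structure means you never have to compute or store the 999,000 entries that are known to be zero.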