Derivatives of vector-valued functions (computing Jacobians)

Main message

Derivatives of a scalar with respect to a scalar are relatively straightforward. Derivatives of vector-valued functions take more bookkeeping, but they are not impossibly difficult. Well-chosen matrix and array operations make the process much easier.

from IPython.display import YouTubeVideo; YouTubeVideo('S1Z8_B4tdTQ', width=1024, height=576)

Video transcript available on YouTube and here.

What are vector-valued functions?

A vector-valued function is any function that returns multiple values (outputs). These are quite common in engineering models; rarely do we only care about scalar inputs and scalar outputs to a model. Here are some examples of vector-valued functions in the wild:

structural stresses in different members in an aircraft wing
thickness of the tower for a wind turbine from base to top
thrust outputs of an engine at different operating points
altitude of an aircraft along different points in its trajectory

Intuitively, each of these quantities should be considered together, so it makes sense to contain them within a vector or array. You could consider each of them as scalar values, but it would be unreasonably difficult to track them through your model and perform computations efficiently.

In a general sense, think of a “vector” as including arbitrarily dimensioned arrays and tensors. Strictly mathematically, “vector-valued functions” only refer to functions that produce multidimensional outputs, but this lesson is relevant even in the case of multidimensional inputs and single-dimensional outputs. I’ll be covering the general case of arbitrarily sized inputs and outputs.

Here are three examples of engineering functions that are not scalar-to-scalar.

Angle between two 3D vectors

This function has 6 inputs but a scalar output, so it is technically not a vector-valued function. It’s still quite a relevant engineering function. Don’t read too much into the proper definition of “vector-valued function” and what’s included or not; just know that in this lesson we’re talking about anything that’s not a scalar-to-scalar function.

\[\begin{split} \begin{align*} {\bf a} &= \begin{bmatrix} a_{x} \\ a_{y} \\ a_{z} \\ \end{bmatrix} \end{align*} \end{split}\]

\[\begin{split} \begin{align*} {\bf b} &= \begin{bmatrix} b_{x} \\ b_{y} \\ b_{z} \\ \end{bmatrix} \end{align*} \end{split}\]

\[ \theta = \cos^{-1}\biggl(\frac{{\bf a} \cdot {\bf b}}{|{\bf a}| |{\bf b}|}\biggr) \]
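The formula above can be sketched in a few lines of NumPy. This is a minimal illustration, not part of the original lesson: the function name `angle_between` and the two sample vectors are my own choices, and the clipping step is an added safeguard against floating-point round-off.

```python
import numpy as np


def angle_between(a, b):
    """Return the angle in radians between two 3D vectors a and b."""
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Round-off can push the ratio slightly outside [-1, 1], which would
    # make arccos return NaN, so clip it first.
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))


# Illustrative inputs: two perpendicular unit vectors
a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0])

theta = angle_between(a, b)
print(theta)  # six scalar inputs in, one scalar out
```

Note that the function takes six numbers in (three per vector) and returns a single number, which is exactly the "many inputs, one output" case described above.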

Exit velocity of a tennis ball in 2D

I’ve got a backyard tennis ball launcher where you can choose the launch angle and speed. Let’s ignore the 3-dimensional aspect of the world for now and think in 2D. This results in a two-dimensional input (launch angle \(\theta\) and speed \(v\)) and a two-dimensional output for the velocity (x and y components). This is a vector-valued function.

\[\begin{split} \begin{align*} {\bf v} &= \begin{bmatrix} v_{x} \\ v_{y} \\ \end{bmatrix} &= \begin{bmatrix} v \cos(\theta) \\ v \sin(\theta) \\ \end{bmatrix} \end{align*} \end{split}\]
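As a quick sketch, the launcher model maps two inputs to a length-2 array. The function name `launch_velocity`, the input ordering (angle first, then speed), and the sample values are assumptions for illustration; the angle is taken to be in radians.

```python
import numpy as np


def launch_velocity(theta, v):
    """Return [v_x, v_y] for launch angle theta (radians) and speed v."""
    return np.array([v * np.cos(theta), v * np.sin(theta)])


# Illustrative inputs: a 30 degree launch at a speed of 20 (any consistent unit)
vel = launch_velocity(np.pi / 6, 20.0)
print(vel.shape)  # two inputs in, two outputs out
print(vel)
```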

Force-displacement history for an automotive shock absorber

Let’s say you’re a suspension engineer for a car company. You are trying to model the displacement of the suspension system across a car’s simulated route. Simplifying a lot of physics, we can obtain the displacement (\(x\)) of the shock absorber by knowing the force (\(F\)) acting on it and the spring constant (\(k\)): \(x = F/k\). Given a recording of force measurements from 1000 timepoints in a lab test, we can compute the corresponding shock displacement at each point.

With 1000 force inputs and 1000 displacement outputs, this is a vector-valued function.
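Here is a minimal sketch of that computation with array operations. The spring constant and the force history are made-up stand-ins (no real lab data is used); the point is only that one elementwise division handles all 1000 timepoints at once.

```python
import numpy as np

# Assumed spring constant in N/m (illustrative value, not from the lesson)
k = 25_000.0

# Synthetic stand-in for the 1000 recorded force measurements, in N
F = np.linspace(-500.0, 500.0, 1000)

# x = F / k applied elementwise: 1000 inputs in, 1000 outputs out
x = F / k

print(F.shape, x.shape)
```

Writing this as a single array expression instead of a loop over timepoints is what makes it practical to track the quantity through a larger model.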

Brief math theory of derivative arrays (Jacobians)

I’ll now detail some basic theory behind the derivatives of vector-valued functions. Sec. 6.1 in Engineering Design Optimization also presents this information.

In the case of a function \(f(x)\) where the input and output are both scalars, the derivative is itself a single scalar, which we can think of as a \(1\times 1\) array:

\[ \begin{equation} \underset{1\times 1}{\frac{\partial f}{\partial x}} \end{equation} \]

In the general case we have an array called the Jacobian, which contains the partial derivative of every output with respect to every input. Its size is set by the number of inputs \(n_x\) and the number of outputs \(n_f\) as follows:

\[\begin{split} J_{f}=\frac{\partial f}{\partial x}=\left[\begin{array}{c} \nabla f_{1}^\intercal \\ \vdots \\ \nabla f_{n_{f}}^\intercal \end{array}\right]=\underbrace{\left[\begin{array}{ccc} \frac{\partial f_{1}}{\partial x_{1}} & \cdots & \frac{\partial f_{1}}{\partial x_{n_{x}}} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_{n_{f}}}{\partial x_{1}} & \cdots & \frac{\partial f_{n_{f}}}{\partial x_{n_{x}}} \end{array}\right]}_{\left(n_{f} \times n_{x}\right)} \end{split}\]

Here is a reproduction of Ex. 6.1 from Engineering Design Optimization to show how to obtain the derivatives of a simple vector-valued function.

\[\begin{split} f(x)=\left[\begin{array}{l} f_{1}\left(x_{1}, x_{2}\right) \\ f_{2}\left(x_{1}, x_{2}\right) \end{array}\right]=\left[\begin{array}{c} x_{1} x_{2}+\sin x_{1} \\ x_{1} x_{2}+x_{2}^{2} \end{array}\right] \end{split}\]

Differentiating symbolically, we get:

\[\begin{split} \frac{\partial f}{\partial x}=\left[\begin{array}{cc} x_{2}+\cos x_{1} & x_{1} \\ x_{2} & x_{1}+2 x_{2} \end{array}\right] \end{split}\]

Evaluating this at \(x=(\pi/4, 2)\) yields

\[\begin{split} \frac{\partial f}{\partial x}=\left[\begin{array}{ll} 2.707 & 0.785 \\ 2.000 & 4.785 \end{array}\right] \end{split}\]
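The worked example above can be checked numerically. The sketch below codes the function and its symbolic Jacobian, then compares against a simple forward finite-difference approximation. The helper name `finite_difference_jacobian` and the step size `h` are my own choices for illustration, and finite differencing is used here only as a sanity check, not as a recommended way to compute derivatives.

```python
import numpy as np


def f(x):
    """The two-output function from the example above."""
    x1, x2 = x
    return np.array([x1 * x2 + np.sin(x1), x1 * x2 + x2**2])


def jacobian(x):
    """Symbolic Jacobian: rows are outputs, columns are inputs."""
    x1, x2 = x
    return np.array([
        [x2 + np.cos(x1), x1],
        [x2, x1 + 2.0 * x2],
    ])


def finite_difference_jacobian(func, x, h=1e-6):
    """Forward-difference approximation, one column per perturbed input."""
    x = np.asarray(x, dtype=float)
    f0 = func(x)
    J = np.zeros((f0.size, x.size))
    for j in range(x.size):
        x_step = x.copy()
        x_step[j] += h
        J[:, j] = (func(x_step) - f0) / h
    return J


x0 = np.array([np.pi / 4, 2.0])
J_exact = jacobian(x0)
J_fd = finite_difference_jacobian(f, x0)

print(np.round(J_exact, 3))
print(np.max(np.abs(J_exact - J_fd)))  # should be small
```

Each pass through the loop perturbs one input and fills one column of the Jacobian, which mirrors the \(n_f \times n_x\) layout in the definition.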

You can think of computing the Jacobian entries as individually computing derivatives of scalars with respect to scalars and putting them in an array. But I would not suggest thinking that way, especially in more complex cases. It’s helpful to know that deep down all vector-valued functions are just scalar functions smashed together. Often, though, there will be patterns in the derivatives that you should harness when computing Jacobians for vector-valued functions.
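The shock absorber model \(x = F/k\) is one example of such a pattern. Each displacement depends only on the force at the same timepoint, so \(\partial x_i / \partial F_j\) is \(1/k\) when \(i = j\) and zero otherwise: the 1000-by-1000 Jacobian is diagonal. The sketch below, with an assumed spring constant and a hypothetical perturbation `dF`, shows that storing only the 1000 diagonal entries gives the same result as the full matrix.

```python
import numpy as np

k = 25_000.0  # assumed spring constant in N/m (illustrative value)
n = 1000      # number of timepoints

# Pattern-aware storage: only the n diagonal entries, each equal to 1/k
diag_entries = np.full(n, 1.0 / k)

# Naive storage: the full n-by-n Jacobian, almost entirely zeros
J_dense = np.diag(diag_entries)

# Hypothetical perturbation of the force history
dF = np.ones(n)

dx_dense = J_dense @ dF     # full matrix-vector product
dx_diag = diag_entries * dF  # elementwise product using the diagonal only

print(J_dense.shape, np.count_nonzero(J_dense))
print(np.allclose(dx_dense, dx_diag))
```

The dense array holds a million entries to represent a thousand nonzero values, which is the kind of waste that recognizing the pattern avoids.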
