Mathematical preliminaries — a toolbox for manifold mechanics

To carry Newton’s laws past the flat plane we need a new vocabulary — bases, the matrix exponential, and the shortest path to tangent vectors.

Opening

This book is a study note that rewrites Lagrangian and Hamiltonian mechanics in the language of manifolds. That work starts in chapter 1, but a few tools should already be in the reader’s hand. By the end of this prologue the reader should be able to write “change of basis” in a single line of indices and explain in one sentence why the matrix exponential $e^{tA}$ is our first example of a flow. The tangent space of chapter 4 and the vector fields of chapter 5 will then feel less like new abstractions and more like formalizations of the pictures sketched here.

Main 1 — Why we are setting up a toolbox

Newton’s $\mathbf{F} = m\ddot{\mathbf{x}}$ rests on the silent assumption that the position vector lives in $\mathbb{R}^3$. For the textbook pendulum, planetary motion, and collision problems this is enough. But consider the double pendulum. Its coordinates are the two rod angles $(\theta_1, \theta_2)$, and the configuration space is not a plane but the torus $T^2 = S^1 \times S^1$. The values $\theta_1 = 0$ and $\theta_1 = 2\pi$ name the same point, something ordinary planar calculus refuses to acknowledge.

In the same way, the orientation of a rigid body lives on the rotation group $SO(3)$, a curved three-dimensional space, and a particle constrained to a sphere lives on $S^2$. The common name for such spaces is manifold: a space that looks locally like $\mathbb{R}^n$ but globally need not. The formal definition is deferred to chapter 4, but the premise of this book is simple: if the configuration space is not flat, the vocabulary of calculus must be rewritten in a coordinate-independent way. The price we pay is one more pass over the basics of linear algebra.

Main 2 — Linear algebra refresher

A basis of a vector space $V$ is a set of vectors that is linearly independent and spans $V$. Once a basis $\{e_1, \dots, e_n\}$ is fixed, every vector $v \in V$ has unique components $(v^1, \dots, v^n)$. Throughout this book we use the Einstein summation convention: when the same index appears once up and once down, sum over it. So

$$v = v^i e_i \equiv \sum_{i=1}^{n} v^i e_i$$

is all we need to write. Upper indices mark components (contravariant), lower indices mark basis vectors (covariant). If a new basis $\{e_i'\}$ is given by $e_i' = A^j_{\ i} e_j$, then the new components $v'^i$ of the same vector $v$ are obtained by multiplying with the inverse matrix, $v'^i = (A^{-1})^i_{\ j} v^j$. This is change of basis, and the starting point of the tensor concept.
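
The change-of-basis rule can be checked numerically. The sketch below is illustrative only: the matrix `A`, the components `v_old`, and all variable names are made up for the example, and the old basis is assumed to be the standard basis of $\mathbb{R}^2$.

```python
import numpy as np

# Hypothetical example data, not taken from the text.
# Column i of A holds the old-basis components of the new basis vector e'_i,
# i.e. A[j, i] = A^j_i in e'_i = A^j_i e_j.
A = np.array([[2.0, 1.0],
              [0.0, 1.0]])
v_old = np.array([3.0, 4.0])  # components v^j in the old basis

# New components: v'^i = (A^{-1})^i_j v^j.
# Solving A v' = v gives the same result without forming the inverse.
v_new = np.linalg.solve(A, v_old)

# Rebuild the vector from new components and new basis: v = v'^i e'_i,
# whose old-basis components are A^j_i v'^i (sum over the repeated index i).
v_rebuilt = np.einsum("ji,i->j", A, v_new)

print("new components:", v_new)
print("same vector recovered:", np.allclose(v_rebuilt, v_old))
```

The `einsum` string mirrors the index expression directly: the repeated index `i` is summed and the free index `j` survives, which is the summation convention in executable form.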

The next tool is eigenvalues and eigenvectors. For an $n \times n$ matrix $A$, if a nonzero $v$ satisfies $A v = \lambda v$, then $\lambda \in \mathbb{C}$ is an eigenvalue of $A$ and $v$ is a corresponding eigenvector. Eigenvectors pick out the directions along which $A$ acts by pure scaling, with the eigenvalue as the scale factor; for a real matrix, complex eigenvalues signal a rotational part.
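
As a quick illustration, one can compare a pure stretch with the generator of planar rotation. The two matrices below are assumptions chosen for this sketch; `np.linalg.eig` is NumPy's general eigenvalue routine.

```python
import numpy as np

# Two illustrative matrices (assumed for this example).
stretch = np.array([[2.0, 0.0],
                    [0.0, 0.5]])   # scales the axes by 2 and 1/2
rotate = np.array([[0.0, -1.0],
                   [1.0, 0.0]])    # generator of planar rotation

for name, M in [("stretch", stretch), ("rotate", rotate)]:
    eigvals, eigvecs = np.linalg.eig(M)
    # Each column of eigvecs is an eigenvector; broadcasting multiplies
    # column j by eigvals[j], so this checks M v = lambda v for every pair.
    ok = np.allclose(M @ eigvecs, eigvecs * eigvals)
    print(name, "eigenvalues:", eigvals, "| M v = lambda v holds:", ok)
```

The stretch has real eigenvalues, while the rotation generator has a purely imaginary pair: no real direction is merely scaled by it.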

Now the protagonist of this chapter: the matrix exponential. It is just the scalar series $e^x = 1 + x + x^2/2! + \cdots$ ported to matrices. For an $n \times n$ matrix $A$,

$$e^{tA} = I + tA + \frac{(tA)^2}{2!} + \frac{(tA)^3}{3!} + \cdots$$

The series converges absolutely for every $A$ and every $t$. The linear ODE $\dot x = Ax$ with initial condition $x(0) = x_0$ has solution (accept this as fact for now) $x(t) = e^{tA} x_0$. The point to underline: applying $e^{tA}$ to $x_0$ as $t$ varies, i.e. letting the system run for time $t$, is the first concrete example of what we will later call a flow. $e^{tA}$ is a one-parameter family of linear maps indexed by $t \in \mathbb{R}$, satisfying $e^{(s+t)A} = e^{sA} e^{tA}$, and the cleanest specimen of the picture that chapter 5 will generalize.
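
Both claims, that $e^{tA} x_0$ solves $\dot x = Ax$ and that running for time $s$ and then $t$ equals running for $s + t$, can be spot-checked numerically. The following is a minimal sketch under stated assumptions: the matrix `A`, the initial vector `x0`, and the times `s`, `t` are arbitrary illustrative choices, and the exponential is approximated by a truncated series rather than a library routine.

```python
import numpy as np

def expm_series(M, terms=40):
    # Truncated Taylor series sum_{k < terms} M^k / k!; adequate for small ||M||.
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        result = result + term
    return result

# Illustrative matrix and initial condition (assumptions of this sketch).
A = np.array([[0.0, 1.0],
              [-2.0, -0.3]])   # a damped oscillator written as a first-order system
x0 = np.array([1.0, 0.0])
s, t = 0.4, 0.9

# Flow property: running for time s and then for time t equals running for s + t.
lhs = expm_series((s + t) * A)
rhs = expm_series(t * A) @ expm_series(s * A)
print("e^{(s+t)A} == e^{tA} e^{sA}:", np.allclose(lhs, rhs))

# ODE check: a central difference of x(t) = e^{tA} x0 should match A x(t).
h = 1e-5
x = lambda tau: expm_series(tau * A) @ x0
dxdt = (x(t + h) - x(t - h)) / (2 * h)
print("x'(t) == A x(t):", np.allclose(dxdt, A @ x(t), atol=1e-6))
```

A finite-difference check is not a proof, but it makes "accept this as fact for now" easier to swallow.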

Main 3 — Tangent vectors, intuitively

On a plane, a vector is the same vector no matter where you place it. Parallel transport is free. On the sphere S2S^2 the story changes. If you take an arrow at the equator pointing east and try to drag it to the north pole, it is not obvious in which direction the arrow should end up pointing.

The fix is to give each point its own vector space. A tangent vector at the point $p$ is, intuitively, “a velocity the surface allows at $p$.” All tangent vectors at $p$ together form a vector space called the tangent space $T_p M$. The tangent space at the north pole of $S^2$, written $T_{\text{N}} S^2$, is just the horizontal plane tangent to the sphere there, exactly the picture you would draw.

A touch more formally: take a curve $\gamma : (-\varepsilon, \varepsilon) \to M$ through $p$ with $\gamma(0) = p$. Its velocity at time zero, $\dot\gamma(0)$, is one tangent vector. In coordinates $(x^1, \dots, x^n)$ we can write $\dot\gamma(0) = \dot\gamma^i(0)\, \partial_i \big|_p$, and the set $\{\partial_i|_p\}$ plays the role of a basis for $T_p M$. Notice that this is formally identical to $v = v^i e_i$ from Main 2.
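
The “velocity of a curve” picture can be made concrete on the unit sphere. The sketch below assumes $S^2$ is embedded in $\mathbb{R}^3$ in the usual way, which the intrinsic definition does not require; the curve `gamma` and the angle `a` are hypothetical choices made for illustration.

```python
import numpy as np

def gamma(t):
    # Hypothetical curve on the unit sphere: a great circle through the
    # north pole N = (0, 0, 1), leaving it in the direction (cos a, sin a, 0).
    a = 0.7
    return np.array([np.sin(t) * np.cos(a), np.sin(t) * np.sin(a), np.cos(t)])

p = gamma(0.0)                                # the north pole
h = 1e-6
velocity = (gamma(h) - gamma(-h)) / (2 * h)   # central difference for gamma'(0)

print("point p:", p)
print("velocity at p:", velocity)
# In the embedded picture, a tangent vector at p is perpendicular to the radius vector p.
print("p . velocity =", float(np.dot(p, velocity)))
```

Changing `a` sweeps the velocity around the horizontal plane at the north pole, which is the plane described above as $T_{\text{N}} S^2$.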

For this chapter, the picture and the vocabulary are enough. The formal definition, the equivalence of different definitions, and basis changes are deferred to chapter 4. But one thing should be nailed in now: a vector field is a smooth assignment of one tangent vector to each point of the space, and letting points slide along it is what produces a flow. $e^{tA}$ is just the flat-space special case of that picture, the flow of the linear vector field $x \mapsto Ax$.
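
To see “sliding along a vector field” without using the exponential at all, one can step a point along the field numerically and compare with the closed form. This is a sketch, not a recommended integrator: the helper names `X` and `flow`, the step count, and the choice of the classical fourth-order Runge-Kutta scheme are all assumptions of the example.

```python
import numpy as np

A = np.array([[0.0, -1.0],
              [1.0, 0.0]])

def X(x):
    # Linear vector field on the plane: attaches the arrow A x to the point x.
    return A @ x

def flow(x0, t, steps=1000):
    # Approximate the flow of X by classical fourth-order Runge-Kutta steps.
    x = np.array(x0, dtype=float)
    dt = t / steps
    for _ in range(steps):
        k1 = X(x)
        k2 = X(x + 0.5 * dt * k1)
        k3 = X(x + 0.5 * dt * k2)
        k4 = X(x + dt * k3)
        x = x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

t = 1.0
x_numeric = flow([1.0, 0.0], t)
x_exact = np.array([np.cos(t), np.sin(t)])   # e^{tA} x0 for this particular A
print("sliding along the field:", x_numeric)
print("e^{tA} x0              :", x_exact)
```

Nothing in `flow` depends on `X` being linear, which is the point: the same stepping idea applies to any smooth vector field, while the closed form $e^{tA} x_0$ is available only in the linear case.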

In Python

# Check that the matrix exponential really produces a rotation.
# A = [[0,-1],[1,0]] is the generator of planar rotation;
# applying e^{tA} to x0 = (1,0) should trace the unit circle.
import numpy as np
import matplotlib.pyplot as plt

A = np.array([[0.0, -1.0], [1.0, 0.0]])
x0 = np.array([1.0, 0.0])

def expm_series(M, terms=40):
    # Truncated Taylor series: sums M^k / k! for k = 0 .. terms - 1.
    # At t = 2*pi, 20 terms leave an error of order 1e-3; 40 terms reach roundoff level.
    n = M.shape[0]
    result = np.eye(n)
    term = np.eye(n)
    for k in range(1, terms):
        term = term @ M / k
        result = result + term
    return result

ts = np.linspace(0.0, 2 * np.pi, 200)
xs = np.array([expm_series(t * A) @ x0 for t in ts])

# Compare with the closed form (cos t, sin t).
closed = np.array([[np.cos(t), np.sin(t)] for t in ts])
err = np.max(np.abs(xs - closed))
print(f"max error between series and closed form = {err:.2e}")

plt.plot(xs[:, 0], xs[:, 1])
plt.gca().set_aspect("equal")
plt.title(r"flow traced by $e^{tA} x_0$")
plt.show()

If the error falls to the order of $10^{-10}$ or below and the plot is a unit circle, the definition “$e^{tA}$ is the flow that acts on an initial vector for a time $t$” should now feel concrete.

To the next chapter

Chapter 1: Equations of motion rewrites Newton’s $\mathbf{F} = m\ddot{\mathbf{x}}$ in generalized coordinates $q^i$ and watches the Lagrangian emerge naturally. The index notation and the flow picture assembled in this chapter will serve as the working vocabulary for that derivation.