Embedding Circles
A case study in geometric dimension reduction
Notes from ongoing work with Fabian Lander.
In math, high-dimensional spaces are as accessible as low-dimensional ones, and so it’s quite common for things to end up naturally living in a high-dimensional home. But for humans they aren’t accessible at all — and even low-dimensional shapes that live in such worlds can be hard to understand.
So we naturally want to pluck such objects out of their high-dimensional homes and squish them down into the cramped three-dimensional world we live in. This isn’t always possible — many dimension-reduction techniques across data science aim only to capture a shadow of the real picture — but it’s natural to want to try.
For me this kind of thing comes up all the time, in particular right now in joint work with Aaron Abrams, Dave Bachmann, and Edmund Harriss on configuration spaces. The cases we care about there are spheres sitting in $\mathbb{R}^n$, and the geometric data we want to preserve includes both intrinsic and extrinsic information. Fabian Lander and I hope to write some software to accomplish this. But as always, I want to start simply, and really understand things. So these notes are about the simplest analog that has all the features: given a circle in $\mathbb{R}^n$, can we find a ‘best geometric approximating’ embedding into $\mathbb{R}^2$?
The geometry of a curve
A parametrized curve in $\mathbb{R}^n$ is a smooth immersion

$$\gamma \colon S^1 \to \mathbb{R}^n,$$

where $S^1$ is a fixed parameter circle. We do not require $\gamma$ to be injective: dropping injectivity costs us nothing (gradient flow can pass through self-intersections without incident), while dropping the immersion condition would cost us a lot, since the unit tangent, arclength derivative, and curvature data below all become singular at points where $\gamma'(t) = 0$.
Two parametrized curves $\gamma$ and $\gamma \circ \varphi$ related by an orientation-preserving diffeomorphism $\varphi$ of $S^1$ trace out the same path at different speeds; an unparametrized curve is the equivalence class $[\gamma]$. Within each class the arclength parametrization with $|\gamma'|$ constant is canonical up to the choice of basepoint and orientation.
The first fundamental form
Pulling the Euclidean inner product on $\mathbb{R}^n$ back along $\gamma$ gives a Riemannian metric on $S^1$,

$$g = \gamma^* \langle \cdot, \cdot \rangle = |\gamma'(t)|^2\, dt^2.$$

The function $v(t) = |\gamma'(t)|$ is the speed, the 1-form $ds = v\, dt$ is the arclength element, and

$$L(\gamma) = \int_{S^1} v(t)\, dt$$

is the total length. As a metric on the fixed parameter circle, the function $v$ is the geometric data attached to a parametrized curve.
For an unparametrized curve nothing is left of $g$ except $L$: any two metrics on $S^1$ with the same total length are isometric, so the only invariant of $[\gamma]$ visible to the first fundamental form is the single real number $L$. The intrinsic data of an unparametrized curve is one number; the intrinsic data of a parametrized curve is a function.
The second fundamental form
The intrinsic data sees $\gamma$ as an abstract Riemannian circle and is blind to how it sits inside $\mathbb{R}^n$. The extrinsic data captures the bending, and to define it cleanly we need to differentiate vector fields along $\gamma$.
A vector field along $\gamma$ is a smooth section of the pullback bundle $\gamma^* T\mathbb{R}^n$ — concretely, a smooth $\mathbb{R}^n$-valued function on $S^1$. The first such field is the velocity $\gamma'$, and normalizing gives the unit tangent

$$T = \frac{\gamma'}{|\gamma'|} = \frac{\gamma'}{v}.$$

The pullback of the flat Euclidean connection on $\mathbb{R}^n$ differentiates such a vector field componentwise; rescaling by $1/v$ converts $t$-derivatives into arclength derivatives,

$$\frac{D}{ds} = \frac{1}{v} \frac{d}{dt}.$$

The curvature vector is the arclength rate of change of the unit tangent,

$$\kappa = \frac{DT}{ds} = \frac{T'}{v}.$$

Differentiating $\langle T, T \rangle \equiv 1$ forces $\langle \kappa, T \rangle = 0$, so $\kappa$ takes values in the rank-$(n-1)$ normal bundle of $\gamma$. This is the second fundamental form of $\gamma$ in the standard sense: for an immersion of any source dimension, the second fundamental form is a symmetric normal-bundle-valued tensor; for a curve, the source has dimension one and $\kappa$ determines the whole tensor.
The vector field $\kappa$ along $\gamma$ contains all the information about how $\gamma$ bends in its ambient space. Together with $v$ it is enough to recover $\gamma$ up to translation: integrating $\gamma' = vT$ recovers $\gamma$ from $T$, and $T$ in turn is determined by $\kappa$ via $T' = v\kappa$ once a single value is fixed.
For an unparametrized curve the natural object is $\kappa$ as a function of arclength, well-defined up to the basepoint and orientation gauge.
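These definitions are easy to check numerically. The sketch below (NumPy; the helper names are our own, and the plane is identified with $\mathbb{C}$, as we do again in §5) computes $v$, $T$, and $\kappa$ for a curve sampled on a uniform parameter grid, differentiating spectrally, and confirms on a circle of radius $R$ that the speed is $R$ and the curvature vector has magnitude $1/R$ and points at the center.

```python
import numpy as np

def fourier_derivative(f):
    """d/dt of a periodic function sampled on a uniform grid over
    [0, 2*pi); exact for band-limited samples."""
    N = len(f)
    k = np.fft.fftfreq(N, d=1.0 / N)            # integer frequencies
    return np.fft.ifft(1j * k * np.fft.fft(f))

def curve_data(eta):
    """Speed v, unit tangent T, curvature vector kappa = T'/v of a
    sampled plane curve eta (complex samples)."""
    deta = fourier_derivative(eta)
    v = np.abs(deta)
    T = deta / v
    kappa = fourier_derivative(T) / v
    return v, T, kappa

# Circle of radius R: expect v = R and |kappa| = 1/R, pointing at the center.
N, R = 64, 2.0
t = 2 * np.pi * np.arange(N) / N
v, T, kappa = curve_data(R * np.exp(1j * t))
```

Spectral differentiation is exact here because the circle is a single Fourier mode; for a general smooth curve it is accurate to spectral precision.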
Complete invariants: Frenet–Serret
The curvature vector has its own orientation in $\mathbb{R}^n$ and is not invariant under rigid motion of the ambient space. For many purposes we want isometry-invariant extrinsic data — scalar functions built from $\kappa$ that are unchanged by translation and rotation of $\mathbb{R}^n$.
The simplest such scalar is $|\kappa|$. But this alone is not enough to determine $\gamma$ up to rigid motion in $\mathbb{R}^n$: a unit circle in a plane and an appropriately scaled circular helix are non-congruent space curves with the same constant $|\kappa|$. The classical resolution is the Frenet–Serret apparatus: take successive arclength derivatives of the unit tangent and Gram–Schmidt them into an orthonormal moving frame $(e_1, \dots, e_n)$ along $\gamma$, with $e_1 = T$. The structure equations of this frame,

$$\frac{D e_i}{ds} = -\kappa_{i-1}\, e_{i-1} + \kappa_i\, e_{i+1},$$

involve scalar functions $\kappa_1, \dots, \kappa_{n-1}$, the generalized curvatures. By construction $\kappa_1, \dots, \kappa_{n-2} > 0$ (Gram–Schmidt norms), and $\kappa_{n-1}$ is signed.
Fundamental theorem. A curve with $\kappa_1, \dots, \kappa_{n-2} > 0$ everywhere is determined up to rigid motion of $\mathbb{R}^n$ by the data $(v, \kappa_1, \dots, \kappa_{n-1})$ on $S^1$. For an unparametrized curve, the data $\kappa_1, \dots, \kappa_{n-1}$ as functions of arclength determines $[\gamma]$ up to rigid motion.
For $n = 2$ there is just $\kappa_1$, the signed scalar curvature; for $n = 3$, $\kappa_1$ is the curvature and $\kappa_2$ is the torsion.
So the extrinsic geometry of a curve admits two natural packagings: the curvature vector $\kappa$, which together with $v$ captures the bending up to translation, and the generalized curvatures $\kappa_1, \dots, \kappa_{n-1}$, which capture it up to full rigid motion at the cost of a more elaborate construction.
The space of curves
The geometric story of the previous section was about individual curves. To do optimization — to look for a curve with prescribed geometric data, by varying it continuously — we need to step up a level and treat the collection of all curves as a space we can do geometry on.
Two function spaces sit in the picture, both open subsets of $C^\infty(S^1, \mathbb{R}^n)$:

$$\mathrm{Emb}(S^1, \mathbb{R}^n) \subset \mathrm{Imm}(S^1, \mathbb{R}^n) \subset C^\infty(S^1, \mathbb{R}^n),$$

the space of embedded curves (immersions that are also injective) inside the space of immersed curves. Both inclusions are open: injectivity and the immersion condition are both stable under small perturbations. As open subsets of $C^\infty(S^1, \mathbb{R}^n)$, both inherit the structure of an infinite-dimensional Fréchet manifold, with tangent space at any $\gamma$ canonically identified with $C^\infty(S^1, \mathbb{R}^n)$ itself — vector fields along $\gamma$. We will work in $\mathrm{Imm}(S^1, \mathbb{R}^n)$.
A Fréchet space is a complete topological vector space whose topology is given by a countable family of seminorms; the standard example, and the only one we need, is $C^\infty(S^1, \mathbb{R}^n)$ with seminorms $\|f\|_k = \sup_t |f^{(k)}(t)|$ for $k \ge 0$. We need the Fréchet rather than the Banach setting because no single norm captures all derivatives of a smooth function — controlling smoothness requires the whole tower of norms at once. A Fréchet manifold is locally modeled on Fréchet spaces in the usual way (charts, smooth transition maps); the one fact we will use is that an open subset of a Fréchet space is itself a Fréchet manifold, with tangent space at every point canonically the ambient Fréchet space.
The orientation-preserving reparametrization group $\mathrm{Diff}^+(S^1)$ acts on both spaces by precomposition, $(\varphi, \gamma) \mapsto \gamma \circ \varphi$, and the orbits are precisely the unparametrized curves. The action is free on $\mathrm{Emb}$: an embedded curve is injective, so $\gamma \circ \varphi = \gamma$ forces $\varphi = \mathrm{id}$. On $\mathrm{Imm}$ it is almost free — multiply-covered curves have nontrivial stabilizer. For example, $\gamma(t) = e^{2it}$ wraps the unit circle twice, and the half-rotation $t \mapsto t + \pi$ sends $\gamma$ to itself. The stabilizer of a $k$-fold cover is the cyclic group generated by $t \mapsto t + 2\pi/k$.
Consequently $\mathrm{Emb} \to \mathrm{Emb}/\mathrm{Diff}^+(S^1)$ is a principal $\mathrm{Diff}^+(S^1)$-bundle, with the base a smooth (Fréchet) manifold of unparametrized embedded curves. For immersions, $\mathrm{Imm}/\mathrm{Diff}^+(S^1)$ is an infinite-dimensional orbifold, with orbifold points at the multiply-covered curves.
Upstairs or downstairs? The base is what we geometrically care about — each point an honest unparametrized curve. But the total space is a function space, and that is what gives us Fourier expansion, gradient flow, autodiff, and pointwise comparison of two curves. We work in $\mathrm{Imm}(S^1, \mathbb{R}^n)$ and carry the reparametrization gauge along as the price.
Tangent vectors. A tangent vector at $\gamma$ is the derivative of a one-parameter family of curves $\gamma_\epsilon$ passing through $\gamma$ at $\epsilon = 0$,

$$u = \frac{\partial}{\partial \epsilon}\Big|_{\epsilon = 0} \gamma_\epsilon.$$

This is a vector field along $\gamma$ — at each parameter value $t$, a vector $u(t) \in \mathbb{R}^n$ recording the infinitesimal displacement of the curve there. So the tangent space is $C^\infty(S^1, \mathbb{R}^n)$, the same Fréchet space the manifold is modeled on.
Inside it there is a natural splitting into tangential and normal parts,

$$u = u^\top + u^\perp,$$

where $u^\top$ is parallel to the curve at each point and $u^\perp$ is perpendicular. Tangential variations are infinitesimal reparametrizations — they slide the parametrization along the existing curve without changing its image — and so generate the gauge action of $\mathrm{Diff}^+(S^1)$ on $\mathrm{Imm}$. Normal variations are honest geometric deformations that move the curve in $\mathbb{R}^n$.
Riemannian metrics
A Riemannian metric on $\mathrm{Imm}(S^1, \mathbb{R}^n)$ is, as on any manifold, a smoothly-varying choice of inner product on each tangent space. Concretely it assigns to each $\gamma$ a positive-definite symmetric bilinear form

$$G_\gamma \colon T_\gamma \mathrm{Imm} \times T_\gamma \mathrm{Imm} \to \mathbb{R}.$$
There are several natural choices; we discuss two.
The $L^2(dt)$-metric
The most direct construction is to integrate the Euclidean inner product against the parameter measure,

$$\langle u, w \rangle_{L^2(dt)} = \int_{S^1} \langle u(t), w(t) \rangle\, dt.$$

The integrand depends on $u$ and $w$ but the integration weight does not depend on $\gamma$ — every tangent space carries the same inner product. So $\mathrm{Imm}$ with this metric is just an open subset of the Hilbert-like space $C^\infty(S^1, \mathbb{R}^n)$ with its $L^2$ inner product, and is in particular flat: the Levi-Civita connection is the trivial one, geodesics are straight lines in $C^\infty(S^1, \mathbb{R}^n)$, and tangent vectors at different points can be identified.
The $L^2(dt)$-metric is not reparametrization-invariant. Under $\gamma \mapsto \gamma \circ \varphi$, a tangent vector $u$ at $\gamma$ corresponds to $u \circ \varphi$ at $\gamma \circ \varphi$, and

$$\int_{S^1} \langle u \circ \varphi, w \circ \varphi \rangle\, dt \ne \int_{S^1} \langle u, w \rangle\, dt$$

in general. So this metric is honestly a metric on the space of parametrized curves, but does not descend to the quotient — it would assign different geometry to two reparametrizations of the same unparametrized curve.
The $L^2(ds)$-metric
A reparametrization-invariant alternative integrates against the arclength element,

$$\langle u, w \rangle_{L^2(ds)} = \int_{S^1} \langle u(t), w(t) \rangle\, v(t)\, dt.$$

The integration weight now depends on $\gamma$ through $v$, so $\mathrm{Imm}$ is no longer flat in this metric. Under $\gamma \mapsto \gamma \circ \varphi$, the arclength element also transforms, and the change-of-variables computation gives $\langle u \circ \varphi, w \circ \varphi \rangle_{\gamma \circ \varphi} = \langle u, w \rangle_\gamma$ — the metric is equivariant, and descends to a Riemannian metric on the quotient (away from orbifold points). This is the right choice if we want to do geometry on unparametrized curves: it sees only what’s intrinsic to the orbit.
Which one we use
Both metrics are honest Riemannian structures on $\mathrm{Imm}$, but they have different characters. The $L^2(ds)$-metric is the geometric choice — it lifts a metric from the orbit space and treats reparametrizations as the gauge they are. The $L^2(dt)$-metric is the function-space choice — it treats $\mathrm{Imm}$ as an open subset of a Hilbert space and benefits from being flat.
For our problem, the loss we’ll write down in §3 is itself parametrization-dependent: it pins down a parametrization correspondence between the target $\gamma$ and the candidate $\eta$. So we are not actually solving for an unparametrized curve; we are solving for a parametrized one, and the flat $L^2(dt)$-metric is the natural companion.
Derivatives
A smooth function $E \colon \mathrm{Imm}(S^1, \mathbb{R}^n) \to \mathbb{R}$ has, at each $\gamma$, a differential $dE_\gamma$ — a linear functional on the tangent space measuring the rate of change of $E$ along an infinitesimal variation. We work this out abstractly in this section, then specialize to gradients (which require a metric) and compute one example all the way through.
The differential
Fix $\gamma$ and $u \in T_\gamma \mathrm{Imm}$. The differential of $E$ at $\gamma$ is the linear map

$$dE_\gamma(u) = \frac{d}{d\epsilon}\Big|_{\epsilon = 0} E(\gamma_\epsilon), \qquad \gamma_0 = \gamma, \quad \partial_\epsilon\big|_0 \gamma_\epsilon = u.$$

The right-hand side does not depend on which family realizing $u$ we use; any $\gamma_\epsilon$ with $\gamma_0 = \gamma$ and $\partial_\epsilon|_0 \gamma_\epsilon = u$ gives the same value. The differential is defined without reference to any inner product — it is the rate of change of $E$ along an infinitesimal variation, full stop.
The same construction applied to vector- or tensor-valued functions on $\mathrm{Imm}$ produces vector- or tensor-valued 1-forms, and we use $d$ for all of these. So for example the speed $v$ has a differential $dv_\gamma$, computed by taking $\partial_\epsilon|_0$ of $v(\gamma_\epsilon)$.
Example: total length. The total length functional $L \colon \mathrm{Imm} \to \mathbb{R}$, $L(\gamma) = \int_{S^1} |\gamma'|\, dt$, is a smooth function on $\mathrm{Imm}$. To compute its differential at $\gamma$, evaluate on the family $\gamma_\epsilon = \gamma + \epsilon u$:

$$dL_\gamma(u) = \frac{d}{d\epsilon}\Big|_{\epsilon = 0} \int_{S^1} |\gamma' + \epsilon u'|\, dt.$$

The integration variable $t$ and the variation parameter $\epsilon$ are independent, so $\frac{d}{d\epsilon}$ pulls inside the integral as $\partial_\epsilon|_0$ on the integrand:

$$dL_\gamma(u) = \int_{S^1} \partial_\epsilon\Big|_0 |\gamma' + \epsilon u'|\, dt,$$

where the integrand is the pointwise differential of the norm. To compute it, square both sides of $w(\epsilon) = |\gamma' + \epsilon u'|$ and differentiate at $\epsilon = 0$: the LHS gives $2v\, \dot w(0)$, the RHS gives $2\langle \gamma', u' \rangle$, and so

$$\partial_\epsilon\Big|_0 |\gamma' + \epsilon u'| = \frac{\langle \gamma', u' \rangle}{v} = \langle T, u' \rangle.$$

Substituting back,

$$dL_\gamma(u) = \int_{S^1} \langle T, u' \rangle\, dt.$$

This is the differential of the length functional, evaluated on any tangent vector $u$.
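The formula $dL_\gamma(u) = \int_{S^1} \langle T, u' \rangle\, dt$ can be checked against a finite difference. A sketch under our numerical conventions (uniform grid, spectral differentiation, plane identified with $\mathbb{C}$; helper names ours):

```python
import numpy as np

def d(f):
    """Spectral d/dt on a uniform grid over [0, 2*pi)."""
    k = np.fft.fftfreq(len(f), d=1.0 / len(f))
    return np.fft.ifft(1j * k * np.fft.fft(f))

def length(eta):
    """L = integral of |eta'| dt by the (periodic) trapezoidal rule."""
    return 2 * np.pi * np.mean(np.abs(d(eta)))

N = 64
t = 2 * np.pi * np.arange(N) / N
eta = np.exp(1j * t) + 0.3 * np.exp(-2j * t)      # an immersed test curve
u = 0.7 * np.exp(1j * t) - 0.2j * np.exp(3j * t)  # a tangent vector

# dL(u) = ∫ <T, u'> dt, with <a, b> = Re(a * conj(b)) under R^2 = C
T = d(eta) / np.abs(d(eta))
dL = 2 * np.pi * np.mean(np.real(np.conj(T) * d(u)))

# central finite difference of eps -> L(eta + eps * u)
eps = 1e-6
dL_fd = (length(eta + eps * u) - length(eta - eps * u)) / (2 * eps)
```

Because the discrete formula is the exact derivative of the discrete length, the two numbers agree to finite-differencing accuracy.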
Gradients
A differential is a 1-form on $\mathrm{Imm}$ — a linear functional on the tangent space — and to turn it into a vector field, a tangent vector at each point pointing in the direction of steepest ascent, we need a metric. Given the $L^2(dt)$-metric, the gradient $\nabla E$ is the unique tangent vector at each $\gamma$ satisfying

$$\langle \nabla E_\gamma, u \rangle_{L^2(dt)} = dE_\gamma(u) \quad \text{for all } u.$$

The differential is metric-independent; the gradient is not. We work throughout with the $L^2(dt)$-metric, so “the gradient” means the $L^2(dt)$-gradient unambiguously.
Example: the gradient of total length. Continuing the example, under the $L^2(dt)$-metric the gradient is determined by $\int_{S^1} \langle \nabla L_\gamma, u \rangle\, dt = \int_{S^1} \langle T, u' \rangle\, dt$. Integration by parts on the parameter circle (no boundary terms, since $S^1$ is closed) gives

$$dL_\gamma(u) = -\int_{S^1} \langle T', u \rangle\, dt,$$

and reading off,

$$\nabla L_\gamma = -T' = -v\kappa.$$

The negative gradient is the curvature vector scaled by speed; the descending flow $\dot\gamma = -\nabla L = v\kappa$ shrinks the curve along its inward normal at a rate proportional to local speed and curvature. This is the parametrization-dependent version of curve-shortening flow: the canonical reparametrization-invariant CSF is $\dot\gamma = \kappa$, arising from the $L^2(ds)$-metric instead. The factor of $v$ here is the price of working in $L^2(dt)$.
Optimization
We are given a fixed $\gamma \in \mathrm{Imm}(S^1, \mathbb{R}^n)$ and want to find a planar curve $\eta \in \mathrm{Imm}(S^1, \mathbb{R}^2)$ whose geometric data — first and second fundamental form — matches that of $\gamma$ as closely as possible. To formulate this as a variational problem on $\mathrm{Imm}(S^1, \mathbb{R}^2)$ we need a loss functional that quantifies mismatch, and is small precisely when the data agree.
Geometric data on a curve is data on $S^1$, and most natural pieces of geometric data are themselves functions on $S^1$: the speed $v$ is a real-valued function, the curvature vector $\kappa$ is an $\mathbb{R}^n$-valued function, and so on. Comparing two curves’ geometric data therefore reduces to measuring distance between functions on $S^1$.
The natural distance on real-valued functions is the $L^2$ distance,

$$d(f, g) = \Big( \int_{S^1} (f - g)^2\, dt \Big)^{1/2},$$

and on $\mathbb{R}^d$-valued functions the same formula applies, with the inner product norm replacing the absolute value,

$$d(f, g) = \Big( \int_{S^1} |f - g|^2\, dt \Big)^{1/2}.$$

Given any function-valued geometric quantity $Q$ on curves, comparing $Q(\eta)$ to $Q(\gamma)$ via this distance gives a candidate loss. In practice we work with the squared distance,

$$E_Q(\eta) = d\big(Q(\eta), Q(\gamma)\big)^2 = \int_{S^1} \big| Q(\eta) - Q(\gamma) \big|^2\, dt,$$

because the square root is non-differentiable at zero — exactly the point we want to converge to. Squaring removes that singularity without changing the location of the minimum: $E_Q \ge 0$, with equality precisely when $Q(\eta) = Q(\gamma)$.
Geometric loss functions
The natural pieces of geometric data on a curve are its first and second fundamental forms, and we get a separate loss for each.
The intrinsic data is the metric, given pointwise by the speed $v$, a real-valued function on $S^1$. The corresponding loss is the squared $L^2$ distance between speeds,

$$E_{\mathrm{metric}}(\eta) = \int_{S^1} \big( v_\eta - v_\gamma \big)^2\, dt.$$

The extrinsic data is the curvature vector $\kappa$, but here we hit a wrinkle: $\kappa_\eta$ takes values in $\mathbb{R}^2$ while $\kappa_\gamma$ takes values in $\mathbb{R}^n$, and the two cannot be directly compared. The natural scalar to compare instead is the magnitude $|\kappa|$, which lives in $\mathbb{R}_{\ge 0}$ regardless of the ambient dimension — and which we recognize from §1 as the first generalized curvature $\kappa_1$ in the Frenet–Serret frame, the total bendiness of the curve at each point. In the planar case alone we’d have access to the signed scalar curvature, but since $|\kappa_\gamma|$ is intrinsically unsigned (a magnitude in $\mathbb{R}^n$), the comparison forces us to magnitudes on both sides; mirror-image solutions are then equivalent under the loss, which we accept as a feature of working ambient-blind. The corresponding loss is

$$E_{\mathrm{bend}}(\eta) = \int_{S^1} \big( |\kappa_\eta| - |\kappa_\gamma| \big)^2\, dt.$$

Each of these is a smooth function on $\mathrm{Imm}(S^1, \mathbb{R}^2)$ that we can use as a loss in its own right. To match both kinds of data simultaneously, we take a weighted sum,

$$E = \alpha\, E_{\mathrm{metric}} + \beta\, E_{\mathrm{bend}},$$

with $\alpha, \beta \ge 0$ trading off how strongly we insist on each kind of match.
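As a sketch of how these losses might look in code (names and conventions ours: curves sampled as `(N, d)` arrays, spectral derivatives), here are $E_{\mathrm{metric}}$ and $E_{\mathrm{bend}}$ evaluated for a target circle sitting in $\mathbb{R}^3$ against the same circle laid flat in $\mathbb{R}^2$ — both losses vanish, while an ellipse candidate scores poorly on both:

```python
import numpy as np

def d(f):
    """Componentwise spectral d/dt along axis 0 (grid on [0, 2*pi))."""
    N = f.shape[0]
    k = np.fft.fftfreq(N, d=1.0 / N)[:, None]
    return np.real(np.fft.ifft(1j * k * np.fft.fft(f, axis=0), axis=0))

def speed_and_bend(c):
    """Speed v and curvature magnitude |kappa| of an (N, dim) sampled curve."""
    dc = d(c)
    v = np.linalg.norm(dc, axis=1)
    kappa = d(dc / v[:, None]) / v[:, None]    # kappa = T'/v
    return v, np.linalg.norm(kappa, axis=1)

def losses(eta, gamma):
    """E_metric = ∫(v_eta - v_gamma)^2 dt,  E_bend = ∫(|k_eta| - |k_gamma|)^2 dt."""
    v_e, k_e = speed_and_bend(eta)
    v_g, k_g = speed_and_bend(gamma)
    w = 2 * np.pi / eta.shape[0]
    return w * np.sum((v_e - v_g) ** 2), w * np.sum((k_e - k_g) ** 2)

N = 64
t = 2 * np.pi * np.arange(N) / N
gamma = np.stack([np.cos(t), np.sin(t), np.zeros(N)], axis=1)  # circle in R^3
eta = np.stack([np.cos(t), np.sin(t)], axis=1)                 # same circle in R^2
E_metric, E_bend = losses(eta, gamma)

ellipse = np.stack([2 * np.cos(t), np.sin(t)], axis=1)         # a bad candidate
Em_ell, Eb_ell = losses(ellipse, gamma)
```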
Gradient flows
With a loss $E$ on $\mathrm{Imm}(S^1, \mathbb{R}^2)$ in hand, an optimization strategy is a recipe for moving through $\mathrm{Imm}(S^1, \mathbb{R}^2)$ in a way that decreases $E$. Each such strategy gives an ODE on $\mathrm{Imm}(S^1, \mathbb{R}^2)$ (or on its tangent bundle, for second-order schemes). Below we describe the two most natural ones — gradient flow and gradient flow with momentum — and then specialize to the metric-matching case where the gradient is short to compute by hand.
Gradient flow
On any Riemannian manifold $(M, G)$ with smooth $E \colon M \to \mathbb{R}$, the gradient flow of $E$ is the first-order ODE

$$\dot x = -\nabla E(x).$$

The energy decreases monotonically along solutions:

$$\frac{d}{dt} E(x(t)) = dE(\dot x) = -\|\nabla E\|^2 \le 0,$$

with equality exactly when $\nabla E = 0$, i.e., at critical points of $E$. Critical points are the equilibria of the flow.
A common refinement is gradient descent with momentum, the second-order ODE

$$\frac{D \dot x}{dt} = -\nabla E(x) - \mu\, \dot x,$$

where $\frac{D}{dt}$ is the covariant derivative along the time-parametrized curve associated to the Levi-Civita connection of $G$, and $\mu > 0$ is a damping coefficient. Geometrically, covariant acceleration equals force minus damping. The damping term dissipates energy and drives trajectories toward critical points of $E$.
In our setting $M = \mathrm{Imm}(S^1, \mathbb{R}^2)$ and $G$ is the $L^2(dt)$-metric. The space is flat with respect to it — an open subset of the Hilbert space — so the covariant derivative is the ordinary time derivative, and the two flows read

$$\dot\eta = -\nabla E(\eta), \qquad \ddot\eta = -\nabla E(\eta) - \mu\, \dot\eta.$$

This is one practical advantage of working in the $L^2(dt)$-metric — the momentum equation has no covariant-derivative subtleties to compute around.
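Since both flows live in a flat space, they discretize in time with no geometric bookkeeping. A minimal sketch (our own helper names; forward Euler for the gradient flow, semi-implicit Euler for the momentum flow), exercised on the toy quadratic energy $E(x) = |x|^2/2$, whose flows should converge to the critical point at the origin:

```python
import numpy as np

def gradient_flow(x, grad, h, steps):
    """x' = -grad E(x), forward Euler."""
    for _ in range(steps):
        x = x - h * grad(x)
    return x

def momentum_flow(x, grad, h, steps, mu=1.0):
    """x'' = -grad E(x) - mu x', semi-implicit Euler on (x, p)."""
    p = np.zeros_like(x)
    for _ in range(steps):
        p = p + h * (-grad(x) - mu * p)
        x = x + h * p
    return x

grad = lambda x: x              # E(x) = |x|^2 / 2, minimum at 0
x0 = np.array([1.0, -2.0])
xg = gradient_flow(x0, grad, 0.1, 200)
xm = momentum_flow(x0, grad, 0.1, 600, mu=1.0)
```

The same two drivers apply verbatim once `x` is a vector of polygon vertices (§4) or Fourier coefficients (§5) and `grad` is the corresponding loss gradient.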
Specializing to the metric
Let’s work this out concretely for the case that we want to match the intrinsic metric. The loss is then $E(\eta) = \int_{S^1} (v - \bar v)^2\, dt$, writing $v = v_\eta$ for the candidate’s speed and $\bar v = v_\gamma$ for the target’s, and to run either optimization strategy we need the gradient.
Differentiating at $\eta$ — the integration variable and the variation parameter are independent, so $\partial_\epsilon$ pulls inside the integral —

$$dE_\eta(u) = \int_{S^1} 2\,(v - \bar v)\; dv_\eta(u)\, dt.$$

Substituting $dv_\eta(u) = \langle T, u' \rangle$ from §2 and writing the integrand using the $\mathbb{R}^2$-valued function $f = 2(v - \bar v)\, T$,

$$dE_\eta(u) = \int_{S^1} \langle f, u' \rangle\, dt.$$

Integration by parts on $S^1$ has no boundary terms, so

$$dE_\eta(u) = -\int_{S^1} \langle f', u \rangle\, dt,$$

and reading off the integrand against $u$,

$$\nabla E_\eta = -f' = -2\,\big[ (v - \bar v)\, T \big]'.$$

The gradient flow is then

$$\dot\eta = -\nabla E_\eta = 2\,\big[ (v - \bar v)\, T \big]',$$

and expanding the derivative using $T' = v\kappa$,

$$\dot\eta = 2\,(v - \bar v)'\, T + 2\, v\, (v - \bar v)\, \kappa.$$

Two pieces: a tangential flow proportional to the $t$-derivative of the speed mismatch, redistributing arclength along the curve; and a normal flow proportional to the speed mismatch itself, in the direction of the curvature vector — the curve bends inward where it is too long and outward where it is too short.
To see the analytic shape of this equation, substitute $T = \eta'/v$ and $\kappa = T'/v$. After collecting terms,

$$\dot\eta = \frac{2\,(v - \bar v)}{v}\, \eta'' + 2\Big( \frac{v - \bar v}{v} \Big)'\, \eta', \qquad v = |\eta'|.$$

Now the gradient flow is exposed for what it is: a PDE in two independent variables, time and parameter $t$. The dot is $\partial_\tau$, applied to the time-dependent curve $\eta(\tau, \cdot)$ as a vector-valued function of $t$; the primes are $\partial_t$, applied to $\eta$ at fixed time. The momentum flow adds a $\ddot\eta$ term to the left-hand side, making it a wave-like equation rather than a heat-like one.
These are infinite-dimensional partial differential equations on $S^1$, and we have no hope of solving them in closed form even in this metric-only special case — and the situation is strictly worse with the curvature term, where the gradient is fourth-order in $\eta$. To make progress we replace $\mathrm{Imm}(S^1, \mathbb{R}^2)$ by a finite-dimensional approximation and solve the corresponding finite-dimensional flow there. Two natural approximations follow: a polygonal mesh in §4, and a Fourier truncation in §5.
Discretization
To compute anything we need to replace the infinite-dimensional configuration space with a finite-dimensional one. The most direct way is to sample the parameter circle: pick $N$ points evenly spaced in $S^1$, and represent a curve by the polygon through their images.
The space of polygons
Fix $N$ and the parameter values $t_i = 2\pi i / N$ for $i = 1, \dots, N$ (indices mod $N$). The space of $N$-gons in $\mathbb{R}^2$ is the open subset

$$\mathcal{P}_N = \big\{ (p_1, \dots, p_N) \in (\mathbb{R}^2)^N : p_{i+1} \ne p_i \text{ for all } i \big\},$$

the discrete analog of the immersion condition (consecutive vertices distinct, so each edge has nonzero length). $\mathcal{P}_N$ is a finite-dimensional manifold of dimension $2N$, with tangent space at any $P$ equal to $(\mathbb{R}^2)^N$: a tangent vector is a tuple $(u_1, \dots, u_N)$ of variations of each vertex.
A polygon is not a smooth curve. Viewed as a parametrized map — say, the piecewise-linear interpolation of its vertices — its derivative is piecewise constant with jumps at vertices, so the polygon is not $C^\infty$ and hence not in $\mathrm{Imm}(S^1, \mathbb{R}^2)$. The space $\mathcal{P}_N$ is therefore not a submanifold of $\mathrm{Imm}(S^1, \mathbb{R}^2)$. It is a separate finite-dimensional manifold, related to $\mathrm{Imm}$ via the sampling map below — not contained in it.
The sampling map is the smooth surjection

$$\Sigma \colon \mathrm{Imm}(S^1, \mathbb{R}^2) \to \mathcal{P}_N, \qquad \Sigma(\eta) = \big( \eta(t_1), \dots, \eta(t_N) \big).$$

Its differential at $\eta$ sends a tangent vector $u$ to the tuple of evaluations $\big( u(t_1), \dots, u(t_N) \big)$ — discrete tangent vectors are simply samples of continuous ones.
Discrete first and second fundamental forms
Every geometric quantity we will need on a polygon is a smooth function of the vertex positions $p_i$. The discrete analogs of the geometric data from §1 — edge length, unit tangent, and curvature vector — are

$$\ell_i = |p_{i+1} - p_i|, \qquad T_i = \frac{p_{i+1} - p_i}{\ell_i}, \qquad \kappa_i = \frac{T_i - T_{i-1}}{(\ell_{i-1} + \ell_i)/2}.$$

The tuple $(\ell_i)$ is the discrete first fundamental form, encoding the polygon’s intrinsic metric one edge at a time. $\kappa_i$ is the discrete curvature vector, the finite-difference analog of $\kappa = T'/v$ centered at vertex $i$, and is the discrete extrinsic data; the denominator $(\ell_{i-1} + \ell_i)/2$ is the discrete analog of the arclength increment at the vertex, appearing here because curvature is rate-of-change-of-tangent per unit arclength. These are the building blocks of the geometric loss functions we discretize next.
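In code the discrete data is a few lines of NumPy (helper names ours). A useful sanity check: on a regular $N$-gon of circumradius $R$, every edge has length $2R\sin(\pi/N)$, and the discrete curvature vector has magnitude exactly $1/R$, because $|T_i - T_{i-1}| = 2\sin(\pi/N)$ cancels the edge length in the denominator.

```python
import numpy as np

def polygon_data(P):
    """Edge lengths, unit tangents, and vertex curvature vectors of a
    closed polygon P, an array of shape (N, 2); indices mod N."""
    e = np.roll(P, -1, axis=0) - P            # edge i: p_{i+1} - p_i
    ell = np.linalg.norm(e, axis=1)           # discrete first fundamental form
    T = e / ell[:, None]                      # unit tangent of edge i
    dT = T - np.roll(T, 1, axis=0)            # T_i - T_{i-1} at vertex i
    ds = 0.5 * (ell + np.roll(ell, 1))        # (ell_{i-1} + ell_i) / 2
    kappa = dT / ds[:, None]                  # discrete curvature vector
    return ell, T, kappa

# Regular N-gon of circumradius R.
N, R = 12, 3.0
t = 2 * np.pi * np.arange(N) / N
P = R * np.stack([np.cos(t), np.sin(t)], axis=1)
ell, T, kappa = polygon_data(P)
```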
Discrete metric, loss, and gradient
With sample points evenly spaced in $S^1$, the parameter spacing is $\Delta t = 2\pi / N$, and the discrete analog of $\int_{S^1} (\cdot)\, dt$ is the Riemann sum $\Delta t \sum_i (\cdot)$. So the continuous $L^2(dt)$-metric becomes, under sampling,

$$\langle u, w \rangle = \frac{2\pi}{N} \sum_{i=1}^N \langle u_i, w_i \rangle,$$

the standard Euclidean inner product on $(\mathbb{R}^2)^N$ up to overall scale. As in the continuous case, this metric is independent of the base point $P$, so $\mathcal{P}_N$ is flat in it.
The real degrees of freedom of $\mathcal{P}_N$ are the vertex positions $p_i$. Any function on $\mathcal{P}_N$ — including any energy we want to optimize — is ultimately a function of those positions, and any gradient is a derivative with respect to them. The discrete geometric data introduced in §4.2 (lengths $\ell_i$, tangents $T_i$, curvature vectors $\kappa_i$) are intermediate quantities, useful for writing geometric energies in a way that matches the continuous picture, but the differentiation always reduces to the chain rule applied to the vertex coordinates.
This guides us at every step. Rather than discretize the continuous gradient term-by-term — which would force decisions about how to evaluate edge-based quantities like $T_i$ at vertices, and would need a separate convergence analysis — we discretize the energy and then differentiate it directly.
For our problem the discrete loss is

$$E(P) = \alpha \sum_{i=1}^N \big( \ell_i - \bar\ell_i \big)^2 + \beta \sum_{i=1}^N \big( |\kappa_i| - \bar k_i \big)^2,$$

where $\ell_i, \kappa_i$ are computed from the candidate $P$ and $\bar\ell_i, \bar k_i$ from the target $\gamma$. This matches the continuous loss in §3 up to an overall factor from the $\Delta t$ scaling, which we absorb into the weights $\alpha, \beta$. Both $\ell_i$ and $\kappa_i$ are smooth functions of vertex positions on the open subset where edges have nonzero length, so $E$ is smooth on $\mathcal{P}_N$.
The differential and the gradient are now finite-dimensional objects: $\nabla E$ is determined by the standard inner product via $\langle \nabla E, u \rangle = dE_P(u)$, and computing it amounts to taking partial derivatives with respect to vertex coordinates. The gradient flow ODE

$$\dot p_i = -\nabla_{p_i} E(P), \qquad i = 1, \dots, N,$$

is an ordinary system of $2N$ coupled ODEs in the vertex coordinates.
Specializing to springs
Look just at the loss function for the first fundamental form, meaning that we are trying to get the intrinsic metric to match our parametrized curve $\gamma$:

$$E(P) = \sum_{i=1}^N \big( \ell_i - \bar\ell_i \big)^2.$$

This is quadratic in the edge lengths of our polygonal approximation, minus the corresponding “true” edge lengths $\bar\ell_i$, coming from $\gamma$. Such quadratic energy functions are exactly the spring potentials of classical physics, so this represents a cycle of springs with rest lengths $\bar\ell_i$ and spring constant $2$. Thus, the simplest finite-dimensional version of the matching problem is a spring system: build a polygon whose edges are springs with prescribed equilibrium lengths inherited from the target curve, and let it relax.
To compute the gradient flow, start with $E(P) = \sum_i (\ell_i - \bar\ell_i)^2$. Differentiating with respect to a vertex variation $u = (u_1, \dots, u_N)$, using $d\ell_i(u) = \langle T_i, u_{i+1} - u_i \rangle$,

$$dE_P(u) = \sum_i 2\,(\ell_i - \bar\ell_i)\, \big\langle T_i,\; u_{i+1} - u_i \big\rangle.$$

Reorganize the sum so each $u_i$ collects its coefficient,

$$dE_P(u) = \sum_i \big\langle\, 2\,(\ell_{i-1} - \bar\ell_{i-1})\, T_{i-1} - 2\,(\ell_i - \bar\ell_i)\, T_i,\; u_i \,\big\rangle.$$

Reading this off against $u_i$ (absorbing the metric’s overall factor of $2\pi/N$ into the time scale),

$$\nabla_{p_i} E = 2\,(\ell_{i-1} - \bar\ell_{i-1})\, T_{i-1} - 2\,(\ell_i - \bar\ell_i)\, T_i.$$

The gradient flow is then

$$\dot p_i = 2\,(\ell_i - \bar\ell_i)\, T_i - 2\,(\ell_{i-1} - \bar\ell_{i-1})\, T_{i-1},$$

which up to the overall constant is exactly Newton’s law for a chain of Hookean springs: each vertex is pulled by two forces, one from each adjacent edge, with a stretched edge ($\ell_i > \bar\ell_i$) pulling its endpoints together along the tangent and a compressed edge pushing them apart.
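A sketch of the spring gradient in code (names ours), checked two ways: against a finite difference of the energy in a chosen direction, and by confirming that a small descent step decreases the energy.

```python
import numpy as np

def spring_energy(P, rest):
    e = np.roll(P, -1, axis=0) - P
    return np.sum((np.linalg.norm(e, axis=1) - rest) ** 2)

def spring_grad(P, rest):
    """grad_{p_i} E = 2(l_{i-1} - rest_{i-1}) T_{i-1} - 2(l_i - rest_i) T_i."""
    e = np.roll(P, -1, axis=0) - P
    ell = np.linalg.norm(e, axis=1)
    T = e / ell[:, None]
    f = 2 * (ell - rest)[:, None] * T          # per-edge force along T_i
    return np.roll(f, 1, axis=0) - f

N = 8
t = 2 * np.pi * np.arange(N) / N
P = np.stack([1.3 * np.cos(t), 0.8 * np.sin(t)], axis=1)  # squashed octagon
rest = np.ones(N)                                         # target edge lengths

# finite-difference check of the gradient in a chosen direction
U = np.stack([np.sin(2 * t), np.cos(t)], axis=1)
eps = 1e-6
fd = (spring_energy(P + eps * U, rest) - spring_energy(P - eps * U, rest)) / (2 * eps)
dE = np.sum(spring_grad(P, rest) * U)
```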
Specializing to the bending energy
We still need to do the analogous computation for $E_{\mathrm{bend}}$: write the bending loss as a function of vertex positions, compute its gradient, and write down the resulting flow. TODO. Each $\kappa_i$ depends on $(p_{i-1}, p_i, p_{i+1})$, so the gradient at vertex $i$ has contributions from $\kappa_{i-1}, \kappa_i, \kappa_{i+1}$ — three-term sparsity instead of the bidiagonal sparsity of the spring case.
Finite-dimensionalization by Fourier
Discretization replaced $\mathrm{Imm}(S^1, \mathbb{R}^2)$ by a separate finite-dimensional manifold whose elements were polygons — not in $\mathrm{Imm}(S^1, \mathbb{R}^2)$ at all. A different approach is to keep working with smooth curves, but restrict to those of bounded frequency. This gives a finite-dimensional submanifold of $\mathrm{Imm}(S^1, \mathbb{R}^2)$.
The Fourier submanifold
Identify $\mathbb{R}^2 \cong \mathbb{C}$ via $(x, y) \mapsto x + iy$, so a planar curve becomes a $\mathbb{C}$-valued function on $S^1$ with Fourier expansion

$$\eta(t) = \sum_{k=-\infty}^{\infty} c_k\, e^{ikt}, \qquad c_k \in \mathbb{C}.$$

There is no reality condition on the $c_k$ because $\eta$ takes values in $\mathbb{C}$, not $\mathbb{R}$.
Fix $K \ge 1$ and a translation gauge $c_0 = 0$ (centering the curve at the origin; the loss is translation-invariant, so $c_0$ is a flat direction we lose nothing by fixing). The Fourier truncation at level $K$ is

$$\mathcal{F}_K = \Big\{\, \eta(t) = \sum_{0 < |k| \le K} c_k\, e^{ikt} \,\Big\} \cap \mathrm{Imm}(S^1, \mathbb{R}^2).$$

The coefficient tuple parametrizes $\mathcal{F}_K$, so $\mathcal{F}_K$ is a $2K$-complex-dimensional ($4K$-real-dimensional) manifold. We will use these coefficients as our coordinates throughout: a point of $\mathcal{F}_K$ is the tuple

$$c = (c_{-K}, \dots, c_{-1}, c_1, \dots, c_K),$$

which we abbreviate $(c_k)$ when the index range is clear, and every quantity we compute — the curve $\eta$, its derivatives, the loss, its gradient — will be expressed as a function of $c$.
Unlike $\mathcal{P}_N$, every element of $\mathcal{F}_K$ is genuinely a smooth curve, so the inclusion

$$\mathcal{F}_K \hookrightarrow \mathrm{Imm}(S^1, \mathbb{R}^2)$$

is honest. It is an embedding wherever $\eta'(t) \ne 0$ for all $t$ — the immersion condition.
The tangent space at any $\eta \in \mathcal{F}_K$ consists of variations within $\mathcal{F}_K$, i.e., Fourier polynomials of the same form,

$$u(t) = \sum_{0 < |k| \le K} a_k\, e^{ikt}, \qquad a_k \in \mathbb{C}.$$
Geometric data in Fourier coordinates
Differentiation in $t$ is diagonal in Fourier coordinates: each mode $e^{ikt}$ is an eigenfunction of $\frac{d}{dt}$ with eigenvalue $ik$. So

$$\eta'(t) = \sum_{0 < |k| \le K} ik\, c_k\, e^{ikt}, \qquad \eta''(t) = \sum_{0 < |k| \le K} -k^2\, c_k\, e^{ikt}.$$

In our coordinates, $\frac{d}{dt}$ is therefore the diagonal linear map on $\mathbb{C}^{2K}$

$$c_k \longmapsto ik\, c_k,$$

and $\frac{d^2}{dt^2}$ is the diagonal map $c_k \mapsto -k^2 c_k$. So $\eta$, $\eta'$, and $\eta''$ are all linear functions of the coordinate vector $c$.
The geometric data we ultimately want — the speed $v$ and the curvature vector $\kappa$ — are not linear or even polynomial in the $c_k$, because they involve square roots and quotients. For instance,

$$v(t)^2 = |\eta'(t)|^2 = \Big|\, \sum_{0 < |k| \le K} ik\, c_k\, e^{ikt} \,\Big|^2$$

is a Fourier polynomial in $t$ (a finite trigonometric sum) and a polynomial of degree 2 in the $c_k$ when expanded — but $v$ itself, the square root, is neither. Curvature is worse: $\kappa = T'/v$ has $v$-dependent denominators throughout.
Metric, loss, and gradient
Given a curve $\eta \in \mathcal{F}_K$ and two tangent vectors — themselves Fourier polynomials,

$$u(t) = \sum_{0 < |k| \le K} a_k\, e^{ikt}, \qquad w(t) = \sum_{0 < |k| \le K} b_k\, e^{ikt}$$

— we want to evaluate the $L^2(dt)$-metric

$$\langle u, w \rangle = \int_{S^1} \langle u(t), w(t) \rangle_{\mathbb{R}^2}\, dt.$$

Identifying $\mathbb{R}^2 \cong \mathbb{C}$, the inner product becomes $\langle u, w \rangle_{\mathbb{R}^2} = \operatorname{Re}\big( u\, \bar w \big)$. Substituting the Fourier expansions and expanding,

$$\langle u, w \rangle = \int_0^{2\pi} \operatorname{Re} \sum_{k, l} a_k\, \bar b_l\, e^{i(k - l)t}\, dt.$$

Integrating over $t$ and using the orthogonality relation $\int_0^{2\pi} e^{i(k - l)t}\, dt = 2\pi\, \delta_{kl}$, only the diagonal terms survive:

$$\langle u, w \rangle = 2\pi \sum_{0 < |k| \le K} \operatorname{Re}\big( a_k\, \bar b_k \big).$$

Two things to read off this formula. First, the right-hand side does not depend on $\eta$ at all — the metric is the same on every tangent space, so $\mathcal{F}_K$ is flat in $L^2(dt)$, just as $\mathrm{Imm}$ was. Second, up to the overall factor $2\pi$, this is the standard Euclidean inner product on $\mathbb{C}^{2K} \cong \mathbb{R}^{4K}$: writing $a_k = x_k + i y_k$ and $b_k = x_k' + i y_k'$ in real coordinates, $\operatorname{Re}(a_k \bar b_k) = x_k x_k' + y_k y_k'$, and the sum becomes the dot product on $\mathbb{R}^{4K}$. Fourier coordinates put a curve in $\mathcal{F}_K$ on the same footing as a point in Euclidean space.
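The orthogonality computation is easy to confirm numerically: evaluate two Fourier polynomials on a grid and compare the quadrature value of $\int \operatorname{Re}(u \bar w)\, dt$ with $2\pi \sum_k \operatorname{Re}(a_k \bar b_k)$. (Sketch; representing coefficient tuples as dictionaries is our own convenience.)

```python
import numpy as np

def eval_poly(coeffs, t):
    """Evaluate sum_k c_k e^{ikt}; coeffs maps k -> c_k."""
    return sum(c * np.exp(1j * k * t) for k, c in coeffs.items())

a = {-2: 0.3 + 0.1j, 1: 1.0 - 0.5j, 3: 0.2j}
b = {-2: 1.0j, 1: 0.4 + 0j, 2: -0.7 + 0j}

# quadrature side: ∫ Re(u conj(w)) dt on a fine uniform grid
t = 2 * np.pi * np.arange(256) / 256
quad = 2 * np.pi * np.mean(np.real(eval_poly(a, t) * np.conj(eval_poly(b, t))))

# coefficient side: 2π Σ_k Re(a_k conj(b_k)), over shared modes only
coef = 2 * np.pi * sum((a[k] * np.conj(b[k])).real for k in set(a) & set(b))
```

The grid sum is exact here, not just approximate, because the product of two degree-3 polynomials is band-limited far below 256 modes — the observation the next subsection runs on.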
The loss $E$ from §3 restricts to a smooth function on $\mathcal{F}_K$, expressible as a function of the coordinates $c$. Writing $c_k = x_k + i y_k$, its differential at $c$ is the linear map $dE_c \colon \mathbb{R}^{4K} \to \mathbb{R}$,

$$dE_c(a) = \sum_{0 < |k| \le K} \Big( \frac{\partial E}{\partial x_k}\, \operatorname{Re}(a_k) + \frac{\partial E}{\partial y_k}\, \operatorname{Im}(a_k) \Big),$$

and the gradient is determined by $\langle \nabla E, a \rangle = dE_c(a)$. Because the metric is Euclidean up to the factor $2\pi$, the gradient is the ordinary partial-derivative vector, divided by $2\pi$:

$$\nabla E = \frac{1}{2\pi} \Big( \frac{\partial E}{\partial x_k} + i\, \frac{\partial E}{\partial y_k} \Big)_{0 < |k| \le K}.$$

The gradient flow is then an ordinary system of $4K$ coupled real ODEs in the coefficients,

$$\dot c_k = -\frac{1}{2\pi} \Big( \frac{\partial E}{\partial x_k} + i\, \frac{\partial E}{\partial y_k} \Big),$$

which we can run as soon as we know how to compute the partials of $E$ with respect to the real and imaginary parts of $c_k$.
Evaluating the loss numerically
Any loss we work with on $\mathcal{F}_K$ has the form $E = \int_{S^1} F(t)\, dt$, where the integrand $F$ is built pointwise from $\eta$ and its derivatives at $t$. For instance, the metric loss has $F = (v - \bar v)^2$, and the bending loss has $F = (|\kappa| - \bar k)^2$.
We approximate the integral by the trapezoidal rule on $M$ uniform sample points $t_j = 2\pi j / M$:

$$\int_{S^1} F(t)\, dt \approx \frac{2\pi}{M} \sum_{j=0}^{M-1} F(t_j).$$

The trapezoidal rule is exceptionally accurate for smooth periodic integrands — its error decays faster than any polynomial in $1/M$, so a modest $M$ gives a very accurate integral. The reason is Fourier-analytic: the trapezoidal rule with $M$ uniform points integrates each basis function $e^{ikt}$ exactly for $|k| < M$, so the only source of error comes from Fourier modes with $|k| \ge M$, and these decay rapidly for smooth $F$.
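The spectral accuracy shows up in a tiny experiment (the smooth, periodic, not-band-limited test integrand $e^{\cos t}$ is our choice):

```python
import numpy as np

def trap(F, M):
    """(2*pi/M) * sum F(t_j) on the uniform grid t_j = 2*pi*j/M."""
    t = 2 * np.pi * np.arange(M) / M
    return 2 * np.pi * np.mean(F(t))

F = lambda t: np.exp(np.cos(t))     # smooth and periodic, not band-limited
ref = trap(F, 512)                  # effectively exact reference
errs = [abs(trap(F, M) - ref) for M in (4, 8, 16, 32)]
```

Doubling $M$ does not merely halve the error — it wipes out orders of magnitude, because the Fourier coefficients of $e^{\cos t}$ decay factorially.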
We choose $M$ so that no aliasing occurs in $\eta$, $\eta'$, $\eta''$ — these are Fourier polynomials of degree $K$, and $M \ge 2K + 1$ uniform samples suffice to determine each of them exactly. In practice we take $M$ a bit larger, to better resolve the nonlinear combinations the integrand involves.
The procedure for evaluating $E$ at a given point $c \in \mathcal{F}_K$ is then:
- Compute the coefficient vectors $(c_k)$, $(ik\, c_k)$, $(-k^2 c_k)$ for $\eta$, $\eta'$, $\eta''$.
- Evaluate $\eta(t_j) = \sum_k c_k\, e^{ikt_j}$, and similarly for $\eta'$, $\eta''$, at the sample points.
- Compute the integrand $F(t_j)$ from these values — this depends on the specific loss.
- Sum: $E \approx \frac{2\pi}{M} \sum_j F(t_j)$.
The gradient is computed by chain rule (or automatic differentiation) through the same pipeline.
Specializing to the metric
For the metric-only case ($\beta = 0$), the loss is

$$E(c) = \int_{S^1} \big( v(t) - \bar v(t) \big)^2\, dt,$$

with $v = |\eta'|$.
Following the procedure above, evaluation goes:
- Compute $\eta'(t_j) = \sum_k ik\, c_k\, e^{ikt_j}$ at each sample point.
- Compute $v_j = |\eta'(t_j)|$.
- Subtract the precomputed target and square: $F_j = (v_j - \bar v_j)^2$.
- Sum: $E \approx \frac{2\pi}{M} \sum_j F_j$.
The gradient flow is the ODE in $c$

$$\dot c = -\nabla E(c),$$

which we integrate with any standard ODE method (forward Euler, RK4, etc.).
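Putting the pipeline together, here is a minimal end-to-end sketch (all names ours; gradients by central finite differences, standing in for autodiff): $K = 2$, target speed $\bar v \equiv 1$ (the unit circle’s), initial curve an ellipse. Gradient descent should drive the metric loss toward zero.

```python
import numpy as np

K, M = 2, 32
ks = np.array([-2, -1, 1, 2])
t = 2 * np.pi * np.arange(M) / M
modes = np.exp(1j * np.outer(t, ks))          # (M, 2K) evaluation matrix

def loss(x, vbar=1.0):
    """Metric loss E = ∫ (v - vbar)^2 dt; x holds Re c_k then Im c_k."""
    c = x[:4] + 1j * x[4:]
    v = np.abs(modes @ (1j * ks * c))         # speed at the sample points
    return 2 * np.pi * np.mean((v - vbar) ** 2)

def grad(x, eps=1e-6):
    """Central finite differences (autodiff would do the same job)."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (loss(x + e) - loss(x - e)) / (2 * eps)
    return g

x = np.zeros(8)
x[2] = 1.0                     # c_1 = 1: the unit circle ...
x[1] = 0.3                     # ... plus c_{-1} = 0.3: squashed to an ellipse
E0 = loss(x)
for _ in range(600):           # plain gradient descent
    x = x - 0.02 * grad(x)
E1 = loss(x)
```

The step size and iteration count here are hand-tuned for this toy instance, not recommendations.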
A polynomial alternative
There is one reformulation of the metric loss worth flagging, because it eliminates the quadrature error entirely. The square root is the only thing keeping $v$ from being a polynomial in the $c_k$. If we replace $E$ with

$$\tilde E(c) = \int_{S^1} \big( v^2 - \bar v^2 \big)^2\, dt,$$

then since $v$ and $\bar v$ are nonnegative, $v^2 = \bar v^2$ if and only if $v = \bar v$, so the minimum sets agree. And $v^2$ is a polynomial of degree 2 in $c$, so the integrand is a polynomial of degree 4 in the coefficients, and the integral can be computed in closed form by Parseval — no quadrature, no chain rule through pointwise nonlinearities.
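The degree count also gives an exact quadrature: the integrand of $\tilde E$ is a trigonometric polynomial of degree at most $4K$, so the $M$-point trapezoidal rule is already exact — not approximate — once $M \ge 4K + 1$. A quick check (random coefficients; names ours):

```python
import numpy as np

K = 3
ks = np.concatenate([np.arange(-K, 0), np.arange(1, K + 1)])
rng = np.random.default_rng(1)
c = rng.normal(size=2 * K) + 1j * rng.normal(size=2 * K)

def E_tilde(M, vbar2=1.0):
    """∫ (|eta'|^2 - vbar^2)^2 dt by the M-point trapezoidal rule."""
    t = 2 * np.pi * np.arange(M) / M
    deta = np.exp(1j * np.outer(t, ks)) @ (1j * ks * c)
    return 2 * np.pi * np.mean((np.abs(deta) ** 2 - vbar2) ** 2)

coarse = E_tilde(4 * K + 1)     # minimal exact rule for a degree-4K integrand
fine = E_tilde(1024)            # heavily oversampled reference
```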
This is a different loss from $E$, with different gradient flow dynamics, but the same minima. Whether to prefer it over $E$ depends on the application.
Specializing to the bending energy
We still need to do the analogous computation for the bending loss in Fourier coordinates: write the integrand in terms of , evaluate it at the sample points via the procedure above, and compute the gradient through the chain rule. TODO. Unlike the metric case, there is no clean polynomial alternative — has -dependent denominators that no analog of removes — so quadrature is the only path here.