Embedding Circles

A case study in geometric dimension reduction

Notes from ongoing work with Fabian Lander.

In math, high-dimensional spaces are as accessible as low-dimensional ones, and so it’s quite common for things to end up naturally living in a high-dimensional home. But for humans they aren’t accessible at all — and even low-dimensional shapes that live in such worlds can be hard to understand.

So we naturally want to pluck such objects out of their high-dimensional homes and squish them down into the cramped three-dimensional world we live in. This isn’t always possible — many dimension-reduction techniques across data science aim only to capture a shadow of the real picture — but it’s natural to want to try.

For me this kind of thing comes up all the time, in particular right now in joint work with Aaron Abrams, Dave Bachmann, and Edmund Harriss on configuration spaces. The cases we care about there are spheres sitting in $\mathbb R^6$, and the geometric data we want to preserve includes both intrinsic and extrinsic information. Fabian Lander and I hope to write some software to accomplish this. But as always, I want to start simply, and really understand things. So these notes are about the simplest analog that has all the features: given a circle in $\mathbb R^3$, can we find a ‘best geometrically approximating’ embedding into $\mathbb R^2$?

The geometry of a curve

A parametrized curve in $\mathbb R^n$ is a smooth immersion

$$\gamma \colon S^1 \;\to\; \mathbb R^n, \qquad |\gamma'(\theta)| > 0,$$

where $S^1 = \mathbb R / 2\pi \mathbb Z$ is a fixed parameter circle. We do not require $\gamma$ to be injective: dropping injectivity costs us nothing (gradient flow can pass through self-intersections without incident), while dropping the immersion condition would cost us a lot, since the unit tangent, arclength derivative, and curvature data below all become singular at points where $\gamma' = 0$.

Two parametrized curves $\gamma$ and $\gamma \circ \phi$ related by an orientation-preserving diffeomorphism $\phi$ of $S^1$ trace out the same path at different speeds; an unparametrized curve is the equivalence class $[\gamma]$. Within each class the arclength parametrization with $|\gamma'| \equiv L/2\pi$ is canonical up to the choice of basepoint and orientation.

The first fundamental form

Pulling the Euclidean inner product on $\mathbb R^n$ back along $\gamma$ gives a Riemannian metric on $S^1$,

$$g \;=\; \gamma^* \langle \cdot, \cdot \rangle_{\mathbb R^n} \;=\; v(\theta)^2\, d\theta^2, \qquad v(\theta) := |\gamma'(\theta)|.$$

The function $v$ is the speed, the 1-form $ds = v\,d\theta$ is the arclength element, and

$$L \;=\; \int_{S^1} v(\theta)\, d\theta$$

is the total length. As a metric on the fixed parameter circle, the function $v$ is the geometric data attached to a parametrized curve.
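As a quick numerical sanity check (a sketch of mine, not part of the planned software): sampling a parametrized circle on a uniform $\theta$-grid, estimating $v = |\gamma'|$ by centered finite differences, and summing gives the total length.

```python
import numpy as np

# Sketch: estimate the speed v = |gamma'| and the total length L = \int v dtheta
# for gamma(theta) = (2 cos theta, 2 sin theta), a circle of radius 2,
# using centered finite differences on a uniform periodic theta-grid.
N = 1000
theta = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
gamma = np.stack([2 * np.cos(theta), 2 * np.sin(theta)], axis=1)

dtheta = 2 * np.pi / N
gamma_prime = (np.roll(gamma, -1, axis=0) - np.roll(gamma, 1, axis=0)) / (2 * dtheta)
v = np.linalg.norm(gamma_prime, axis=1)   # speed; constant 2 for this parametrization
L = v.sum() * dtheta                      # Riemann sum for the total length, ~ 4 pi
```

For this parametrization the speed is constant, so it is already the arclength parametrization up to scale.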

For an unparametrized curve nothing is left of $v$ except $L$: any two metrics on $S^1$ with the same total length are isometric, so the only invariant of $[\gamma]$ visible to the first fundamental form is the single real number $L$. The intrinsic data of an unparametrized curve is one number; the intrinsic data of a parametrized curve is a function.

The second fundamental form

The intrinsic data sees $\gamma$ as an abstract Riemannian circle and is blind to how $\gamma$ sits inside $\mathbb R^n$. The extrinsic data captures the bending, and to define it cleanly we need to differentiate vector fields along $\gamma$.

A vector field along $\gamma$ is a smooth section of the pullback bundle $\gamma^* T\mathbb R^n$ — concretely, a smooth $\mathbb R^n$-valued function on $S^1$. The first such field is the velocity $\gamma'(\theta) = \gamma_*\partial_\theta$, and normalizing gives the unit tangent

$$T(\theta) \;=\; \frac{\gamma'(\theta)}{v(\theta)} \;\in\; \mathbb R^n.$$

The pullback of the flat Euclidean connection on $\mathbb R^n$ differentiates such a vector field componentwise; rescaling by $1/v$ converts $\theta$-derivatives into arclength derivatives,

$$\nabla_s f \;=\; \frac{1}{v}\, \frac{df}{d\theta}.$$

The curvature vector is the arclength rate of change of the unit tangent,

$$\vec\kappa \;=\; \nabla_s T \;\in\; \mathbb R^n.$$

Differentiating $\langle T, T\rangle = 1$ forces $\vec\kappa \perp T$, so $\vec\kappa$ takes values in the rank-$(n-1)$ normal bundle of $\gamma$. This is the second fundamental form of $\gamma$ in the standard sense: for an immersion $\gamma \colon M \to \mathbb R^n$ of any source dimension, $\mathrm{II}(X, Y) = (\nabla^{\mathbb R^n}_X Y)^\perp$ is a symmetric normal-bundle-valued tensor; for a curve, $TM$ has rank one and $\mathrm{II}(T, T) = \vec\kappa$ determines the whole tensor.

The vector field $\vec\kappa$ along $\gamma$ contains all the information about how $\gamma$ bends in its ambient space. Together with $v$ it is enough to recover $\gamma$ up to translation: integrating $v\,T$ in $\theta$ recovers $\gamma$ from $T$, and $T$ in turn is determined by $\vec\kappa$ via $T(s) = T(0) + \int_0^s \vec\kappa\,ds'$ once a single value $T(0)$ is fixed.

For an unparametrized curve the natural object is $\vec\kappa$ as a function of arclength on $\mathbb R / L\mathbb Z$, well-defined up to the basepoint and orientation gauge.

Complete invariants: Frenet–Serret

The curvature vector $\vec\kappa$ has its own orientation in $\mathbb R^n$ and is not invariant under rigid motion of the ambient space. For many purposes we want isometry-invariant extrinsic data — scalar functions built from $\gamma$ that are unchanged by translation and rotation of $\mathbb R^n$.

The simplest such scalar is $|\vec\kappa|$. But this alone is not enough to determine $\gamma$ up to rigid motion in $n \geq 3$: a unit circle in a plane and an appropriately scaled circular helix are non-congruent space curves with the same constant $|\vec\kappa|$. The classical resolution is the Frenet–Serret apparatus: take successive arclength derivatives $T, T', T'', \ldots, T^{(n-1)}$ of the unit tangent and Gram–Schmidt them into an orthonormal moving frame $(e_1, \ldots, e_n)$ along $\gamma$, with $e_1 = T$. The structure equations of this frame,

$$\nabla_s e_i \;=\; -\kappa_{i-1}\, e_{i-1} + \kappa_i\, e_{i+1},$$

involve $n - 1$ scalar functions $\kappa_1, \ldots, \kappa_{n-1}$, the generalized curvatures. By construction $\kappa_1, \ldots, \kappa_{n-2} > 0$ (Gram–Schmidt norms), and $\kappa_{n-1}$ is signed.

Fundamental theorem. A curve $\gamma \colon S^1 \to \mathbb R^n$ with $\kappa_1, \ldots, \kappa_{n-2} > 0$ everywhere is determined up to rigid motion of $\mathbb R^n$ by the data $(v, \kappa_1, \ldots, \kappa_{n-1})$ on $S^1$. For an unparametrized curve, the data $(L, \kappa_1(s), \ldots, \kappa_{n-1}(s))$ as functions of arclength determines $[\gamma]$ up to rigid motion.

For $n = 2$ there is just $\kappa_1 = \kappa$, the signed scalar curvature; for $n = 3$, $\kappa_1 = |\vec\kappa|$ is the curvature and $\kappa_2 = \tau$ is the torsion.
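The circle-versus-helix example above is easy to check numerically. The sketch below (mine; the specific curves and grid are arbitrary choices) estimates $|\vec\kappa| = |\nabla_s T|$ by finite differences and finds the same constant value $1/2$ for both curves, even though the helix has nonzero torsion and the circle does not.

```python
import numpy as np

# Sketch: a planar circle of radius 2 and the helix (cos t, sin t, t) both have
# constant |kappa| = 1/2, yet are not congruent (the helix has torsion 1/2).
def kappa_magnitude(curve, t):
    """|kappa| = |T'| / v via second-order finite differences (np.gradient)."""
    gp = np.gradient(curve, t, axis=0)             # gamma'
    v = np.linalg.norm(gp, axis=1, keepdims=True)  # speed
    T = gp / v                                     # unit tangent
    Tp = np.gradient(T, t, axis=0)                 # T'
    return np.linalg.norm(Tp, axis=1) / v[:, 0]    # |nabla_s T|

t = np.linspace(0.0, 4 * np.pi, 4001)
circle = np.stack([2 * np.cos(t), 2 * np.sin(t), np.zeros_like(t)], axis=1)
helix = np.stack([np.cos(t), np.sin(t), t], axis=1)

k_circle = kappa_magnitude(circle, t)[100:-100]   # trim boundary samples
k_helix = kappa_magnitude(helix, t)[100:-100]
```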

So the extrinsic geometry of a curve admits two natural packagings: the curvature vector $\vec\kappa$, which captures the bending up to translation, and the generalized curvatures $(\kappa_1, \ldots, \kappa_{n-1})$, which capture it up to full rigid motion at the cost of a more elaborate construction.

The space of curves

The geometric story of the previous section was about individual curves. To do optimization — to look for a curve with prescribed geometric data, by varying it continuously — we need to step up a level and treat the collection of all curves as a space we can do geometry on.

Two function spaces sit in the picture, both open subsets of $C^\infty(S^1, \mathbb R^n)$:

$$\mathrm{Emb}(S^1, \mathbb R^n) \;\subset\; \mathrm{Imm}(S^1, \mathbb R^n) \;\subset\; C^\infty(S^1, \mathbb R^n),$$

the space of embedded curves (immersions that are also injective) inside the space of immersed curves. Both inclusions are open: injectivity and the immersion condition are both stable under small perturbations. As open subsets of $C^\infty(S^1, \mathbb R^n)$, both inherit the structure of an infinite-dimensional Fréchet manifold, with tangent space at any $\gamma$ canonically identified with $C^\infty(S^1, \mathbb R^n)$ itself — vector fields along $\gamma$. We will work in $\mathrm{Imm}$.

A Fréchet space is a complete topological vector space whose topology is given by a countable family of seminorms; the standard example, and the only one we need, is $C^\infty(S^1, \mathbb R^n)$ with seminorms $\sup_\theta |\partial_\theta^k \gamma|$ for $k = 0, 1, 2, \ldots$. We need the Fréchet rather than the Banach setting because no single norm captures all derivatives of a smooth function — controlling smoothness requires the whole tower of $C^k$ norms at once. A Fréchet manifold is locally modeled on Fréchet spaces in the usual way (charts, smooth transition maps); the one fact we will use is that an open subset of a Fréchet space is itself a Fréchet manifold, with tangent space at every point canonically the ambient Fréchet space.

The orientation-preserving reparametrization group $\mathrm{Diff}^+(S^1)$ acts on both spaces by precomposition, $\gamma \mapsto \gamma \circ \phi$, and the orbits are precisely the unparametrized curves. The action is free on $\mathrm{Emb}$: an embedded curve is injective, so $\gamma \circ \phi = \gamma$ forces $\phi = \mathrm{id}$. On $\mathrm{Imm}$ it is almost free — multiply-covered curves have nontrivial stabilizer. For example, $\gamma(\theta) = (\cos 2\theta, \sin 2\theta)$ wraps the unit circle twice, and the half-rotation $\phi(\theta) = \theta + \pi$ sends $\gamma$ to itself. The stabilizer of a $k$-fold cover is the cyclic group $\mathbb Z/k$ generated by $\theta \mapsto \theta + 2\pi/k$.

Consequently $\mathrm{Emb} \to \mathrm{Emb}/\mathrm{Diff}^+(S^1)$ is a principal $\mathrm{Diff}^+(S^1)$-bundle, with the base a smooth (Fréchet) manifold of unparametrized embedded curves. For immersions, $\mathrm{Imm}/\mathrm{Diff}^+(S^1)$ is an infinite-dimensional orbifold, with orbifold points at the multiply-covered curves.

Upstairs or downstairs? The base is what we geometrically care about — each point an honest unparametrized curve. But the total space is a function space, and that is what gives us Fourier expansion, gradient flow, autodiff, and pointwise comparison of two curves. We work in $\mathrm{Imm}$ and carry the reparametrization gauge along as the price.

Tangent vectors. A tangent vector at $\gamma \in \mathrm{Imm}$ is the derivative of a one-parameter family $\gamma_\epsilon$ of curves passing through $\gamma$ at $\epsilon = 0$,

$$h(\theta) \;=\; \frac{d}{d\epsilon}\bigg|_{\epsilon = 0} \gamma_\epsilon(\theta).$$

This $h$ is a vector field along $\gamma$ — at each parameter value $\theta$, a vector $h(\theta) \in \mathbb R^n$ recording the infinitesimal displacement of the curve there. So the tangent space is $T_\gamma\mathrm{Imm} = C^\infty(S^1, \mathbb R^n)$, the same Fréchet space the manifold is modeled on.

Inside $T_\gamma\mathrm{Imm}$ there is a natural splitting into tangential and normal parts,

$$h \;=\; h^T + h^\perp, \qquad h^T = \langle h, T\rangle T,$$

where $h^T$ is parallel to the curve at each point and $h^\perp$ is perpendicular. Tangential variations are infinitesimal reparametrizations — they slide the parametrization along the existing curve without changing its image — and so generate the gauge action of $\mathrm{Diff}^+(S^1)$ on $\mathrm{Imm}$. Normal variations are honest geometric deformations that move the curve in $\mathbb R^n$.
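The splitting is easy to realize concretely. Here is a small sketch (mine, with an arbitrary variation field as the example) that decomposes a field along a sampled circle and checks the two parts are pointwise orthogonal.

```python
import numpy as np

# Sketch: split a variation field h along a discretized curve into
# h = h_tan + h_perp with h_tan = <h, T> T, and check the two parts
# are pointwise orthogonal and sum back to h.
N = 512
theta = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
dtheta = 2 * np.pi / N
gamma = np.stack([np.cos(theta), np.sin(theta)], axis=1)

gp = (np.roll(gamma, -1, axis=0) - np.roll(gamma, 1, axis=0)) / (2 * dtheta)
T = gp / np.linalg.norm(gp, axis=1, keepdims=True)   # unit tangent

h = np.stack([np.sin(3 * theta), np.cos(theta) ** 2], axis=1)  # arbitrary smooth field

h_tan = np.sum(h * T, axis=1, keepdims=True) * T     # h^T = <h, T> T
h_perp = h - h_tan                                   # h^perp
```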

Riemannian metrics

A Riemannian metric on $\mathrm{Imm}$ is, as on any manifold, a smoothly-varying choice of inner product on each tangent space. Concretely it assigns to each $\gamma$ a positive-definite symmetric bilinear form

$$G_\gamma \colon C^\infty(S^1, \mathbb R^n) \times C^\infty(S^1, \mathbb R^n) \;\to\; \mathbb R.$$

There are several natural choices; we discuss two.

The $d\theta$-metric

The most direct construction is to integrate the Euclidean inner product against the parameter measure,

$$G^\theta_\gamma(h, k) \;=\; \int_{S^1} \langle h(\theta), k(\theta)\rangle_{\mathbb R^n}\, d\theta.$$

The integrand depends on $h$ and $k$ but the integration weight does not depend on $\gamma$ — every tangent space carries the same inner product. So $\mathrm{Imm}$ with this metric is just an open subset of the Hilbert-like space $C^\infty(S^1, \mathbb R^n)$ with its $L^2$ inner product, and is in particular flat: the Levi-Civita connection is the trivial one, geodesics are straight lines in $C^\infty(S^1, \mathbb R^n)$, and tangent vectors at different points can be identified.

The $d\theta$-metric is not reparametrization-invariant. Under $\gamma \mapsto \gamma \circ \phi$, a tangent vector $h$ at $\gamma$ corresponds to $h \circ \phi$ at $\gamma \circ \phi$, and

$$G^\theta_{\gamma \circ \phi}(h \circ \phi,\, k \circ \phi) \;=\; \int \langle h \circ \phi,\, k \circ \phi\rangle\, d\theta \;=\; \int \langle h,\, k\rangle\, \frac{d\theta'}{\phi'} \;\neq\; G^\theta_\gamma(h, k)$$

in general. So this metric is honestly a metric on the space of parametrized curves, but does not descend to the quotient $\mathrm{Imm}/\mathrm{Diff}^+(S^1)$ — it would assign different geometry to two reparametrizations of the same unparametrized curve.

The $L^2$ metric

A reparametrization-invariant alternative integrates against the arclength element,

$$G^{L^2}_\gamma(h, k) \;=\; \int_{S^1} \langle h(\theta), k(\theta)\rangle_{\mathbb R^n}\, ds_\gamma \;=\; \int_{S^1} \langle h, k\rangle\, v_\gamma\, d\theta.$$

The integration weight now depends on $\gamma$ through $v_\gamma$, so $\mathrm{Imm}$ is no longer flat in this metric. Under $\gamma \mapsto \gamma \circ \phi$, the arclength element $ds$ also transforms, and the change-of-variables computation gives $G^{L^2}_{\gamma \circ \phi}(h \circ \phi,\, k \circ \phi) = G^{L^2}_\gamma(h, k)$ — the metric is equivariant, and descends to a Riemannian metric on the quotient (away from orbifold points). This is the right choice if we want to do geometry on unparametrized curves: it sees only what’s intrinsic to the orbit.
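Both behaviors are visible numerically. The sketch below (my discretization; the ellipse, the variation field, and the diffeomorphism $\phi(\theta) = \theta + 0.3\sin\theta$ are arbitrary choices) evaluates both metrics on a field $h$ before and after reparametrizing: $G^\theta$ shifts, while $G^{L^2}$ agrees up to discretization error.

```python
import numpy as np

# Sketch: G^theta changes under reparametrization, G^{L^2} does not.
# Reparametrize by phi(theta) = theta + 0.3 sin(theta), an orientation-
# preserving diffeomorphism of S^1, and compare G(h, h) before and after.
N = 4000
theta = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
dtheta = 2 * np.pi / N

def d(f):  # periodic centered derivative in theta
    return (np.roll(f, -1, axis=0) - np.roll(f, 1, axis=0)) / (2 * dtheta)

def curve(th):  # an ellipse
    return np.stack([2 * np.cos(th), np.sin(th)], axis=1)

def h_field(th):  # a fixed variation field, as a function of the parameter
    return np.stack([np.sin(2 * th), np.cos(th)], axis=1)

def G_theta(h):
    return np.sum(h * h) * dtheta

def G_L2(gamma, h):
    v = np.linalg.norm(d(gamma), axis=1)
    return np.sum(np.sum(h * h, axis=1) * v) * dtheta

phi = theta + 0.3 * np.sin(theta)
g, h = curve(theta), h_field(theta)
g_re, h_re = curve(phi), h_field(phi)   # gamma∘phi and h∘phi
```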

Which one we use

Both metrics are honest Riemannian structures on $\mathrm{Imm}$, but they have different characters. $G^{L^2}$ is the geometric choice — it lifts a metric from the orbit space and treats reparametrizations as the gauge they are. $G^\theta$ is the function-space choice — it treats $\mathrm{Imm}$ as an open subset of a Hilbert-like space and benefits from being flat.

For our problem, the loss $\mathcal L$ we’ll write down in §3 is itself parametrization-dependent: it pins down a parametrization correspondence between $\gamma_0$ and the candidate $\gamma$. So we are not actually solving for an unparametrized curve; we are solving for a parametrized one.

Derivatives

A smooth function $\mathcal F \colon \mathrm{Imm} \to \mathbb R$ has, at each $\gamma$, a differential — a linear functional on the tangent space measuring the rate of change of $\mathcal F$ along an infinitesimal variation. We work this out abstractly in this section, then specialize to gradients (which require a metric) and compute one example all the way through.

The differential

Fix $\gamma \in \mathrm{Imm}$ and $h \in T_\gamma\mathrm{Imm}$. The differential of $\mathcal F$ at $\gamma$ is the linear map

$$d\mathcal F_\gamma \colon T_\gamma \mathrm{Imm} \to \mathbb R, \qquad d\mathcal F_\gamma(h) \;=\; \frac{d}{d\epsilon}\bigg|_{\epsilon = 0} \mathcal F(\gamma + \epsilon h).$$

The right-hand side does not depend on which family realizing $h$ we use; any $\gamma_\epsilon$ with $\gamma_0 = \gamma$ and $\frac{d}{d\epsilon}|_0\gamma_\epsilon = h$ gives the same value. The differential is defined without reference to any inner product — it is the rate of change of $\mathcal F$ along an infinitesimal variation, full stop.

The same construction applied to vector- or tensor-valued functions on $\mathrm{Imm}$ produces vector- or tensor-valued 1-forms, and we use $d$ for all of these. So for example $v \colon \mathrm{Imm} \to C^\infty(S^1, \mathbb R_{>0})$ has a differential $dv_\gamma(h) \in C^\infty(S^1, \mathbb R)$, computed by taking $\frac{d}{d\epsilon}|_0$ of $v_{\gamma_\epsilon}$.

Example: total length. The total length functional $L \colon \mathrm{Imm} \to \mathbb R$, $\gamma \mapsto \int_{S^1} v_\gamma\, d\theta$, is a smooth function on $\mathrm{Imm}$. To compute its differential at $\gamma$, evaluate on the family $\gamma + \epsilon h$:

$$L(\gamma + \epsilon h) \;=\; \int_{S^1} v_{\gamma + \epsilon h}(\theta)\, d\theta.$$

The integration variable $\theta$ and the variation parameter $\epsilon$ are independent, so $\frac{d}{d\epsilon}|_0$ pulls inside the integral as $\frac{\partial}{\partial \epsilon}|_0$ on the integrand:

$$dL_\gamma(h) \;=\; \int_{S^1} dv_\gamma(h)\, d\theta,$$

where $dv_\gamma(h) := \frac{d}{d\epsilon}|_0 v_{\gamma + \epsilon h}$ is the pointwise differential of $v$. To compute it, square both sides of $v = |\gamma'|$ and differentiate $v_\epsilon^2 = |\gamma'|^2 + 2\epsilon \langle \gamma', h'\rangle + \epsilon^2|h'|^2$ at $\epsilon = 0$: the LHS gives $2v\, dv(h)$, the RHS gives $2\langle \gamma', h'\rangle$, and so

$$dv_\gamma(h) \;=\; \frac{\langle \gamma', h'\rangle}{v} \;=\; \langle T, h'\rangle.$$

Substituting back,

$$dL_\gamma(h) \;=\; \int_{S^1} \langle T, h'\rangle\, d\theta.$$

This is the differential of the length functional, evaluated on any tangent vector $h$.

Gradients

A differential $d\mathcal F_\gamma$ is a 1-form on $\mathrm{Imm}$ — a linear functional on the tangent space — and to turn it into a vector field, a tangent vector at each point pointing in the direction of steepest ascent, we need a metric. Given the $d\theta$-metric $G$, the gradient $\mathrm{grad}\,\mathcal F$ is the unique tangent vector at each $\gamma$ satisfying

$$G_\gamma(\mathrm{grad}\,\mathcal F, h) \;=\; d\mathcal F_\gamma(h) \qquad \text{for every } h \in T_\gamma\mathrm{Imm}.$$

The differential is metric-independent; the gradient is not. We work throughout with the $d\theta$-metric, so “the gradient” means the $d\theta$-gradient unambiguously.

Example: the gradient of total length. Continuing the example, under the $d\theta$-metric the gradient is determined by $\int \langle \mathrm{grad}\,L, h\rangle\, d\theta = dL_\gamma(h)$. Integration by parts on the parameter circle (no boundary terms, since $S^1$ is closed) gives

$$dL_\gamma(h) \;=\; \int_{S^1} \langle T, h'\rangle\, d\theta \;=\; -\int_{S^1} \langle T', h\rangle\, d\theta,$$

and reading off,

$$\mathrm{grad}\,L \;=\; -T' \;=\; -v\,\vec\kappa.$$

The gradient is the curvature vector scaled by speed and pointed inward; the gradient flow $\dot\gamma = -\mathrm{grad}\,L = v\,\vec\kappa$ shrinks the curve along its inward normal at a rate proportional to local speed and curvature. This is the parametrization-dependent version of curve-shortening flow: the canonical reparametrization-invariant CSF is $\dot\gamma = \vec\kappa$, arising from the $G^{L^2}$ metric instead. The factor of $v$ here is the price of working in $G^\theta$.
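The formula is easy to test against a direct finite-difference of the length functional. The sketch below (mine; the ellipse and the variation field are arbitrary) compares the directional derivative of the discretized length with the pairing $-\int \langle T', h\rangle\, d\theta$.

```python
import numpy as np

# Sketch: verify dL(h) = -\int <T', h> dtheta numerically. Both sides use the
# same centered difference operator on a periodic grid, so they should agree
# to the precision of the eps finite difference.
N = 2048
theta = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
dtheta = 2 * np.pi / N
d = lambda f: (np.roll(f, -1, axis=0) - np.roll(f, 1, axis=0)) / (2 * dtheta)

gamma = np.stack([2 * np.cos(theta), np.sin(theta)], axis=1)   # an ellipse
h = np.stack([np.cos(2 * theta), np.sin(3 * theta)], axis=1)   # a variation field

def length(g):
    return np.sum(np.linalg.norm(d(g), axis=1)) * dtheta

eps = 1e-6
directional = (length(gamma + eps * h) - length(gamma - eps * h)) / (2 * eps)

T = d(gamma) / np.linalg.norm(d(gamma), axis=1, keepdims=True)
pairing = -np.sum(d(T) * h) * dtheta   # -\int <T', h> dtheta
```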

Optimization

We are given a fixed $\gamma_0 \in \mathrm{Imm}(S^1, \mathbb R^n)$ and want to find a planar curve $\gamma \in \mathrm{Imm}(S^1, \mathbb R^2)$ whose geometric data — first and second fundamental form — matches that of $\gamma_0$ as closely as possible. To formulate this as a variational problem on $\mathrm{Imm}$ we need a loss functional $\mathcal L \colon \mathrm{Imm}(S^1, \mathbb R^2) \to \mathbb R$ that quantifies mismatch, and is small precisely when the data agree.

Geometric data on a curve is data on $S^1$, and most natural pieces of geometric data are themselves functions on $S^1$: the speed $v$ is a real-valued function, the curvature vector $\vec\kappa$ is an $\mathbb R^n$-valued function, and so on. Comparing two curves’ geometric data therefore reduces to measuring distance between functions on $S^1$.

The natural distance on $C^\infty(S^1, \mathbb R)$ is the $L^2$ distance,

$$d_{L^2}(f, g) \;=\; \left(\int_{S^1} (f - g)^2\, d\theta\right)^{1/2},$$

and on $\mathbb R^k$-valued functions, with the $\mathbb R^k$ inner product replacing the absolute value,

$$d_{L^2}(f, g) \;=\; \left(\int_{S^1} |f - g|_{\mathbb R^k}^2\, d\theta\right)^{1/2}.$$

Given any function-valued geometric quantity $\Phi$ on $\mathrm{Imm}$, comparing $\Phi(\gamma)$ to $\Phi(\gamma_0)$ via this distance gives a candidate loss $\mathcal L(\gamma) = d_{L^2}(\Phi(\gamma), \Phi(\gamma_0))$. In practice we work with the squared distance,

$$\mathcal L(\gamma) \;=\; d_{L^2}(\Phi(\gamma), \Phi(\gamma_0))^2 \;=\; \int_{S^1} |\Phi(\gamma) - \Phi(\gamma_0)|^2\, d\theta,$$

because the square root is non-differentiable at zero — exactly the point we want to converge to. Squaring removes that singularity without changing the location of the minimum: $\mathcal L \geq 0$, with equality precisely when $\Phi(\gamma) = \Phi(\gamma_0)$.

Geometric loss functions

The natural pieces of geometric data on a curve are its first and second fundamental forms, and we get a separate loss for each.

The intrinsic data is the metric, given pointwise by the speed $v$, a real-valued function on $S^1$. The corresponding loss is the squared $L^2$ distance between speeds,

$$\mathcal L_v(\gamma) \;=\; \int_{S^1} (v_\gamma - v_0)^2\, d\theta.$$

The extrinsic data is the curvature vector $\vec\kappa$, but here we hit a wrinkle: $\vec\kappa_\gamma$ takes values in $\mathbb R^2$ while $\vec\kappa_0$ takes values in $\mathbb R^n$, and the two cannot be directly compared. The natural scalar to compare instead is the magnitude $|\vec\kappa|$, which lives in $\mathbb R$ regardless of the ambient — and which we recognize from §1 as the first generalized curvature in the Frenet–Serret frame, the total bendiness of the curve at each point. In the planar case alone we’d have access to the signed scalar curvature $\kappa_1$, but since $|\vec\kappa_0|$ is intrinsically unsigned (a magnitude in $\mathbb R^n$), the comparison forces us to magnitudes on both sides; mirror-image solutions are then equivalent under $\mathcal L_\kappa$, which we accept as a feature of working ambient-blind. The corresponding loss is

$$\mathcal L_\kappa(\gamma) \;=\; \int_{S^1} (|\vec\kappa_\gamma| - |\vec\kappa_0|)^2\, d\theta.$$

Each of these is a smooth function $\mathrm{Imm}(S^1, \mathbb R^2) \to \mathbb R$ that we can use as a loss in its own right. To match both kinds of data simultaneously, we take a weighted sum,

$$\mathcal L(\gamma) \;=\; w_v\, \mathcal L_v(\gamma) \;+\; w_\kappa\, \mathcal L_\kappa(\gamma),$$

with $w_v, w_\kappa \geq 0$ trading off how strongly we insist on each kind of match.

Gradient flows

With a loss $\mathcal L$ on $\mathrm{Imm}(S^1, \mathbb R^2)$ in hand, an optimization strategy is a recipe for moving through $\mathrm{Imm}$ in a way that decreases $\mathcal L$. Each such strategy gives an ODE on $\mathrm{Imm}$ (or on its tangent bundle, for second-order schemes). Below we describe the two most natural ones — gradient flow and gradient flow with momentum — and then specialize to the metric-matching case where the gradient is short to compute by hand.

Gradient flow

On any Riemannian manifold $(M, G)$ with smooth $\mathcal F \colon M \to \mathbb R$, the gradient flow of $\mathcal F$ is the first-order ODE

$$\dot \gamma \;=\; -\mathrm{grad}\,\mathcal F.$$

The energy decreases monotonically along solutions:

$$\frac{d}{dt}\mathcal F(\gamma_t) \;=\; d\mathcal F_{\gamma_t}(\dot\gamma_t) \;=\; G_{\gamma_t}\bigl(\mathrm{grad}\,\mathcal F,\, \dot\gamma_t\bigr) \;=\; -G_{\gamma_t}\bigl(\mathrm{grad}\,\mathcal F,\, \mathrm{grad}\,\mathcal F\bigr) \;\leq\; 0,$$

with equality exactly when $\mathrm{grad}\,\mathcal F = 0$, i.e., at critical points of $\mathcal F$. Critical points are the equilibria of the flow.

A common refinement is gradient descent with momentum, the second-order ODE

$$\nabla_t \dot \gamma \;=\; -\mathrm{grad}\,\mathcal F \;-\; \beta\, \dot\gamma,$$

where $\nabla_t$ is the covariant derivative along the time-parametrized curve $\gamma_t$ associated to the Levi-Civita connection of $G$, and $\beta > 0$ is a damping coefficient. Geometrically, covariant acceleration equals force minus damping. The damping term dissipates energy and drives trajectories toward critical points of $\mathcal F$.

In our setting $(M, G) = (\mathrm{Imm}(S^1, \mathbb R^2), G^\theta)$ and $\mathcal F = \mathcal L$. The space $\mathrm{Imm}$ is flat with respect to $G^\theta$ — an open subset of $C^\infty(S^1, \mathbb R^n)$ with its $L^2$ inner product — so the covariant derivative is the ordinary time derivative, and the two flows read

$$\dot\gamma \;=\; -\mathrm{grad}\,\mathcal L, \qquad \ddot\gamma \;=\; -\mathrm{grad}\,\mathcal L \;-\; \beta\,\dot\gamma.$$

This is one practical advantage of working in the $d\theta$-metric — the momentum equation has no covariant-derivative subtleties to compute around.

Specializing to the metric

Let’s work this out concretely for the case that we want to match the intrinsic metric. The loss is then $\mathcal L(\gamma) = w_v\int_{S^1}(v - v_0)^2\, d\theta$, and to run either optimization strategy we need the gradient.

Differentiating $\mathcal L$ at $\gamma + \epsilon h$ — the integration variable $\theta$ and the variation parameter $\epsilon$ are independent, so $\frac{d}{d\epsilon}|_0$ pulls inside the integral —

$$d\mathcal L_\gamma(h) \;=\; 2 w_v\int_{S^1} (v - v_0)\, dv_\gamma(h)\, d\theta.$$

Substituting $dv_\gamma(h) = \langle T, h'\rangle$ from §2 and writing the integrand using the $\mathbb R^n$-valued function $A := (v - v_0)\, T$,

$$d\mathcal L_\gamma(h) \;=\; 2 w_v\int_{S^1} \langle A,\, h'\rangle\, d\theta.$$

Integration by parts on $S^1$ has no boundary terms, so

$$\int_{S^1}\langle A, h'\rangle\, d\theta \;=\; -\int_{S^1}\langle A',\, h\rangle\, d\theta,$$

and reading off the integrand against $h$,

$$\mathrm{grad}\,\mathcal L \;=\; -2 w_v\, A' \;=\; -2 w_v\,\bigl((v - v_0)\, T\bigr)'.$$

The gradient flow is then

$$\dot\gamma \;=\; 2\, w_v\, \bigl((v - v_0)\, T\bigr)',$$

and expanding the derivative using $T' = v\,\vec\kappa$,

$$\dot\gamma \;=\; 2\, w_v\, (v - v_0)'\, T \;+\; 2\, w_v\, v\,(v - v_0)\,\vec\kappa.$$

Two pieces: a tangential flow proportional to the $\theta$-derivative of the speed mismatch, redistributing arclength along the curve; and a normal flow proportional to the speed mismatch itself, in the direction of the curvature vector — the curve bends inward where it is too long and outward where it is too short.
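As a preview of the discretization to come, here is a sketch (mine; the grid size, step size, and ellipse initial condition are arbitrary choices) that runs explicit Euler steps of this flow on a $\theta$-grid, with the constant target speed $v_0 = L/2\pi$ of the initial curve, and watches the loss decrease.

```python
import numpy as np

# Sketch: explicit Euler steps of the metric-matching gradient flow
# gamma_dot = 2 w_v ((v - v0) T)' on a uniform theta-grid. The constant
# target v0 = L / 2pi means the flow redistributes arclength toward the
# constant-speed parametrization, shrinking the loss w_v \int (v - v0)^2.
N = 256
theta = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
dtheta = 2 * np.pi / N
d = lambda f: (np.roll(f, -1, axis=0) - np.roll(f, 1, axis=0)) / (2 * dtheta)

gamma = np.stack([2 * np.cos(theta), np.sin(theta)], axis=1)   # an ellipse
w_v, dt, steps = 1.0, 1e-5, 100

def speed(g):
    return np.linalg.norm(d(g), axis=1)

v0 = speed(gamma).sum() * dtheta / (2 * np.pi)   # constant target speed L / 2pi

def loss(g):
    return w_v * np.sum((speed(g) - v0) ** 2) * dtheta

losses = [loss(gamma)]
for _ in range(steps):
    v = speed(gamma)
    T = d(gamma) / v[:, None]
    gamma = gamma + dt * 2 * w_v * d((v - v0)[:, None] * T)   # step along -grad
    losses.append(loss(gamma))
```

Because the same centered difference operator appears in both the loss and the update, this step is exactly discrete gradient descent on the sampled loss, so a small enough step size guarantees decrease.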

To see the analytic shape of this equation, substitute $T = \gamma'/v$ and $\vec\kappa = \gamma''/v^2 - \gamma'\, v'/v^3$. After collecting terms,

$$\dot\gamma \;=\; 2\, w_v\left[\frac{(v - v_0)'}{v}\, \gamma' \;-\; \frac{(v - v_0)\, v'}{v^2}\, \gamma' \;+\; \frac{(v - v_0)}{v}\, \gamma''\right].$$

Now the gradient flow is exposed for what it is: a PDE in two independent variables, time $t$ and parameter $\theta$. The dot is $\partial_t$, applied to the time-dependent curve $\gamma_t(\theta)$ as a vector-valued function of $\theta$; the primes are $\partial_\theta$, applied to $\gamma$ at fixed time. The momentum flow $\ddot\gamma = -\mathrm{grad}\,\mathcal L - \beta\dot\gamma$ adds a $\partial_t^2$ term to the left-hand side, making it a wave-like equation rather than a heat-like one.

These are infinite-dimensional partial differential equations on $\mathrm{Imm}(S^1, \mathbb R^2)$, and we have no hope of solving them in closed form even in this metric-only special case — and the situation is strictly worse with the curvature term, where the gradient is fourth-order in $\gamma$. To make progress we replace $\mathrm{Imm}$ by a finite-dimensional approximation and solve the corresponding finite-dimensional flow there. Two natural approximations follow: a polygonal mesh in §4, and a Fourier truncation in §5.

Discretization

To compute anything we need to replace the infinite-dimensional configuration space $\mathrm{Imm}(S^1, \mathbb R^n)$ with a finite-dimensional one. The most direct way is to sample the parameter circle: pick $N$ points evenly spaced in $\theta$, and represent a curve by the polygon through their images.

The space of polygons

Fix $N \geq 3$ and the parameter values $\theta_i = 2\pi i / N$ for $i = 0, \ldots, N-1$ (indices mod $N$). The space of $N$-gons in $\mathbb R^n$ is the open subset

$$\mathrm{Poly}_N(\mathbb R^n) \;=\; \bigl\{ P = (p_0, \ldots, p_{N-1}) \in (\mathbb R^n)^N \,:\, p_{i+1} \neq p_i \text{ for all } i \bigr\},$$

the discrete analog of the immersion condition (consecutive vertices distinct, so each edge has nonzero length). $\mathrm{Poly}_N(\mathbb R^n)$ is a finite-dimensional manifold of dimension $nN$, with tangent space at any $P$ equal to $(\mathbb R^n)^N$: a tangent vector is a tuple $h = (h_0, \ldots, h_{N-1})$ of variations of each vertex.

A polygon is not a smooth curve. Viewed as a parametrized map $S^1 \to \mathbb R^n$ — say, the piecewise-linear interpolation of its vertices — its derivative is piecewise constant with jumps at vertices, so the polygon is not in $C^\infty$ and hence not in $\mathrm{Imm}$. The space $\mathrm{Poly}_N(\mathbb R^n)$ is therefore not a submanifold of $\mathrm{Imm}(S^1, \mathbb R^n)$. It is a separate finite-dimensional manifold, related to $\mathrm{Imm}$ via the sampling map below — not contained in it.

The sampling map is the smooth surjection

$$\sigma_N \colon \mathrm{Imm}(S^1, \mathbb R^n) \longrightarrow \mathrm{Poly}_N(\mathbb R^n), \qquad \sigma_N(\gamma) \;=\; \bigl(\gamma(\theta_0),\, \gamma(\theta_1),\, \ldots,\, \gamma(\theta_{N-1})\bigr).$$

Its differential at $\gamma$ sends a tangent vector $h \in T_\gamma\mathrm{Imm}$ to the tuple of evaluations $(h(\theta_0), \ldots, h(\theta_{N-1})) \in T_{\sigma_N(\gamma)}\mathrm{Poly}_N$ — discrete tangent vectors are simply samples of continuous ones.

Discrete first and second fundamental forms

Every geometric quantity we will need on a polygon is a smooth function of the vertex positions p0,,pN1p_0, \ldots, p_{N-1}. The discrete analogs of the geometric data from §1 — edge length, unit tangent, and curvature vector — are

i  =  pi+1pi,\ell_i \;=\; |p_{i+1} - p_i|, Ti  =  pi+1pii    Rn,T_i \;=\; \frac{p_{i+1} - p_i}{\ell_i} \;\in\; \mathbb R^n, κi  =  TiTi1ˉi,ˉi  =  i1+i2.\vec\kappa_i \;=\; \frac{T_i - T_{i-1}}{\bar\ell_i}, \qquad \bar\ell_i \;=\; \frac{\ell_{i-1} + \ell_i}{2}.

i\ell_i is the discrete first fundamental form, encoding the polygon’s intrinsic metric one edge at a time. κi\vec\kappa_i is the discrete curvature vector, the finite-difference analog of sT=T/v\nabla_s T = T'/v centered at vertex ii, and is the discrete extrinsic data; the denominator ˉi\bar\ell_i is the discrete analog of vv at the vertex, appearing here because curvature is rate-of-change-of-tangent per unit arclength. These are the building blocks of the geometric loss functions we discretize next.
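In code, these quantities are a few lines of array arithmetic. A minimal sketch in Python with NumPy (the function name and the (N, n) array layout are our own choices):

```python
import numpy as np

def discrete_geometry(P):
    """Edge lengths, unit tangents, and curvature vectors of a closed
    polygon, given as an (N, n) array of vertices; indices are mod N."""
    edges = np.roll(P, -1, axis=0) - P                      # p_{i+1} - p_i
    ell = np.linalg.norm(edges, axis=1)                     # edge lengths
    T = edges / ell[:, None]                                # unit tangents T_i
    ell_bar = (np.roll(ell, 1) + ell) / 2                   # averaged lengths at vertices
    kappa = (T - np.roll(T, 1, axis=0)) / ell_bar[:, None]  # curvature vectors at vertices
    return ell, T, kappa
```

A useful sanity check: on the regular N-gon inscribed in the unit circle, every edge has length 2 sin(π/N) and every discrete curvature vector has norm exactly 1, since both the tangent difference and the averaged edge length equal 2 sin(π/N).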

Discrete metric, loss, and gradient

With NN sample points evenly spaced in θ\theta, the parameter spacing is Δθ=2π/N\Delta\theta = 2\pi/N, and the discrete analog of S1f(θ)dθ\int_{S^1} f(\theta)\, d\theta is the Riemann sum 2πNifi\frac{2\pi}{N}\sum_i f_i. So the continuous dθd\theta-metric Gγθ(h,k)=h,kdθG^\theta_\gamma(h, k) = \int \langle h, k\rangle\, d\theta becomes, under sampling,

GPθ(h,k)  =  2πNi=0N1hi,ki,G^\theta_P(h, k) \;=\; \frac{2\pi}{N}\sum_{i=0}^{N-1} \langle h_i, k_i\rangle,

the standard Euclidean inner product on (Rn)N(\mathbb R^n)^N up to overall scale. As in the continuous case, this metric is independent of PP, so PolyN\mathrm{Poly}_N is flat in GθG^\theta.

The real degrees of freedom of PolyN\mathrm{Poly}_N are the vertex positions p0,,pN1p_0, \ldots, p_{N-1}. Any function on PolyN\mathrm{Poly}_N — including any energy we want to optimize — is ultimately a function of those positions, and any gradient is a derivative with respect to them. The discrete geometric data introduced in §4.2 (lengths i\ell_i, tangents TiT_i, curvature vectors κi\vec\kappa_i) are intermediate quantities, useful for writing geometric energies in a way that matches the continuous picture, but the differentiation always reduces to the chain rule applied to the vertex coordinates.

This guides us at every step. Rather than discretize the continuous gradient term-by-term — which would force decisions about how to evaluate edge-based quantities like TT at vertices, and would need a separate convergence analysis — we discretize the energy and then differentiate it directly.

For our problem the discrete loss is

L(P)  =  wvi=0N1(ii0)2  +  wκi=0N1(κiκi0)2,\mathcal L(P) \;=\; w_v \sum_{i=0}^{N-1} \bigl(\ell_i - \ell_i^0\bigr)^2 \;+\; w_\kappa \sum_{i=0}^{N-1} \bigl(|\vec\kappa_i| - |\kappa_i^0|\bigr)^2,

where i,κi\ell_i, |\vec\kappa_i| are computed from the candidate PP and i0,κi0\ell_i^0, |\vec\kappa_i^0| from the target P0=σN(γ0)P^0 = \sigma_N(\gamma_0). This matches the continuous loss in §3 up to an overall factor from the Δθ\Delta\theta scaling, which we absorb into the weights wv,wκw_v, w_\kappa. Both i\ell_i and κi|\vec\kappa_i| are smooth functions of vertex positions on the open subset where edges have nonzero length, so L\mathcal L is smooth on PolyN\mathrm{Poly}_N.
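As a sketch (same vertex-array layout as before; the function name is ours), the loss is direct to evaluate:

```python
import numpy as np

def discrete_loss(P, ell0, kappa0, w_v=1.0, w_k=1.0):
    """L(P) = w_v * sum (l_i - l_i^0)^2 + w_k * sum (|k_i| - |k_i^0|)^2
    for an (N, n) vertex array P; ell0, kappa0 hold the target data."""
    edges = np.roll(P, -1, axis=0) - P
    ell = np.linalg.norm(edges, axis=1)
    T = edges / ell[:, None]
    ell_bar = (np.roll(ell, 1) + ell) / 2
    kmag = np.linalg.norm(T - np.roll(T, 1, axis=0), axis=1) / ell_bar
    return w_v * np.sum((ell - ell0) ** 2) + w_k * np.sum((kmag - kappa0) ** 2)
```

When the candidate polygon is exactly the sampled target, both sums vanish and the loss is zero, as it should be.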

The differential dLP ⁣:(Rn)NRd\mathcal L_P \colon (\mathbb R^n)^N \to \mathbb R and the gradient gradL(Rn)N\mathrm{grad}\,\mathcal L \in (\mathbb R^n)^N are now finite-dimensional objects: gradL\mathrm{grad}\,\mathcal L is determined by the standard inner product via GPθ(gradL,h)=dLP(h)G^\theta_P(\mathrm{grad}\,\mathcal L, h) = d\mathcal L_P(h), and computing it amounts to taking partial derivatives with respect to vertex coordinates. The gradient flow ODE

P˙  =  gradL\dot P \;=\; -\mathrm{grad}\,\mathcal L

is an ordinary system of nNnN coupled ODEs in the vertex coordinates.

Specializing to springs

Consider just the loss term for the first fundamental form; that is, we ask only that the intrinsic metric match that of the target curve γ0 ⁣:S1Rn\gamma_0\colon S^1\to\mathbb{R}^n:

L(P)  =  wvi(ii0)2.\mathcal L(P) \;=\; w_v \sum_i (\ell_i - \ell_i^0)^2.

This is quadratic in the differences between the edge lengths of our polygonal approximation and the corresponding “true” edge lengths inherited from γ0\gamma_0. Quadratic energies of exactly this form are the spring potentials of classical physics, so this loss is the potential energy of a cycle of springs with rest lengths i0\ell_i^0 and spring constant 2wv2w_v. Thus the simplest finite-dimensional version of the matching problem is a spring system: build a polygon whose edges are springs with equilibrium lengths inherited from the target curve, and let it relax.

To compute the gradient flow, start with i2=pi+1pi2\ell_i^2 = |p_{i+1} - p_i|^2. Differentiating with respect to a vertex variation hh,

di(h)  =  Ti,hi+1hi,dLP(h)  =  2wvi(ii0)Ti,hi+1hi.d\ell_i(h) \;=\; \langle T_i, h_{i+1} - h_i\rangle, \qquad d\mathcal L_P(h) \;=\; 2 w_v \sum_i (\ell_i - \ell_i^0)\, \langle T_i, h_{i+1} - h_i\rangle.

Reorganize the sum so each hjh_j collects its coefficient,

dLP(h)  =  2wvj(j1j10)Tj1(jj0)Tj,  hj.d\mathcal L_P(h) \;=\; 2 w_v \sum_j \bigl\langle (\ell_{j-1} - \ell_{j-1}^0)\, T_{j-1} - (\ell_j - \ell_j^0)\, T_j,\; h_j \bigr\rangle.

Reading this off against GPθ(gradL,h)=2πNj(gradL)j,hjG^\theta_P(\mathrm{grad}\,\mathcal L, h) = \frac{2\pi}{N}\sum_j \langle (\mathrm{grad}\,\mathcal L)_j, h_j\rangle,

(gradL)j  =  Nwvπ[(j1j10)Tj1    (jj0)Tj].(\mathrm{grad}\,\mathcal L)_j \;=\; \frac{N w_v}{\pi}\Bigl[(\ell_{j-1} - \ell_{j-1}^0)\, T_{j-1} \;-\; (\ell_j - \ell_j^0)\, T_j\Bigr].

The gradient flow is then

p˙j  =  Nwvπ[(j1j10)Tj1    (jj0)Tj],\dot p_j \;=\; -\frac{N w_v}{\pi}\Bigl[(\ell_{j-1} - \ell_{j-1}^0)\, T_{j-1} \;-\; (\ell_j - \ell_j^0)\, T_j\Bigr],

which up to the overall constant is the overdamped dynamics of a chain of Hookean springs (gradient flow is first order in time, so velocity rather than acceleration is proportional to the net force): each vertex pjp_j is pulled by two forces, one from each adjacent edge, with a stretched edge (with >0\ell > \ell^0) pulling its endpoints together along the tangent and a compressed edge pushing them apart.
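This force formula translates directly into a numerical flow. A minimal sketch, using forward Euler for the time stepping (function names and step size are our own choices):

```python
import numpy as np

def spring_grad(P, ell0, w_v=1.0):
    """(grad L)_j = (N w_v / pi) [(l_{j-1} - l_{j-1}^0) T_{j-1} - (l_j - l_j^0) T_j]."""
    N = len(P)
    edges = np.roll(P, -1, axis=0) - P
    ell = np.linalg.norm(edges, axis=1)
    T = edges / ell[:, None]
    F = (ell - ell0)[:, None] * T        # one tension term per edge
    return (N * w_v / np.pi) * (np.roll(F, 1, axis=0) - F)

def relax(P, ell0, dt=0.01, steps=500):
    """Forward-Euler integration of the gradient flow dP/dt = -grad L."""
    for _ in range(steps):
        P = P - dt * spring_grad(P, ell0)
    return P
```

Starting from an ellipse with rest lengths taken from the unit circle, the relaxed polygon's edge lengths move toward the targets and the spring energy drops.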

Specializing to the bending energy

We still need to do the analogous computation for Lκ\mathcal L_\kappa: write the bending loss as a function of vertex positions, compute its gradient, and write down the resulting flow. TODO. Each κi\vec\kappa_i depends on pi1,pi,pi+1p_{i-1}, p_i, p_{i+1}, so the gradient at vertex jj collects contributions from the three terms i=j1,j,j+1i = j-1, j, j+1 and involves the five vertices pj2,,pj+2p_{j-2}, \ldots, p_{j+2} — a wider stencil than in the spring case, where only the two adjacent edges (and hence three vertices) enter.

Finite-dimensionalization by Fourier

Discretization replaced Imm\mathrm{Imm} by a separate finite-dimensional manifold PolyN\mathrm{Poly}_N whose elements were polygons — not in Imm\mathrm{Imm} at all. A different approach is to keep working with smooth curves, but restrict to those of bounded frequency. This gives a finite-dimensional submanifold of Imm\mathrm{Imm}.

The Fourier submanifold

Identify R2C\mathbb R^2 \cong \mathbb C via (x,y)x+iy(x, y) \leftrightarrow x + iy, so a planar curve γ ⁣:S1R2\gamma \colon S^1 \to \mathbb R^2 becomes a C\mathbb C-valued function on S1S^1 with Fourier expansion

γ(θ)  =  kZckeikθ,ckC.\gamma(\theta) \;=\; \sum_{k \in \mathbb Z} c_k\, e^{ik\theta}, \qquad c_k \in \mathbb C.

There is no reality condition on the ckc_k because γ\gamma takes values in C\mathbb C, not R\mathbb R.

Fix K1K \geq 1 and a translation gauge c0=0c_0 = 0 (centering the curve at the origin; the loss is translation-invariant, so c0c_0 is a flat direction we lose nothing by fixing). The Fourier truncation at level KK is

FK  =  {γImm(S1,R2)  :  γ(θ)=0<kKckeikθ}.\mathcal F_K \;=\; \biggl\{\, \gamma \in \mathrm{Imm}(S^1, \mathbb R^2) \;:\; \gamma(\theta) = \sum_{0 < |k| \leq K} c_k\, e^{ik\theta}\,\biggr\}.

The coefficient tuple (ck)0<kKC2K(c_k)_{0 < |k| \leq K} \in \mathbb C^{2K} parametrizes FK\mathcal F_K, so FK\mathcal F_K is a 2K2K-complex-dimensional (4K4K-real-dimensional) manifold. We will use these coefficients as our coordinates throughout: a point of FK\mathcal F_K is the tuple

c  =  (cK,,c1,c1,,cK)    C2K,\mathbf c \;=\; (c_{-K},\, \ldots,\, c_{-1},\, c_{1},\, \ldots,\, c_K) \;\in\; \mathbb C^{2K},

which we abbreviate (ck)(c_k) when the index range is clear, and every quantity we compute — the curve γ\gamma, its derivatives, the loss, its gradient — will be expressed as a function of (ck)(c_k).

Unlike PolyN\mathrm{Poly}_N, every element of FK\mathcal F_K is genuinely a smooth curve, so the inclusion

FK    Imm(S1,R2)\mathcal F_K \;\hookrightarrow\; \mathrm{Imm}(S^1, \mathbb R^2)

is honest. In coefficient space, FK\mathcal F_K corresponds to the open subset of C2K\mathbb C^{2K} on which the immersion condition γ(θ)0\gamma'(\theta) \neq 0 for all θ\theta holds.

The tangent space at any γFK\gamma \in \mathcal F_K consists of variations within FK\mathcal F_K, i.e., Fourier polynomials of the same form,

TγFK    C2K,h(θ)  =  0<kKhkeikθ.T_\gamma \mathcal F_K \;\cong\; \mathbb C^{2K}, \qquad h(\theta) \;=\; \sum_{0 < |k| \leq K} h_k\, e^{ik\theta}.

Geometric data in Fourier coordinates

Differentiation in θ\theta is diagonal in Fourier coordinates: each mode eikθe^{ik\theta} is an eigenfunction of θ\partial_\theta with eigenvalue ikik. So

γ(θ)  =  0<kKikckeikθ,γ(θ)  =  0<kKk2ckeikθ.\gamma'(\theta) \;=\; \sum_{0 < |k| \leq K} ik\, c_k\, e^{ik\theta}, \qquad \gamma''(\theta) \;=\; -\sum_{0 < |k| \leq K} k^2\, c_k\, e^{ik\theta}.

In our coordinates, θ\partial_\theta is therefore the diagonal linear map on C2K\mathbb C^{2K}

θ ⁣:(ck)    (ikck),\partial_\theta \colon (c_k) \;\longmapsto\; (ik\, c_k),

and θ2\partial_\theta^2 is the diagonal map (ck)(k2ck)(c_k) \mapsto (-k^2\, c_k). So γ\gamma, γ\gamma', and γ\gamma'' are all linear functions of the coordinate vector (ck)(c_k).
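A sketch of this evaluation in Python, storing the coefficients as a dict mapping k to c_k (our own convention):

```python
import numpy as np

def eval_curve(c, theta, deriv=0):
    """Evaluate gamma (deriv=0), gamma' (deriv=1), or gamma'' (deriv=2)
    at the angles theta, using the diagonal rule c_k -> (ik)^deriv c_k
    mode by mode.  c maps integer k to the complex coefficient c_k."""
    theta = np.asarray(theta, dtype=float)
    out = np.zeros(theta.shape, dtype=complex)
    for k, ck in c.items():
        out += (1j * k) ** deriv * ck * np.exp(1j * k * theta)
    return out
```

For the unit circle c = {1: 1.0} this returns e^{iθ}, i e^{iθ}, and -e^{iθ} for deriv = 0, 1, 2 respectively.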

The geometric data we ultimately want — the speed v=γv = |\gamma'| and the curvature vector κ\vec\kappa — are not linear or even polynomial in the ckc_k, because they involve square roots and quotients. For instance,

v(θ)2  =  γ(θ),γ(θ)  =  γ(θ)γ(θ)v(\theta)^2 \;=\; \langle \gamma'(\theta), \gamma'(\theta)\rangle \;=\; \gamma'(\theta)\, \overline{\gamma'(\theta)}

is a Fourier polynomial in θ\theta (a finite trigonometric sum) and a polynomial of degree 2 in the ckc_k when expanded — but vv itself, the square root, is neither. Curvature is worse: κ=γ/v2γv/v3\vec\kappa = \gamma''/v^2 - \gamma'\, v'/v^3 has vv-dependent denominators throughout.

Metric, loss, and gradient

Given a curve γFK\gamma \in \mathcal F_K and two tangent vectors h,kTγFKh, k \in T_\gamma \mathcal F_K — themselves Fourier polynomials,

h(θ)=0<mKhmeimθ,k(θ)=0<nKkneinθh(\theta) = \sum_{0 < |m| \leq K} h_m\, e^{im\theta}, \qquad k(\theta) = \sum_{0 < |n| \leq K} k_n\, e^{in\theta}

— we want to evaluate the dθd\theta-metric

Gγθ(h,k)  =  S1h(θ),k(θ)R2dθ.G^\theta_\gamma(h, k) \;=\; \int_{S^1} \langle h(\theta),\, k(\theta)\rangle_{\mathbb R^2}\, d\theta.

Identifying R2C\mathbb R^2 \cong \mathbb C, the R2\mathbb R^2 inner product becomes h,kR2=Re(hk)\langle h, k\rangle_{\mathbb R^2} = \mathrm{Re}(h\,\overline{k}). Substituting the Fourier expansions and expanding,

Re(h(θ)k(θ))  =  Re ⁣(m,nhmknei(mn)θ).\mathrm{Re}\bigl(h(\theta)\,\overline{k(\theta)}\bigr) \;=\; \mathrm{Re}\!\left(\sum_{m, n} h_m\, \overline{k_n}\, e^{i(m-n)\theta}\right).

Integrating over S1S^1 and using the orthogonality relation S1ei(mn)θdθ=2πδmn\int_{S^1} e^{i(m-n)\theta}\, d\theta = 2\pi\, \delta_{mn}, only the diagonal terms m=nm = n survive:

Gγθ(h,k)  =  2π0<mKRe(hmkm).G^\theta_\gamma(h, k) \;=\; 2\pi \sum_{0 < |m| \leq K} \mathrm{Re}(h_m\, \overline{k_m}).

Two things to read off this formula. First, the right-hand side does not depend on γ\gamma at all — the metric is the same on every tangent space, so FK\mathcal F_K is flat in GθG^\theta, just as PolyN\mathrm{Poly}_N was. Second, up to the overall factor 2π2\pi, this is the standard Euclidean inner product on C2K\mathbb C^{2K}: writing ck=ak+ibkc_k = a_k + i b_k in real coordinates, Re(hmkm)=Re(hm)Re(km)+Im(hm)Im(km)\mathrm{Re}(h_m \overline{k_m}) = \mathrm{Re}(h_m)\mathrm{Re}(k_m) + \mathrm{Im}(h_m)\mathrm{Im}(k_m), and the sum becomes the dot product on R4K\mathbb R^{4K}. Fourier coordinates put a curve in FK\mathcal F_K on the same footing as a point in Euclidean space.
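In code the metric is a one-line sum over the coefficient dicts, and it can be checked against direct numerical integration of the inner product. A sketch (conventions as before, names ours):

```python
import numpy as np

def fourier_metric(h, k):
    """G(h, k) = 2 pi * sum_m Re(h_m conj(k_m)) for coefficient dicts
    h, k sharing the same index set."""
    return 2 * np.pi * sum(np.real(hm * np.conj(k[m])) for m, hm in h.items())
```

Because the integrand has only finitely many Fourier modes, the uniform-grid quadrature of the integral reproduces this value exactly (up to roundoff) once the grid is fine enough.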

The loss L=wvLv+wκLκ\mathcal L = w_v \mathcal L_v + w_\kappa \mathcal L_\kappa from §3 restricts to a smooth function on FK\mathcal F_K, expressible as a function of the coordinates (ck)(c_k). Its differential at γ\gamma is the linear map dLγ ⁣:C2KRd\mathcal L_\gamma \colon \mathbb C^{2K} \to \mathbb R,

dLγ(h)  =  0<kK(LakRe(hk)+LbkIm(hk)),d\mathcal L_\gamma(h) \;=\; \sum_{0 < |k| \leq K}\left(\frac{\partial \mathcal L}{\partial a_k}\,\mathrm{Re}(h_k) + \frac{\partial \mathcal L}{\partial b_k}\,\mathrm{Im}(h_k)\right),

and the gradient gradLC2K\mathrm{grad}\,\mathcal L \in \mathbb C^{2K} is determined by Gθ(gradL,h)=dLγ(h)G^\theta(\mathrm{grad}\,\mathcal L, h) = d\mathcal L_\gamma(h). Because the metric is Euclidean up to the factor 2π2\pi, the gradient is the ordinary partial-derivative vector, divided by 2π2\pi:

(gradL)k  =  12π(Lak+iLbk).(\mathrm{grad}\,\mathcal L)_k \;=\; \frac{1}{2\pi}\left(\frac{\partial \mathcal L}{\partial a_k} \,+\, i\,\frac{\partial \mathcal L}{\partial b_k}\right).

The gradient flow is then an ordinary system of 4K4K coupled real ODEs in the coefficients,

c˙k  =  (gradL)k  =  12π(Lak+iLbk),0<kK,\dot c_k \;=\; -(\mathrm{grad}\,\mathcal L)_k \;=\; -\frac{1}{2\pi}\left(\frac{\partial \mathcal L}{\partial a_k} \,+\, i\,\frac{\partial \mathcal L}{\partial b_k}\right), \qquad 0 < |k| \leq K,

which we can run as soon as we know how to compute the partials of L\mathcal L with respect to the real and imaginary parts of ckc_k.

Evaluating the loss numerically

Any loss we work with on FK\mathcal F_K has the form L(γ)=S1F(θ)dθ\mathcal L(\gamma) = \int_{S^1} F(\theta)\, d\theta, where the integrand F(θ)F(\theta) is built pointwise from γ\gamma and its derivatives at θ\theta. For instance, the metric loss has F(θ)=wv(v(θ)v0(θ))2F(\theta) = w_v(v(\theta) - v_0(\theta))^2, and the bending loss has F(θ)=wκ(κ(θ)κ0(θ))2F(\theta) = w_\kappa(|\vec\kappa(\theta)| - |\kappa_0(\theta)|)^2.

We approximate the integral by the trapezoidal rule on MM uniform sample points θn=2πn/M\theta_n = 2\pi n/M:

L(γ)    2πMn=0M1F(θn).\mathcal L(\gamma) \;\approx\; \frac{2\pi}{M}\sum_{n=0}^{M-1} F(\theta_n).

The trapezoidal rule is exceptionally accurate for smooth periodic integrands — its error decays faster than any polynomial in 1/M1/M, so a modest MM gives a very accurate integral. The reason is Fourier-analytic: the trapezoidal rule with MM uniform points integrates each basis function eikθe^{ik\theta} exactly for k<M|k| < M, so the only source of error comes from Fourier modes with kM|k| \geq M, and these decay rapidly for smooth FF.

We choose M2K+1M \geq 2K + 1 so that no aliasing occurs in γ\gamma, γ\gamma', γ\gamma'' — these are Fourier polynomials of degree K\leq K, and 2K+12K+1 uniform samples suffice to determine each of them exactly. In practice we take MM a bit larger (say M=4KM = 4K) to better resolve the nonlinear combinations the integrand FF involves.

The procedure for evaluating L\mathcal L at a given point (ck)C2K(c_k) \in \mathbb C^{2K} is then:

  1. Compute the coefficient vectors (ck)(c_k), (ikck)(ik\, c_k), (k2ck)(-k^2\, c_k) for γ\gamma, γ\gamma', γ\gamma''.
  2. Evaluate γ(θn)=kckeikθn\gamma(\theta_n) = \sum_k c_k\, e^{ik\theta_n}, and similarly for γ\gamma', γ\gamma'', at the MM sample points.
  3. Compute the integrand F(θn)F(\theta_n) from these values — this depends on the specific loss.
  4. Sum: L2πMnF(θn)\mathcal L \approx \frac{2\pi}{M}\sum_n F(\theta_n).
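The four steps above can be sketched as a single pipeline (the coefficient-dict convention and function names are ours; any pointwise integrand F of the curve and its first two derivatives works):

```python
import numpy as np

def eval_loss(c, F, M):
    """L(gamma) ~ (2 pi / M) * sum_n F(gamma(t_n), gamma'(t_n), gamma''(t_n))
    by the trapezoidal rule on M uniform samples.  c maps k to c_k;
    F takes three complex arrays and returns a real array."""
    theta = 2 * np.pi * np.arange(M) / M
    # steps 1-2: coefficients (ik)^d c_k, synthesized at the sample points
    vals = [sum((1j * k) ** d * ck * np.exp(1j * k * theta)
                for k, ck in c.items()) for d in range(3)]
    # steps 3-4: pointwise integrand, then the quadrature sum
    return (2 * np.pi / M) * np.sum(F(*vals))
```

For the unit circle and the integrand F = |γ'|², this returns the exact value of ∫ v² dθ = 2π.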

The gradient L/ak,L/bk\partial \mathcal L/\partial a_k, \partial \mathcal L/\partial b_k is computed by chain rule (or automatic differentiation) through the same pipeline.

Specializing to the metric

For the metric-only case (wκ=0w_\kappa = 0), the loss is

Lv(γ)  =  wvS1(v(θ)v0(θ))2dθ,v(θ)=γ(θ),\mathcal L_v(\gamma) \;=\; w_v \int_{S^1} \bigl(v(\theta) - v_0(\theta)\bigr)^2\, d\theta, \qquad v(\theta) = \bigl|\gamma'(\theta)\bigr|,

with γ(θ)=0<kKikckeikθ\gamma'(\theta) = \sum_{0 < |k| \leq K} ik\, c_k\, e^{ik\theta}.

Following the procedure above, evaluation goes:

  1. Compute γ(θn)=kikckeikθn\gamma'(\theta_n) = \sum_k ik\, c_k\, e^{ik\theta_n} at each sample point.
  2. Compute v(θn)=γ(θn)v(\theta_n) = |\gamma'(\theta_n)|.
  3. Subtract the precomputed target v0(θn)v_0(\theta_n) and square: F(θn)=(v(θn)v0(θn))2F(\theta_n) = (v(\theta_n) - v_0(\theta_n))^2.
  4. Sum: Lv2πwvMnF(θn)\mathcal L_v \approx \frac{2\pi w_v}{M}\sum_n F(\theta_n).

The gradient flow is the ODE in C2K\mathbb C^{2K}

c˙k  =  (gradLv)k,\dot c_k \;=\; -(\mathrm{grad}\,\mathcal L_v)_k,

which we integrate with any standard ODE method (forward Euler, RK4, etc.).
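A minimal sketch of the whole metric-only flow, with the partials computed by central finite differences in the real and imaginary parts of each coefficient (in practice one would use automatic differentiation; all names here are our own):

```python
import numpy as np

def metric_loss(c, v0, w_v=1.0):
    """L_v ~ (2 pi w_v / M) * sum_n (|gamma'(t_n)| - v0(t_n))^2, M = len(v0)."""
    M = len(v0)
    theta = 2 * np.pi * np.arange(M) / M
    dg = sum(1j * k * ck * np.exp(1j * k * theta) for k, ck in c.items())
    return (2 * np.pi * w_v / M) * np.sum((np.abs(dg) - v0) ** 2)

def metric_grad(c, v0, eps=1e-6):
    """(grad L_v)_k = (dL/da_k + i dL/db_k) / (2 pi), by central differences."""
    g = {}
    for k in c:
        dLa = (metric_loss({**c, k: c[k] + eps}, v0)
               - metric_loss({**c, k: c[k] - eps}, v0)) / (2 * eps)
        dLb = (metric_loss({**c, k: c[k] + 1j * eps}, v0)
               - metric_loss({**c, k: c[k] - 1j * eps}, v0)) / (2 * eps)
        g[k] = (dLa + 1j * dLb) / (2 * np.pi)
    return g

def flow(c, v0, dt=0.05, steps=200):
    """Forward-Euler integration of dc_k/dt = -(grad L_v)_k."""
    for _ in range(steps):
        g = metric_grad(c, v0)
        c = {k: c[k] - dt * g[k] for k in c}
    return c
```

Starting from a curve whose speed is off (say a circle of the wrong radius plus a small higher mode) and flowing toward a unit-speed target, the loss decreases along the flow.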

A polynomial alternative

There is one reformulation of the metric loss worth flagging, because it eliminates the quadrature error entirely. The square root v=γv = |\gamma'| is the only thing keeping Lv\mathcal L_v from being a polynomial in (ck)(c_k). If we replace Lv\mathcal L_v with

L~v(γ)  =  w~vS1(v(θ)2v0(θ)2)2dθ,\widetilde{\mathcal L}_v(\gamma) \;=\; \tilde w_v \int_{S^1}\bigl(v(\theta)^2 - v_0(\theta)^2\bigr)^2\, d\theta,

then since vv and v0v_0 are nonnegative, v2=v02v^2 = v_0^2 if and only if v=v0v = v_0, so the two losses have the same zero set: the curves matching the target metric exactly are the same for both. And v2=γ,γv^2 = \langle \gamma', \gamma'\rangle is a polynomial of degree 2 in (ck)(c_k), so L~v\widetilde{\mathcal L}_v is polynomial of degree 4 in the coefficients, and the integral can be computed in closed form by Parseval — no quadrature, no chain rule through pointwise nonlinearities.

This is a different loss from Lv\mathcal L_v, with different gradient flow dynamics, but the same zero-loss minimizers. Whether to prefer it over Lv\mathcal L_v depends on the application.
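A sketch of the closed-form evaluation (coefficient-dict convention and names as in our other sketches): the double loop computes the Fourier coefficients of v² by convolution, and Parseval turns the integral into a finite sum of squared moduli.

```python
import numpy as np

def vsq_coeffs(c, K):
    """Fourier coefficients of v^2 = gamma' * conj(gamma'):
    (v^2)_r = sum_{m-n=r} m n c_m conj(c_n), returned as an array
    indexed by r + 2K for r in [-2K, 2K]."""
    q = np.zeros(4 * K + 1, dtype=complex)
    for m, cm in c.items():
        for n, cn in c.items():
            q[(m - n) + 2 * K] += m * n * cm * np.conj(cn)
    return q

def poly_metric_loss(c, c0, K, w=1.0):
    """Closed-form loss: integral of (v^2 - v0^2)^2 dtheta
    = 2 pi * sum_r |q_r|^2 with q = coeffs(v^2) - coeffs(v0^2), by Parseval."""
    q = vsq_coeffs(c, K) - vsq_coeffs(c0, K)
    return 2 * np.pi * w * np.sum(np.abs(q) ** 2)
```

Since the integrand is a trigonometric polynomial of degree at most 4K, a uniform-grid quadrature with more than 4K points reproduces the Parseval value exactly (up to roundoff), which gives a direct cross-check.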

Specializing to the bending energy

We still need to do the analogous computation for the bending loss Lκ\mathcal L_\kappa in Fourier coordinates: write the integrand F(θ)=wκ(κ(θ)κ0(θ))2F(\theta) = w_\kappa(|\vec\kappa(\theta)| - |\kappa_0(\theta)|)^2 in terms of (ck)(c_k), evaluate it at the sample points via the procedure above, and compute the gradient through the chain rule. TODO. Unlike the metric case, there is no clean polynomial alternative — κ\vec\kappa has vv-dependent denominators that no analog of vv2v \mapsto v^2 removes — so quadrature is the only path here.
