Chemistry & Vibration Modes

Working out why chemists like character tables, from first principles

When I first met finite groups and their representations in an undergraduate algebra class, they felt magical but esoteric. Differential equations felt connected to the real world; character tables did not. So it came as a genuine surprise when my chemist friends started casually using them — not as curiosities, but as working tools, on the same page as obviously practical concerns, like bond angles and vibrational spectra.

I never sat down to figure out what they were actually doing. I knew, vaguely, that the symmetry of a molecule controlled its behavior, and then that the character table was how this control was encoded. But how did symmetry affect behavior? This post is my attempt to work it out from first principles.

The goal is concrete: starting from Lagrangian mechanics of a molecule near an equilibrium, we will derive the procedure chemists use to read vibrational mode structure off a character table. Along the way the various ingredients — configuration space, the kinetic metric, the molecular point group, the isotypic decomposition — will show up in the order the physics needs them.

Because I’m a mathematician, I’m going to keep things needlessly general. Instead of narrowing our focus to the real world, we work on a general Riemannian manifold (X,g)(X,g) with isometry group G=Isom(X,g)G = \operatorname{Isom}(X,g). This way we can clearly see where the geometry, the analysis, and the group theory interact.

The setup

We will work with the simplest classical toy model of a molecule: atoms are point masses in our ambient space (X,g)(X, g), attracted to and repelled by one another through forces derivable from a potential energy. There is no electronic structure, no quantum mechanics — just classical particle dynamics on a Riemannian manifold. This is a less-than-honest model of real molecules, but it is enough to recover the chemists’ character-table procedure, which is what I am trying to understand.

A molecule, then, is a finite collection of point masses in (X,g)(X,g), each atom labeled by a mass mi>0m_i > 0, with no two atoms ever occupying the same point. Ordered atom positions are tracked as a point in XnX^n, but atoms of the same mass are physically indistinguishable, so swapping two of them gives the same molecule. The honest configuration space is the quotient

Q^  :=  (XnΔ)/Σ,\widehat Q \;:=\; \bigl(X^n \setminus \Delta\bigr) \big/ \Sigma,

where Δ\Delta is the diagonal (configurations with two atoms coincident) and ΣSn\Sigma \subset S_n is the group permuting atoms within each mass class. Motions of the molecule in space are curves in Q^\widehat Q.

Two structures on Q^\widehat Q drive everything that follows.

First, the kinetic metric. A moving molecule has kinetic energy 12imig(x˙i,x˙i)\tfrac12 \sum_i m_i\, g(\dot x_i, \dot x_i), which is a quadratic form on TQ^T\widehat Q. To unpack: a tangent vector at the configuration (x1,,xn)(x_1, \ldots, x_n) is a tuple (v1,,vn)(v_1, \ldots, v_n) with viTxiXv_i \in T_{x_i} X — one velocity vector per atom — and

g^((v1,,vn),(w1,,wn))  =  i=1nmig(vi,wi),\widehat{\mathbf g}\bigl((v_1, \ldots, v_n),\, (w_1, \ldots, w_n)\bigr) \;=\; \sum_{i=1}^n m_i\, g(v_i, w_i),

the mass-weighted sum of the per-atom inner products. Written as a symmetric 2-tensor on Q^\widehat Q, we’d say

g^=imig\widehat{\mathbf g}=\bigoplus_i m_i g

Its Σ\Sigma- and GG-invariance are immediate: permuting equal-mass atoms leaves it alone, and moving the whole molecule by an isometry of XX leaves it alone.

Second, the potential V ⁣:Q^RV \colon \widehat Q \to \mathbb R. We require VV to be GG-invariant, where GG acts on Q^\widehat Q by simultaneously moving every atom by the same isometry of XX (this is the diagonal action of GG on XnX^n, which descends to Q^\widehat Q since it commutes with Σ\Sigma). Geometrically, the energy of a molecule does not depend on where in (X,g)(X,g) it sits or how it is oriented. In practice VV depends only on pairwise geodesic distances and bond angles, which are GG-invariant by construction.
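In the flat case X=R3X = \mathbb R^3 this invariance is easy to see numerically. Here is a minimal sketch: a potential built only from pairwise distances (the pair function below is made up for the example) is unchanged when every atom is moved by a single ambient isometry.

```python
import numpy as np

# Illustrating G-invariance in the flat case X = R^3: a potential built only
# from pairwise distances is unchanged when every atom is moved by one
# isometry. The pair term (distance - 1)^2 is invented for this example.
def V(x):                             # x: (n, 3) array of atom positions
    n = len(x)
    return sum((np.linalg.norm(x[i] - x[j]) - 1.0) ** 2
               for i in range(n) for j in range(i + 1, n))

rng = np.random.default_rng(1)
x = rng.standard_normal((4, 3))       # a random 4-atom configuration

# A random isometry of R^3: an orthogonal matrix (via QR) plus a translation.
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
b = rng.standard_normal(3)
x_moved = x @ Q.T + b                 # apply the same isometry to every atom

print(np.isclose(V(x), V(x_moved)))   # True
```

The same check works for any distance-and-angle-based potential, since Qxi+b(Qxj+b)=xixj\lVert Qx_i + b - (Qx_j + b)\rVert = \lVert x_i - x_j\rVert.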

The Lagrangian is L=12g^(q˙,q˙)V(q)\mathcal{L} = \tfrac12\, \widehat{\mathbf g}(\dot q, \dot q) - V(q), and the Euler–Lagrange equations take the coordinate-free form

q˙q˙  =  gradg^V,\nabla_{\dot q} \dot q \;=\; -\operatorname{grad}_{\widehat{\mathbf g}} V,

where \nabla is the Levi–Civita connection of g^\widehat{\mathbf g}.

Everything in the rest of this post is a consequence of these few ingredients: the manifold Q^\widehat Q, the kinetic metric g^\widehat{\mathbf g}, the GG-invariant potential VV, and the induced GG-action on Q^\widehat Q.

Linearizing at an equilibrium

An equilibrium is a point e^Q^\widehat e \in \widehat Q with dVe^=0dV_{\widehat e} = 0 — a configuration at which no net force acts on any atom. We want to study small motions near such an equilibrium, and the natural first step is to linearize the Euler–Lagrange equation at e^\widehat e.

Consider a one-parameter family of trajectories near the equilibrium e^\widehat e. We might parameterize such a family in terms of a variable ss, say q(s,t)q(s,t) with q(0,t)e^q(0,t) \equiv \widehat e the equilibrium itself. Let ξ(t)=sqs=0\xi(t) = \partial_s q|_{s=0} be the small displacement at first order in ss. Since the s=0s=0 curve is the constant equilibrium, for every tt this vector is based at e^\widehat e, so ξ(t)\xi(t) is a curve in Te^Q^T_{\widehat e}\widehat Q. This curve captures the infinitesimal behavior of our family of solutions near e^\widehat e: since a vector at e^\widehat e is really a collection of velocity vectors, one on each atom of our molecule, we can think of the path ξ(t)\xi(t) as an animation of changing velocity vectors on our equilibrium configuration — an infinitesimal motion.

Our goal is to find the linearized equation that ξ(t)\xi(t) obeys. We get it by differentiating the Euler–Lagrange equation q˙q˙  =  gradg^V\nabla_{\dot q}\dot q \;=\; -\operatorname{grad}_{\widehat{\mathbf g}}V. Two vector fields on the image of q(s,t)q(s, t) organize the calculation. The first is T  =  tq,T \;=\; \partial_t q, which points along the solutions in our family — at each point it is the velocity of the trajectory passing through, obtained by moving in tt at fixed ss. The second is S  =  sqS \;=\; \partial_s q, which points across the family, from one solution to its neighbor — it is the displacement you see by moving in ss at fixed tt.

Thus the fact that our family q(s,t)q(s,t) is a family of solutions means that, for each ss, the Euler–Lagrange equation is satisfied:

TT=gradg^V\nabla_T T = -\operatorname{grad}_{\widehat {\mathbf g}} V

From this we want to extract an equation for ξ=Ss=0\xi = S|_{s=0}. The idea is to covariantly differentiate both sides of the Euler–Lagrange equation along the variation direction SS, then evaluate at s=0s = 0. Written out, we want to compute

S(TT)s=0left side  =  S(gradV)s=0right side\underbrace{\nabla_S\, (\nabla_T T)\big|_{s=0}}_{\text{left side}} \;=\; \underbrace{-\,\nabla_S\, (\operatorname{grad} V)\big|_{s=0}}_{\text{right side}}

and see what it says about ξ\xi. Each side turns out to be something we already care about. Two facts about this setup will do all of the work.

Left side

Rearranging the definition of the Riemann curvature tensor of g^\widehat{\mathbf g},

STT  =  TST+[S,T]T+R(S,T)T.\nabla_S \nabla_T T \;=\; \nabla_T \nabla_S T \,+\, \nabla_{[S, T]}\, T \,+\, R(S, T)\, T.

The bracket term [S,T]T\nabla_{[S, T]}\, T vanishes, since [S,T]=0[S, T] = 0. So

STT  =  TST+R(S,T)T.\nabla_S \nabla_T T \;=\; \nabla_T \nabla_S T \,+\, R(S, T)\, T.

Next, we can swap ST\nabla_S T for TS\nabla_T S. Precisely, the torsion-free identity for the Levi–Civita connection reads

STTS[S,T]  =  0.\nabla_S T \,-\, \nabla_T S \,-\, [S, T] \;=\; 0.

The bracket term again vanishes, leaving ST=TS\nabla_S T = \nabla_T S. Substituting,

STT  =  TTS+R(S,T)T.\nabla_S \nabla_T T \;=\; \nabla_T \nabla_T S \,+\, R(S, T)\, T.

Now evaluate at s=0s = 0. The curvature term vanishes because two of its arguments are T(0,t)=0T(0, t) = 0 (recall q(0,t)e^q(0,t) \equiv \widehat e is constant in tt, so TT vanishes along s=0s = 0). What remains is TTSs=0\nabla_T \nabla_T S|_{s=0}. Along the constant curve te^t \mapsto \widehat e, covariant differentiation reduces to ordinary differentiation in the vector space Te^Q^T_{\widehat e}\widehat Q, and S(0,t)=ξ(t)S(0, t) = \xi(t). So, putting it all together,

S(TT)s=0left side  =  TTSs=0  =  ξ¨(t)    Te^Q^.\underbrace{\nabla_S\, (\nabla_T T)\big|_{s=0}}_{\text{left side}} \;=\;\nabla_T \nabla_T S \big|_{s=0} \;=\; \ddot \xi(t) \;\in\; T_{\widehat e}\widehat Q.

Right side

At s=0s = 0, S(0,t)=ξ(t)S(0, t) = \xi(t) and q(0,t)=e^q(0, t) = \widehat e, so S\nabla_S applied to any vector field along the family reduces at s=0s = 0 to ξ\nabla_\xi at e^\widehat e. In particular,

S(gradV)s=0  =  ξgradVe^.\nabla_S\, (\operatorname{grad} V)\big|_{s=0} \;=\; \nabla_\xi\, \operatorname{grad} V\,\big|_{\widehat e}.

The map ξξgradVe^\xi \mapsto \nabla_\xi \operatorname{grad} V|_{\widehat e} is linear in ξ\xi (because the covariant derivative is linear in its lower index), so it is a linear operator on Te^Q^T_{\widehat e}\widehat Q. We give it a name:

H ⁣:Te^Q^Te^Q^,Hξ  :=  ξgradVe^.\mathcal H \colon T_{\widehat e}\widehat Q \to T_{\widehat e}\widehat Q, \qquad \mathcal H\, \xi \;:=\; \nabla_\xi \operatorname{grad} V\,\big|_{\widehat e}.

The right side of the Euler–Lagrange equation, linearized at the equilibrium, is Hξ-\mathcal H\, \xi.

Computing H\mathcal H

We have named the operator, but at this point all we know about it is that it is linear. To get a formula — and to uncover whatever further structure H\mathcal H has — we probe it by pairing Hξ\mathcal H \xi against an arbitrary tangent vector η\eta via the kinetic metric. From the metric compatibility of the Levi–Civita connection,

g^(Hξ,η)  =  g^(ξgradV,η)  =  ξ(g^(gradV,η))g^(gradV,ξη).\begin{aligned} \widehat{\mathbf g}(\mathcal H\, \xi, \eta) &\;=\; \widehat{\mathbf g}(\nabla_\xi \operatorname{grad} V, \eta) \\ &\;=\; \xi\bigl(\widehat{\mathbf g}(\operatorname{grad} V, \eta)\bigr) \,-\, \widehat{\mathbf g}(\operatorname{grad} V,\, \nabla_\xi \eta). \end{aligned}

Evaluating at e^\widehat e does two things to this expression. The second term vanishes because gradVe^=0\operatorname{grad} V|_{\widehat e} = 0. The first term simplifies via the defining property of the gradient, g^(gradV,η)=dV(η)=η(V)\widehat{\mathbf g}(\operatorname{grad} V, \eta) = dV(\eta) = \eta(V). Together,

g^e^(Hξ,η)  =  ξ(η(V))e^.\widehat{\mathbf g}_{\widehat e}(\mathcal H\, \xi, \eta) \;=\; \xi\bigl(\eta(V)\bigr)\,\big|_{\widehat e}.

The right-hand side ξ(η(V))e^\xi(\eta(V))|_{\widehat e} has a hidden symmetry that we can extract by playing ξ\xi and η\eta off against each other. Using the bracket identity for vector fields acting on functions,

ξ(η(V))η(ξ(V))  =  [ξ,η](V)  =  dV([ξ,η]),\xi(\eta(V)) \,-\, \eta(\xi(V)) \;=\; [\xi, \eta](V) \;=\; dV([\xi, \eta]),

and the fact that dVe^=0dV_{\widehat e} = 0, we conclude ξ(η(V))e^=η(ξ(V))e^\xi(\eta(V))|_{\widehat e} = \eta(\xi(V))|_{\widehat e}. But η(ξ(V))e^\eta(\xi(V))|_{\widehat e} is exactly what our formula returns with the roles of ξ\xi and η\eta reversed: g^e^(Hη,ξ)  =  η(ξ(V))e^\widehat{\mathbf g}_{\widehat e}(\mathcal H\, \eta,\, \xi) \;=\; \eta(\xi(V))\,\big|_{\widehat e}. Chaining these together,

g^e^(Hξ,η)  =  ξ(η(V))e^  =  η(ξ(V))e^  =  g^e^(Hη,ξ)  =  g^e^(ξ,Hη),\begin{aligned} \widehat{\mathbf g}_{\widehat e}(\mathcal H\, \xi,\, \eta) &\;=\; \xi(\eta(V))\,\big|_{\widehat e} \\ &\;=\; \eta(\xi(V))\,\big|_{\widehat e} \\ &\;=\; \widehat{\mathbf g}_{\widehat e}(\mathcal H\, \eta,\, \xi) \\ &\;=\; \widehat{\mathbf g}_{\widehat e}(\xi,\, \mathcal H\, \eta), \end{aligned}

where the last step uses symmetry of the metric. Comparing the first and last entries is exactly the statement that H\mathcal H is self-adjoint with respect to the kinetic metric, a crucial property of H\mathcal H for our future computational work.

To turn the implicit equation g^e^(Hξ,η)=ξ(η(V))e^\widehat{\mathbf g}_{\widehat e}(\mathcal H\, \xi,\, \eta) = \xi(\eta(V))|_{\widehat e} into an explicit formula for Hξ\mathcal H \xi as a vector, we apply the musical isomorphism  ⁣:TQ^TQ^\sharp\colon T^*\widehat Q \to T\widehat Q — the map that takes any 1-form ω\omega to the unique vector ω\omega^\sharp with g^(ω,Y)=ω(Y)\widehat{\mathbf g}(\omega^\sharp, Y) = \omega(Y). The 1-form here is ηξ(η(V))e^\eta \mapsto \xi(\eta(V))|_{\widehat e}, and

Hξ  =  (ηξ(η(V))e^).\mathcal H\, \xi \;=\; \bigl(\eta \mapsto \xi(\eta(V))\big|_{\widehat e}\bigr)^{\sharp}.

This coordinate-free expression is somewhat abstract; the same defining relation also lets us compute H\mathcal H in any basis we like. Pick a basis {ea}\{e_a\} of Te^Q^T_{\widehat e}\widehat Q and plug ξ=ea\xi = e_a, η=eb\eta = e_b into the formula. We get a number — call it HabH_{ab} — given by

Hab  :=  g^e^(Hea,eb)  =  ea(eb(V))e^.H_{ab} \;:=\; \widehat{\mathbf g}_{\widehat e}\bigl(\mathcal H\, e_a,\, e_b\bigr) \;=\; e_a\bigl(e_b(V)\bigr)\,\big|_{\widehat e}.

If we use a coordinate basis, ea=/xae_a = \partial/\partial x^a and eb=/xbe_b = \partial/\partial x^b, the right-hand side is a familiar second partial derivative

Hab  =  2Vxaxbe^.H_{ab} \;=\; \frac{\partial^2 V}{\partial x^a\, \partial x^b}\bigg|_{\widehat e}.

So HH is just the matrix of second partial derivatives of VV at the equilibrium.
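This is easy to see in action. The sketch below (a toy one-dimensional spring-chain potential, invented for the example) recovers HabH_{ab} by central finite differences and matches the analytic Hessian.

```python
import numpy as np

# Numerically realize H_ab = d^2 V / dx^a dx^b at an equilibrium, for a toy
# chain potential V = (k/2)[(x2 - x1 - L)^2 + (x3 - x2 - L)^2].
k, L = 1.0, 1.0
def V(x):
    return 0.5 * k * ((x[1] - x[0] - L) ** 2 + (x[2] - x[1] - L) ** 2)

e = np.array([0.0, 1.0, 2.0])   # an equilibrium: atoms evenly spaced at L
h = 1e-4                        # finite-difference step
n = len(e)
Hess = np.zeros((n, n))
for a in range(n):
    for b in range(n):
        pp = e.copy(); pp[a] += h; pp[b] += h
        pm = e.copy(); pm[a] += h; pm[b] -= h
        mp = e.copy(); mp[a] -= h; mp[b] += h
        mm = e.copy(); mm[a] -= h; mm[b] -= h
        Hess[a, b] = (V(pp) - V(pm) - V(mp) + V(mm)) / (4 * h ** 2)

print(Hess)   # k * [[1, -1, 0], [-1, 2, -1], [0, -1, 1]]
```

Since VV is quadratic here, the central difference is exact up to rounding.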

But HH is not the matrix of our operator H\mathcal H itself: Hea\mathcal H e_a sits inside a metric pairing on the left of the defining formula, so the kinetic metric is tangled with our operator. To peel off the metric and extract the matrix of H\mathcal H alone, we use self-adjointness.

Treat g^\widehat g, HH, and H\mathcal H as ordinary matrices in our basis. For column vectors ξ\xi and η\eta,

g^e^(Hξ,η)  =  (Hξ)Tg^η  =  ξTHTg^η,\widehat{\mathbf g}_{\widehat e}(\mathcal H\, \xi,\, \eta) \;=\; (\mathcal H \xi)^T\, \widehat g\, \eta \;=\; \xi^T\, \mathcal H^T \widehat g\, \eta,

and self-adjointness of H\mathcal H tells us HTg^=g^H\mathcal H^T \widehat g = \widehat g\, \mathcal H. Setting the result equal to H(ξ,η)=ξTHηH(\xi, \eta) = \xi^T H \eta gives us the matrix identity

ξTHη=ξTg^Hη,\xi^T H \eta = \xi^T \widehat g \mathcal H \eta,

and demanding it hold for all ξ\xi and η\eta implies

H  =  g^H,equivalentlyH  =  g^1H.H \;=\; \widehat g\, \mathcal H, \qquad \text{equivalently} \qquad \mathcal H \;=\; \widehat g^{\,-1}\, H.

The matrix of the operator H\mathcal H in any basis is the inverse kinetic-metric matrix times the matrix of second partial derivatives of VV.
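A quick numeric sanity check of this bookkeeping (random matrices, standing in for an arbitrary basis): for any symmetric positive-definite g^\widehat g and symmetric HH, the matrix g^1H\widehat g^{-1} H really is self-adjoint with respect to g^\widehat g.

```python
import numpy as np

# For SPD kinetic metric g and symmetric Hessian H, the operator
# calH = g^{-1} H satisfies calH.T @ g == g @ calH, i.e. it is
# self-adjoint with respect to the inner product defined by g.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
g = A @ A.T + 5 * np.eye(5)          # symmetric positive definite
B = rng.standard_normal((5, 5))
H = (B + B.T) / 2                     # symmetric second-derivative matrix
calH = np.linalg.solve(g, H)          # g^{-1} H, without forming the inverse

print(np.allclose(calH.T @ g, g @ calH))   # True
```

Both sides equal HH itself: (g1H)Tg=Hg1g=H(g^{-1}H)^T g = H g^{-1} g = H and g(g1H)=Hg\,(g^{-1}H) = H.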

Putting it all together

We set out to study an infinitesimal deformation of the equilibrium solution q(t)e^q(t) \equiv \widehat e — a curve ξ(t)Te^Q^\xi(t) \in T_{\widehat e}\widehat Q describing the small motion away from rest. By covariantly differentiating both sides of the Euler–Lagrange equation along the variation direction SS and evaluating at s=0s = 0, the relation

STT  =  SgradV\nabla_S \nabla_T T \;=\; -\,\nabla_S \operatorname{grad} V

became a linear second-order ODE on the tangent space at the equilibrium,

ξ¨  +  Hξ  =  0,\ddot \xi \;+\; \mathcal H\, \xi \;=\; 0,

where H\mathcal H is the self-adjoint operator built directly from the two structures we started with — the (inverse of the) kinetic metric and the (second derivatives of the) potential. Thus, the entire linear theory of molecular vibration is captured by the qualitative behavior of ODEs of this type.

A simplified derivation in Rn\mathbb R^n

For comparison, here is the linearization in flat space, where the calculation is much shorter. Take X=RdX = \mathbb R^d, identify Q^\widehat Q with an open subset of RN\mathbb R^N (where N=ndN = n d) once we pick a basis, treat the kinetic metric as a constant symmetric positive-definite matrix MM, and let V ⁣:RNRV \colon \mathbb R^N \to \mathbb R be the potential. The Lagrangian is

L  =  12q˙TMq˙    V(q),\mathcal L \;=\; \tfrac{1}{2}\, \dot q^{\,T} M\, \dot q \;-\; V(q),

and the Euler–Lagrange equation reads

Mq¨  =  V(q).M\, \ddot q \;=\; -\nabla V(q).

Linearize around an equilibrium e^\widehat e (so V(e^)=0\nabla V(\widehat e) = 0). Write q(t)=e^+ξ(t)q(t) = \widehat e + \xi(t) and Taylor-expand the gradient,

V(e^+ξ)  =  Hξ  +  O(ξ2),\nabla V(\widehat e + \xi) \;=\; H\, \xi \;+\; O(\xi^2),

where H=2V(e^)H = \nabla^2 V(\widehat e) is the matrix of second partial derivatives at the equilibrium. Plug in, drop the O(ξ2)O(\xi^2) terms, and multiply by M1M^{-1}:

ξ¨  +  Hξ  =  0,H  =  M1H.\ddot \xi \;+\; \mathcal H\, \xi \;=\; 0, \qquad \mathcal H \;=\; M^{-1}\, H.
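The flat-space recipe is immediately computable. Here is a minimal sketch for a CO2\mathrm{CO}_2-like chain of three point masses on a line joined by identical springs (the masses and stiffness are toy values, not physical constants): solving the generalized eigenproblem Hv=ω2MvH v = \omega^2 M v is the same as diagonalizing H=M1H\mathcal H = M^{-1} H.

```python
import numpy as np
from scipy.linalg import eigh

# Toy CO2-like chain: three masses on a line, nearest-neighbor springs.
m, M, k = 16.0, 12.0, 1.0            # outer masses, central mass, stiffness
Mass = np.diag([m, M, m])            # kinetic-metric matrix
H = k * np.array([[ 1, -1,  0],      # Hessian of V = (k/2) * sum of stretches^2
                  [-1,  2, -1],
                  [ 0, -1,  1]], dtype=float)

# Generalized eigenproblem H v = w2 * Mass v: eigenvalues of M^{-1} H.
w2, modes = eigh(H, Mass)
print(w2)   # ascending: 0 (rigid translation), k/m, k/m + 2k/M
```

The zero eigenvalue (eigenvector (1,1,1)(1,1,1), the whole molecule translating) foreshadows the zero modes discussed below; the other two are the symmetric- and antisymmetric-stretch frequencies squared.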

Solving the linear theory

We are now ready to use the structure of H\mathcal H to solve ξ¨+Hξ=0\ddot \xi + \mathcal H\, \xi = 0.

Modes by sign of eigenvalue

Because H\mathcal H is self-adjoint with respect to the kinetic metric, the spectral theorem gives us a g^e^\widehat{\mathbf g}_{\widehat e}-orthonormal basis of Te^Q^T_{\widehat e}\widehat Q consisting of eigenvectors of H\mathcal H, with real eigenvalues.

Pick any eigenvector vv with eigenvalue ω2\omega^2, and look for solutions of the form ξ(t)=c(t)v\xi(t) = c(t)\, v where cc is a real-valued function of time. Plugging in,

c¨v+H(cv)  =  (c¨+ω2c)v,\ddot c\, v \,+\, \mathcal H(c\, v) \;=\; (\ddot c \,+\, \omega^2 c)\, v,

so ξ(t)=c(t)v\xi(t) = c(t)\, v is a solution exactly when c(t)c(t) satisfies the scalar ODE

c¨+ω2c  =  0.\ddot c \,+\, \omega^2 c \;=\; 0.

The behavior depends entirely on the sign of ω2\omega^2.

Positive eigenvalue (ω2>0\omega^2 > 0). The equation is a harmonic oscillator,

c(t)  =  Acos(ωt)+Bsin(ωt).c(t) \;=\; A \cos(\omega t) \,+\, B \sin(\omega t).

The mode wobbles back and forth in the vv direction with frequency ω\omega — bounded oscillation. These are the normal modes of the molecule, and the corresponding ω\omega are the normal frequencies.

Negative eigenvalue (ω2<0\omega^2 < 0). Writing ω2=λ2\omega^2 = -\lambda^2 with λ>0\lambda > 0, the equation becomes c¨=λ2c\ddot c = \lambda^2 c, with general solution

c(t)  =  Aeλt+Beλt.c(t) \;=\; A\, e^{\lambda t} \,+\, B\, e^{-\lambda t}.

For generic initial conditions the eλte^{\lambda t} piece dominates: an arbitrarily small displacement in the vv direction grows exponentially in time. The equilibrium is unstable along vv — geometrically, e^\widehat e is a saddle of VV in this direction, the potential going down rather than up.

Zero eigenvalue (ω2=0\omega^2 = 0). The equation reduces to c¨=0\ddot c = 0, with general solution

c(t)  =  A+Bt.c(t) \;=\; A \,+\, B\, t.

The molecule drifts at constant velocity in the vv direction. There is no restoring force, no oscillation, no exponential growth — just uniform motion. These zero modes are neither oscillations nor instabilities; they are something else, and a structural part of the story.

Putting the modes together. Decomposing an arbitrary initial condition into the eigenbasis, the general solution to ξ¨+Hξ=0\ddot \xi + \mathcal H \xi = 0 is

ξ(t)  =  αcα(t)vα,\xi(t) \;=\; \sum_\alpha c_\alpha(t)\, v_\alpha,

where each cαc_\alpha evolves independently according to its eigenvalue ωα2\omega_\alpha^2. The equilibrium e^\widehat e is stable — every small perturbation stays small — exactly when every ωα2>0\omega_\alpha^2 > 0, i.e., when H\mathcal H is positive definite. Equivalently, the matrix of second partials of VV at e^\widehat e is positive definite, and e^\widehat e is a local minimum of VV rather than a saddle or a maximum.
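To see one mode concretely, here is a sketch using a toy chain of three masses on a line (hypothetical values): the antisymmetric-stretch eigenpair of H=M1H\mathcal H = M^{-1} H yields an oscillating solution ξ(t)=cos(ωt)v\xi(t) = \cos(\omega t)\, v of ξ¨+Hξ=0\ddot\xi + \mathcal H \xi = 0.

```python
import numpy as np

# Antisymmetric-stretch mode of a toy three-mass chain.
m, M, k = 16.0, 12.0, 1.0
Minv = np.diag([1 / m, 1 / M, 1 / m])
H = k * np.array([[1, -1, 0], [-1, 2, -1], [0, -1, 1]], dtype=float)
calH = Minv @ H                        # the operator \mathcal H = M^{-1} H

w2 = k / m + 2 * k / M                 # antisymmetric-stretch eigenvalue
v = np.array([1.0, -2 * m / M, 1.0])   # outer atoms in, central atom out
assert np.allclose(calH @ v, w2 * v)   # (w2, v) is an eigenpair of calH

w, t = np.sqrt(w2), 0.7
xi = np.cos(w * t) * v                 # the mode ansatz xi(t) = cos(w t) v
xi_ddot = -w2 * np.cos(w * t) * v      # its second time derivative
print(np.allclose(xi_ddot + calH @ xi, 0.0))   # True
```

The check works for any tt, since c(t)=cos(ωt)c(t) = \cos(\omega t) satisfies the scalar ODE c¨+ω2c=0\ddot c + \omega^2 c = 0.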

Real molecules sit at local minima of their potential, so we expect every nonzero ωα2\omega_\alpha^2 to be positive — every nonzero mode an oscillation. But zero eigenvalues are not avoidable: there are always zero modes, forced by a structural feature of Q^\widehat Q that we have so far ignored.

Where the zero modes come from

The structural feature we have so far ignored is that the equilibrium e^\widehat e is never isolated. Equilibria of VV on Q^\widehat Q always come in families, because VV is GG-invariant.

If V(gq)=V(q)V(g \cdot q) = V(q) for every gGg \in G, then carrying a critical point along the GG-action gives another critical point, and the full orbit

O^  :=  Ge^    Q^\widehat{\mathcal O} \;:=\; G \cdot \widehat e \;\subset\; \widehat Q

consists entirely of equilibria. This is not a pathology; it is the mathematical shadow of a fact we already accept — rigidly translating or rotating an equilibrium molecule gives another equilibrium molecule. The ambient isometry group acts trivially on the energy, so it acts non-trivially on the space of equilibria.

The tangent directions to this orbit at e^\widehat e form a linear subspace

N  :=  Te^O^    Te^Q^.N \;:=\; T_{\widehat e} \widehat{\mathcal O} \;\subset\; T_{\widehat e}\widehat Q.

These are the infinitesimal rigid motions: velocities generated by a one-parameter subgroup of GG acting on the whole molecule at once.

Along any such direction the potential is constant — it has to be, because the orbit consists entirely of equilibria and they all share the same energy. So H\mathcal H kills NN:

Hξ  =  0for all ξN.\mathcal H\, \xi \;=\; 0 \qquad \text{for all } \xi \in N.

These are exactly the zero-eigenvalue modes from the previous section. We saw what they do under the linearized dynamics: they don’t oscillate, they drift. The molecule “oscillates” along these directions at zero frequency — which is to say, it just coasts off, rigidly translating or rotating through space.

These zero modes are not interesting as vibrations. They are the imprint on Te^Q^T_{\widehat e}\widehat Q of the ambient isometry group, pure and simple. The genuine vibrational content of the molecule — the oscillatory modes we set out to compute — must live in the complementary directions.

The vibrational subspace

We need to separate the rigid-motion directions from the honest vibrations. The natural way is to take the orthogonal complement of NN in Te^Q^T_{\widehat e}\widehat Q with respect to our kinetic metric g^e^\widehat{\mathbf g}_{\widehat e} — the only inner product on Te^Q^T_{\widehat e}\widehat Q available to us. Define

V  :=  Ng^    Te^Q^,\mathcal V \;:=\; N^{\perp_{\widehat{\mathbf g}}} \;\subset\; T_{\widehat e}\widehat Q,

so that the tangent space splits orthogonally as

Te^Q^  =  NV.T_{\widehat e}\widehat Q \;=\; N \,\oplus\, \mathcal V.

The space V\mathcal V is where the real vibrational dynamics lives. Before going further it is worth checking that the operator H\mathcal H behaves nicely on the splitting Te^Q^=NVT_{\widehat e}\widehat Q = N \oplus \mathcal V — that it sends each summand into itself, and that the restriction is still self-adjoint.

That H\mathcal H preserves V\mathcal V is a consequence of self-adjointness. For any ξV\xi \in \mathcal V and any ηN\eta \in N,

g^e^(Hξ,η)  =  g^e^(ξ,Hη)  =  0,\widehat{\mathbf g}_{\widehat e}(\mathcal H\, \xi,\, \eta) \;=\; \widehat{\mathbf g}_{\widehat e}(\xi,\, \mathcal H\, \eta) \;=\; 0,

since H\mathcal H kills NN. So Hξ\mathcal H \xi pairs trivially with every element of NN, which is to say HξN=V\mathcal H \xi \in N^\perp = \mathcal V. Self-adjointness of the restriction HV\mathcal H|_{\mathcal V} then comes for free: the restriction of a self-adjoint operator to an invariant subspace, together with the restricted inner product, is again self-adjoint. So HV\mathcal H|_{\mathcal V} is a self-adjoint operator on the inner product space (V,g^e^V)(\mathcal V, \widehat{\mathbf g}_{\widehat e}|_{\mathcal V}).

Restricted to V\mathcal V the operator is also non-degenerate (assuming e^\widehat e is sufficiently generic — that we haven’t accidentally produced extra zero directions transverse to the orbit). The linearized Euler–Lagrange equation then decouples cleanly:

ξ¨0  =  0,ξ¨  +  HVξ  =  0,\ddot \xi_0 \;=\; 0, \qquad \ddot \xi_\perp \;+\; \mathcal H|_{\mathcal V}\, \xi_\perp \;=\; 0,

where ξ=ξ0+ξ\xi = \xi_0 + \xi_\perp with ξ0N\xi_0 \in N and ξV\xi_\perp \in \mathcal V. Zero modes drift; vibrational modes oscillate; the two never talk to each other.

So the problem reduces to a concrete finite-dimensional eigenvalue problem: diagonalize the self-adjoint operator HV\mathcal H|_{\mathcal V} on a vector space of dimension ndimXdimNn\dim X - \dim N. For a molecule of nn atoms in three-space, V\mathcal V is generically of dimension 3n63n - 6 — which for even modestly sized molecules is already a substantial matrix.
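The reduction is mechanical in coordinates. A sketch for the toy three-mass chain (hypothetical values): here NN is spanned by the rigid translation (1,1,1)(1,1,1), and we restrict the eigenvalue problem to its kinetic-metric orthogonal complement.

```python
import numpy as np
from scipy.linalg import eigh, null_space

# Restrict the eigenvalue problem to the vibrational subspace V = N^perp.
m, M, k = 16.0, 12.0, 1.0
Mass = np.diag([m, M, m])
H = k * np.array([[1, -1, 0], [-1, 2, -1], [0, -1, 1]], dtype=float)

t = np.ones((1, 3))                  # rigid translation (1, 1, 1), as a row
# V is the Mass-orthogonal complement: vectors v with t @ Mass @ v == 0.
B = null_space(t @ Mass)             # columns form a basis of V

# Eigenvalue problem for H restricted to V, in the basis B.
w2, _ = eigh(B.T @ H @ B, B.T @ Mass @ B)
print(w2)   # k/m and k/m + 2k/M — the zero (translation) mode is gone
```

Note the zero eigenvalue has disappeared: on V\mathcal V the restricted operator is positive definite, exactly as the general theory predicts for an equilibrium that is a minimum transverse to the orbit.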

Self-adjointness gives us a lot for free, even before we touch the matrix entries. The spectral theorem on (V,g^e^V)(\mathcal V, \widehat{\mathbf g}_{\widehat e}|_{\mathcal V}) yields an orthogonal direct sum

V  =  λEλ\mathcal V \;=\; \bigoplus_\lambda E_\lambda

over the (real) eigenvalues λ\lambda of HV\mathcal H|_{\mathcal V}, where EλE_\lambda is the corresponding eigenspace. Each λ\lambda is a squared vibrational frequency ω2\omega^2, and the modes oscillating at frequency ω\omega span EλE_\lambda. This is the most physical decomposition of V\mathcal V available: split the modes by frequency.

Computing the eigenvalues and eigenvectors of HV\mathcal H|_{\mathcal V} explicitly requires the specific potential VV and a real diagonalization. But there are coarser questions about the spectrum that we might hope to answer without solving the eigenvalue problem at all: what is the multiplicity structure of the vibrational frequencies?

Equivalently: how many distinct eigenvalues λ\lambda are there, and what are the dimensions dimEλ\dim E_\lambda? This is coarse data about the eigendecomposition — the shape of the orthogonal direct sum, not the actual eigenvalues. And it is exactly the kind of information that symmetry alone can pin down, without reference to the specific VV.

This is where the group theory comes to save the day.

Symmetry and representation theory

Why symmetry?

We already know quite a bit about HV\mathcal H|_{\mathcal V}. It is a self-adjoint operator on a finite-dimensional inner product space, so the spectral theorem hands us an orthonormal basis of eigenvectors with real eigenvalues, and the small motions of the molecule decompose into independently oscillating modes accordingly. But this is a generic structure, available for any self-adjoint operator on any finite-dimensional inner product space. It tells us the modes exist; it tells us nothing about the organization of the spectrum — how many distinct frequencies, with what multiplicities, are forced.

If we want to say more about the spectrum without knowing the specifics of the potential VV, we need a structural input beyond ”H\mathcal H is self-adjoint.” Symmetry is a natural candidate: a group acting on V\mathcal V that commutes with HV\mathcal H|_{\mathcal V} would constrain its spectrum, in cases where the symmetry is rich enough. So our plan in this section is to identify a relevant group of symmetries of our linearized system, prove that it commutes with HV\mathcal H|_{\mathcal V}, and read off the consequences.

The symmetries of an equilibrium

The natural source of symmetries is the ambient isometry group G=Isom(X,g)G = \mathrm{Isom}(X, g), which we have already met as the group preserving everything we have built. We are not interested in all of GG, though, only the part that fixes our particular equilibrium e^\widehat e. Set

P  :=  Ge^  =  {hG  :  he^=e^}.P \;:=\; G_{\widehat e} \;=\; \bigl\{\, h \in G \;:\; h \cdot \widehat e = \widehat e \,\bigr\}.

What does an element of PP look like concretely? Pick a representative (x1,,xn)Xn(x_1, \ldots, x_n) \in X^n for the equivalence class e^Q^\widehat e \in \widehat Q. Then he^=e^h \cdot \widehat e = \widehat e in Q^\widehat Q exactly when the diagonally-moved tuple (hx1,,hxn)(h \cdot x_1, \ldots, h \cdot x_n) is Σ\Sigma-equivalent to (x1,,xn)(x_1, \ldots, x_n). In other words: hh is a rigid motion of XX whose effect on the molecule’s atoms is the same as a permutation of equal-mass atom labels — there exists σhΣ\sigma_h \in \Sigma with hxi=xσh(i)h \cdot x_i = x_{\sigma_h(i)} for every ii.

So PP is the group of rigid motions of XX that send the molecular shape to itself. Each hPh \in P comes with a permutation σhΣ\sigma_h \in \Sigma, and hσhh \mapsto \sigma_h is a group homomorphism PΣP \to \Sigma.

In examples in X=R3X = \mathbb{R}^3: for water, PP is generated by a twofold rotation and two mirror planes; for ammonia, a threefold rotation and three mirror planes; for methane, the full symmetry group of a regular tetrahedron.
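For instance, here is the water case in miniature (hypothetical planar coordinates, chosen only for the symmetry): a reflection of the ambient space acts on the atoms as the permutation σh\sigma_h swapping the two hydrogens.

```python
import numpy as np

# A concrete element of P for a water-like shape in the plane:
# O at the origin, two equal-mass H atoms placed mirror-symmetrically.
pos = {"O":  np.array([0.0, 0.0]),
       "H1": np.array([ 0.76, -0.59]),
       "H2": np.array([-0.76, -0.59])}

R = np.array([[-1.0, 0.0],            # h = reflection across the y-axis,
              [ 0.0, 1.0]])           # an isometry fixing the molecular shape

# Its effect on atoms is the permutation sigma_h = (H1 H2), fixing O:
print(np.allclose(R @ pos["H1"], pos["H2"]),
      np.allclose(R @ pos["H2"], pos["H1"]),
      np.allclose(R @ pos["O"],  pos["O"]))   # True True True
```

This is exactly the homomorphism hσhh \mapsto \sigma_h from the previous paragraph, made concrete.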

Action on the tangent space, and a classification

Each hGh \in G acts on the configuration space Q^\widehat Q as a diffeomorphism

Φh ⁣:Q^Q^,q    hq,\Phi_h \colon \widehat Q \to \widehat Q, \qquad q \;\mapsto\; h \cdot q,

obtained by applying the isometry hh to every atom of the configuration. For hPh \in P, Φh\Phi_h fixes the basepoint e^\widehat e, so its differential at e^\widehat e is a well-defined linear map of the tangent space. Collecting these differentials into a single homomorphism gives a representation of PP:

ρ ⁣:P    GL(Te^Q^),h    d(Φh)e^.\rho \colon P \;\to\; \mathrm{GL}\bigl(T_{\widehat e}\widehat Q\bigr), \qquad h \;\mapsto\; d(\Phi_h)_{\widehat e}.

The representation has special structure built into how it was constructed: PP acts on Q^\widehat Q by isometries of g^\widehat{\mathbf g} (since PG=Isom(X,g)P \subset G = \mathrm{Isom}(X,g), and the descended action is isometric), and the differential of an isometry at a fixed point is itself a linear isometry of the tangent inner product space. So ρ\rho takes values in the orthogonal group of the kinetic-metric inner product:

ρ ⁣:P    O(Te^Q^,g^e^).\rho \colon P \;\to\; \mathrm{O}\bigl(T_{\widehat e}\widehat Q,\, \widehat{\mathbf g}_{\widehat e}\bigr).

We also note that ρ\rho preserves the splitting Te^Q^=NVT_{\widehat e}\widehat Q = N \oplus \mathcal V. The orbit O^=Ge^\widehat{\mathcal O} = G \cdot \widehat e is GG-invariant as a set, hence PP-invariant, so ρ(h)\rho(h) preserves N=Te^O^N = T_{\widehat e}\widehat{\mathcal O}; preserving the metric, it then preserves V=N\mathcal V = N^\perp as well. So ρ\rho restricts to an isometric representation ρV ⁣:PO(V,g^e^V)\rho|_{\mathcal V} \colon P \to \mathrm{O}(\mathcal V, \widehat{\mathbf g}_{\widehat e}|_{\mathcal V}) on the vibrational subspace.

What kind of group is PP? A priori ρ(P)\rho(P) sits inside the (large) orthogonal group O(Te^Q^)O(ndimX)\mathrm O(T_{\widehat e}\widehat Q) \cong \mathrm O(n \dim X). But the abstract group PP — sitting inside the much smaller Lie group G=Isom(X,g)G = \mathrm{Isom}(X, g) — has a more constrained shape than that.

Warmup in Rd\mathbb R^d. When X=RdX = \mathbb R^d this is easy to see directly. Place the origin at the (mass-weighted) center of mass of the molecule. Every hPh \in P permutes equal-mass atoms among themselves, so it preserves the center of mass and fixes the origin. The isometries of Rd\mathbb R^d fixing the origin are exactly O(d)\mathrm{O}(d), so

P    O(d).P \;\hookrightarrow\; \mathrm{O}(d).

For X=R3X = \mathbb R^3 this is O(3)\mathrm{O}(3). The symmetry group of any molecular equilibrium in R3\mathbb R^3 is a subgroup of O(3)\mathrm{O}(3).

This is a sharp constraint. Sometimes PP is infinite — a linear molecule like CO2\mathrm{CO}_2 or HCN\mathrm{HCN} has PP containing a continuous SO(2)\mathrm{SO}(2) of rotations about the molecular axis. But if the molecule is not collinear, PP is finite. The finite subgroups of O(3)\mathrm{O}(3) are completely classified. The finite subgroups of SO(3)\mathrm{SO}(3) are the cyclic groups Z/n\mathbb{Z}/n, the dihedral groups DnD_n (of order 2n2n), and the three Platonic rotation groups A4A_4 (tetrahedral), S4S_4 (octahedral, equivalently the rotations of a cube), and A5A_5 (icosahedral, equivalently the rotations of a dodecahedron); the finite subgroups of O(3)\mathrm{O}(3) are obtained from these by adjoining orientation-reversing elements like reflections.

So in R3\mathbb R^3 we recover exactly the symmetry classification chemists already use: the molecular point groups. The same list in chemistry notation reads CnC_n, DnD_n, TT, OO, II for the rotation-only groups, with their reflection-extended cousins CnvC_{nv}, CnhC_{nh}, SnS_n, DnhD_{nh}, DndD_{nd}, TdT_d, ThT_h, OhO_h, IhI_h.

The general case. The center-of-mass argument is genuinely Euclidean: in a curved Riemannian manifold the global Riemannian center of mass need not exist, and there is no canonical “origin” of XX to place. So we need a different route in the general setting — and fortunately the same kind of conclusion can be reached more abstractly, by identifying PP as a compact subgroup of GG and then applying a structural theorem about Lie groups.

That PP is compact follows from a standard fact in Riemannian geometry: the action of G=Isom(X,g)G = \mathrm{Isom}(X, g) on XX is proper, meaning that the (setwise) stabilizer of any compact subset of XX is itself compact in GG. The atom positions {x1,,xn}\{x_1, \ldots, x_n\} form a compact (finite) subset of XX, and any hPh \in P permutes these atoms among themselves — so PP is contained in the setwise stabilizer of {x1,,xn}\{x_1, \ldots, x_n\}, which is compact by properness. Hence PP is compact.

(In typical cases PP is in fact finite. The precise condition is that the atoms are not all contained in a codimension-2 totally geodesic submanifold of XX — informally, that the molecule is “spread out enough” in XX that no continuous family of rotations preserves it. In R3\mathbb R^3 this is the non-collinear case, since codimension 2 means a single geodesic line; a collinear molecule like CO2\mathrm{CO}_2 has all its atoms on one line, and PP then contains a continuous SO(2)\mathrm{SO}(2) of rotations about that line, giving O(2)=Cv\mathrm{O}(2) = C_{\infty v} or DhD_{\infty h} — still compact, just no longer finite. In higher-dimensional or differently shaped ambient spaces the condition adapts: in R4\mathbb R^4, a coplanar configuration similarly admits a continuous rotational stabilizer, and so on.)

Now apply a theorem of E. Cartan (the Cartan–Iwasawa–Mal’cev theorem): every compact subgroup of a Lie group GG is contained in some maximal compact subgroup of GG, and any two maximal compacts are conjugate. So PP embeds in a maximal compact of GG.

For Riemannian symmetric spaces this maximal compact has a clean identification, depending on type. For non-compact-type spaces (like Hd\mathbb H^d) and Euclidean-type spaces (Rd\mathbb R^d itself), the maximal compact of GG is exactly the point stabilizer at any chosen basepoint — so PP ends up inside that point stabilizer, which is O(d)\mathrm{O}(d) in either case. For compact-type spaces (like SdS^d), GG is itself compact and serves as its own maximal compact; PP sits inside all of GG, which is a larger rotation group (O(d+1)\mathrm O(d+1) for SdS^d). Either way PP is a compact subgroup of a finite-dimensional rotation group.

In Thurston-geometry terms: R3\mathbb R^3 and H3\mathbb H^3 give O(3)\mathrm O(3); S3S^3 gives O(4)\mathrm O(4); the smaller geometries (Nil\mathrm{Nil}, Sol\mathrm{Sol}, H2×R\mathbb H^2 \times \mathbb R, SL2~\widetilde{\mathrm{SL}_2}) give correspondingly smaller maximal compacts inside O(3)\mathrm O(3). In every case PP is a compact subgroup of a familiar rotation group, and for sufficiently spread-out molecules — those whose atoms are not all on a codimension-2 totally geodesic submanifold of XX — it is in fact finite, falling into the classification we already wrote down.

For the rest of the post we assume our equilibrium satisfies this generic condition (in R3\mathbb R^3, that the molecule is non-collinear), so PP is a finite subgroup of O(d)\mathrm{O}(d) — and the representation theory of PP is the classical theory of finite groups.

H\mathcal H is PP-equivariant

Intuition. Think of H\mathcal H as encoding the molecule’s linear restoring force: a small displacement ξ\xi from equilibrium produces a restoring force proportional to Hξ-\mathcal H \xi. Now suppose we displace not by ξ\xi but by its symmetry-image ρ(h)ξ\rho(h)\xi, for some hPh \in P. By symmetry of the molecule, the restoring force on this rotated/reflected displacement should be the same rotation/reflection of the original restoring force — that is, ρ(h)\rho(h) applied to Hξ-\mathcal H \xi. So

H(ρ(h)ξ)  =?  ρ(h)(Hξ)\mathcal H \bigl(\rho(h)\xi\bigr) \;\stackrel{?}{=}\; \rho(h)\bigl(\mathcal H \xi\bigr)

— the operator H\mathcal H commutes with ρ(h)\rho(h). Behind the scenes, this works because every ingredient of H\mathcal H — the potential VV, the kinetic metric g^\widehat{\mathbf g}, and the equilibrium e^\widehat e — is PP-invariant.

Proof. Recall how H\mathcal H was defined: it is the unique linear operator on Te^Q^T_{\widehat e}\widehat Q satisfying

g^e^(Hξ,η)  =  ξ(η(V))e^for all ξ,ηTe^Q^\widehat{\mathbf g}_{\widehat e}(\mathcal H\xi,\, \eta) \;=\; \xi\bigl(\eta(V)\bigr)\big|_{\widehat e} \qquad \text{for all } \xi, \eta \in T_{\widehat e}\widehat Q

— in other words, H\mathcal H is built from the bilinear form ξ(η(V))e^\xi(\eta(V))|_{\widehat e} via the kinetic metric. To show H\mathcal H commutes with ρ(h)\rho(h), it suffices to show that both sides of this defining relation transform compatibly under ρ\rho. The metric is ρ\rho-invariant by construction — ρ(h)\rho(h) is a linear isometry. The new fact we need is that the bilinear form ξ(η(V))e^\xi(\eta(V))|_{\widehat e} is ρ\rho-invariant too. This follows from a single chain-rule fact:

Chain rule at a critical point. Let f ⁣:MRf \colon M \to \mathbb R be smooth and Φ ⁣:MM\Phi \colon M \to M a diffeomorphism with Φ(p)=p\Phi(p) = p. If dfp=0df_p = 0, then for any tangent vectors ξ,η\xi, \eta at pp,

ξ(η(fΦ))p  =  (dΦpξ)((dΦpη)(f))p.\xi\bigl(\eta(f \circ \Phi)\bigr)\big|_p \;=\; \bigl(d\Phi_p\, \xi\bigr)\Bigl(\bigl(d\Phi_p\, \eta\bigr)(f)\Bigr)\Big|_p.

The slogan: at a critical point of ff, the second-derivative bilinear form of fΦf \circ \Phi is ff’s second-derivative form with both inputs replaced by their pushforwards through dΦpd\Phi_p.

We will apply the chain rule with f=Vf = V, Φ=Φh\Phi = \Phi_h, p=e^p = \widehat e, and dΦp=ρ(h)d\Phi_p = \rho(h). Combined with the metric invariance and the defining relation for H\mathcal H, this gives the equivariance directly. Apply each fact in turn to the expression g^e^(Hρ(h)ξ,ρ(h)η)\widehat{\mathbf g}_{\widehat e}(\mathcal H \rho(h)\xi,\, \rho(h)\eta):

g^e^(Hρ(h)ξ,ρ(h)η)  =  (ρ(h)ξ)((ρ(h)η)(V))e^(defining relation, with ρ(h)ξ,ρ(h)η)  =  ξ(η(VΦh))e^(chain rule, applied to VΦh)  =  ξ(η(V))e^(VΦh=V)  =  g^e^(Hξ,η)(defining relation, with ξ,η)  =  g^e^(ρ(h)Hξ,ρ(h)η)(metric invariance).\begin{aligned} \widehat{\mathbf g}_{\widehat e}\bigl(\mathcal H \rho(h)\xi,\, \rho(h)\eta\bigr) &\;=\; \bigl(\rho(h)\xi\bigr)\Bigl(\bigl(\rho(h)\eta\bigr)(V)\Bigr)\Big|_{\widehat e} && \text{(defining relation, with } \rho(h)\xi, \rho(h)\eta) \\ &\;=\; \xi\bigl(\eta(V \circ \Phi_h)\bigr)\big|_{\widehat e} && \text{(chain rule, applied to } V \circ \Phi_h\text{)} \\ &\;=\; \xi\bigl(\eta(V)\bigr)\big|_{\widehat e} && \text{(}V \circ \Phi_h = V\text{)} \\ &\;=\; \widehat{\mathbf g}_{\widehat e}\bigl(\mathcal H\xi,\, \eta\bigr) && \text{(defining relation, with } \xi, \eta) \\ &\;=\; \widehat{\mathbf g}_{\widehat e}\bigl(\rho(h)\mathcal H\xi,\, \rho(h)\eta\bigr) && \text{(metric invariance).} \end{aligned}

The first and last expressions are pairings against the same vector ρ(h)η\rho(h)\eta, and they agree for every η\eta. By non-degeneracy of the kinetic metric,

Hρ(h)ξ  =  ρ(h)Hξ.\mathcal H \rho(h)\xi \;=\; \rho(h)\mathcal H\xi.

Conclusion.

The operator H\mathcal H is PP-equivariant: Hρ(h)=ρ(h)H\mathcal H \circ \rho(h) = \rho(h) \circ \mathcal H for every hPh \in P.

This is the structural fact we will use everywhere from here on.
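To make the equivariance concrete, here is a minimal numerical sketch — not part of the derivation, with made-up spring constants (`k0`, `k` are hypothetical): two unit masses on a line at ±1, each anchored by a spring and joined to each other by a bond spring. The reflection x ↦ −x swaps the atoms and negates displacements, and the 2×2 Hessian commutes with it, forcing the eigenvectors into symmetry types.

```python
# Toy check of P-equivariance: two unit masses on a line, anchored springs k0,
# bond spring k.  All constants are made up for illustration.
k0, k = 2.0, 5.0

# Hessian of V(xi1, xi2) = k0/2 (xi1^2 + xi2^2) + k/2 (xi1 - xi2)^2
H = [[k0 + k, -k],
     [-k, k0 + k]]

# rho(h) for the reflection x -> -x: swap the two atoms, negate displacements.
R = [[0.0, -1.0],
     [-1.0, 0.0]]

def matmul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

assert matmul(H, R) == matmul(R, H)  # equivariance: H commutes with rho(h)

# Eigenvectors come in symmetry types: (1, -1) is rho-invariant (the symmetric
# stretch, eigenvalue k0 + 2k); (1, 1) flips sign under rho (eigenvalue k0).
v_sym = [1.0, -1.0]
Hv = [sum(H[i][j] * v_sym[j] for j in range(2)) for i in range(2)]
assert Hv == [(k0 + 2 * k) * c for c in v_sym]
```
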

Rep theory to the rescue

Recall where we are. The spectral theorem gave us the orthogonal eigendecomposition V=λEλ\mathcal V = \bigoplus_\lambda E_\lambda from the self-adjointness of HV\mathcal H|_{\mathcal V}, and we posed two coarse questions about it: how many distinct eigenvalues are there, and what is the dimension of each eigenspace EλE_\lambda? The new structural input we have is that V\mathcal{V} carries a representation of PP, and the operator HV\mathcal H|_{\mathcal V} “plays well” with the PP-representation ρV\rho|_{\mathcal V} (they commute).

The first consequence of this commutation is that each eigenspace of H\mathcal H is itself a representation of PP. Take an eigenvector vv with eigenvalue λ\lambda. For any hPh \in P,

H(ρ(h)v)  =  ρ(h)(Hv)  =  ρ(h)(λv)  =  λρ(h)v,\mathcal H\bigl(\rho(h)\, v\bigr) \;=\; \rho(h)\bigl(\mathcal H\, v\bigr) \;=\; \rho(h)\bigl(\lambda v\bigr) \;=\; \lambda\, \rho(h)\, v,

so ρ(h)v\rho(h)\, v is also an eigenvector of H\mathcal H with the same eigenvalue λ\lambda. The eigenspace EλE_\lambda is therefore preserved by every ρ(h)\rho(h), and so EλE_\lambda inherits the action and becomes a representation of PP in its own right.

To go further, we use a key fact from rep theory.

Building blocks: irreducible representations

The first thing rep theory tells us is that any representation of a finite group splits into “indivisible” pieces. An irreducible representation (or irrep) is one with no PP-invariant subspace other than 00 and the whole space — the smallest possible representation.

Maschke’s theorem. Every finite-dimensional representation of a finite group is completely reducible: it decomposes as a direct sum of irreps.

So our V\mathcal V decomposes:

V    Vα1Vα2\mathcal V \;\cong\; V_{\alpha_1} \oplus V_{\alpha_2} \oplus \cdots

with each VαiV_{\alpha_i} an irrep of PP.

What does this give us, qualitatively? Two things. First, V\mathcal V is built out of simple ingredients — irreducible pieces that cannot be split further while respecting symmetry. Second, for any given finite group PP, there are only finitely many irreps up to equivalence. So the symmetry content of V\mathcal V is a list of pieces drawn from a small finite menu.

The same theorem applies to every eigenspace EλE_\lambda separately, since each is itself a representation. So we now know each eigenspace is a sum of irreps drawn from the same finite menu — its symmetry content is also captured by a list.

Isotypic decomposition

Different copies of the same irrep can appear in V\mathcal V, so it is clarifying to collect them. The isotypic decomposition groups equivalent irreps together:

V    αVαMα.\mathcal V \;\cong\; \bigoplus_\alpha V_\alpha \otimes M_\alpha.

The sum runs over irreps VαV_\alpha of PP that appear in V\mathcal V, and the multiplicity space MαM_\alpha is a plain vector space (no PP-action of its own) whose dimension counts how many copies of VαV_\alpha appear. The notation VαMαV_\alpha \otimes M_\alpha encodes “dimMα\dim M_\alpha copies of VαV_\alpha”: PP acts irreducibly on the VαV_\alpha factor, trivially on MαM_\alpha.

The two factors play different roles. The VαV_\alpha piece carries the symmetry content — how vectors there transform under PP. The multiplicity space MαM_\alpha carries the bookkeeping — how many copies of that symmetry type are present. The distinction matters in a moment.

Schur’s lemma constrains H\mathcal H

Now we use H\mathcal H’s equivariance. Each isotypic component VαMαV_\alpha \otimes M_\alpha is a PP-invariant subspace, so HV\mathcal H|_{\mathcal V} restricts to a PP-equivariant operator on each. The decisive question: what can such an operator look like?

Schur’s lemma. Any PP-equivariant linear map VαMαVαMαV_\alpha \otimes M_\alpha \to V_\alpha \otimes M_\alpha has the form idVαKα\mathrm{id}_{V_\alpha} \otimes K_\alpha for some endomorphism KαK_\alpha of the multiplicity space MαM_\alpha.

What does this tell us, qualitatively? H\mathcal H has no power to distinguish vectors within a single copy of an irrep. The whole copy is forced to behave the same way: H\mathcal H acts on it as the identity (times some scalar, set by where the copy sits in the multiplicity space). The only freedom H\mathcal H has is in how it mixes the different copies of an irrep among themselves. That mixing is encoded by KαK_\alpha on MαM_\alpha.

The reason this is forced: irreps are by definition the smallest PP-invariant subspaces; a PP-equivariant map can rearrange copies of an irrep, but it cannot rearrange anything within a single copy without breaking equivariance.

Apply this to HV\mathcal H|_{\mathcal V}. On each isotypic component VαMαV_\alpha \otimes M_\alpha, H\mathcal H acts as idVαKα\mathrm{id}_{V_\alpha} \otimes K_\alpha for some self-adjoint operator KαK_\alpha on MαM_\alpha (self-adjointness inherited from H\mathcal H). So HV\mathcal H|_{\mathcal V} is a direct sum of self-adjoint operators on the multiplicity spaces, one block per irrep:

HV  =  α(idVαKα).\mathcal H|_{\mathcal V} \;=\; \bigoplus_\alpha \bigl( \mathrm{id}_{V_\alpha} \otimes K_\alpha \bigr).
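A small sketch of this block structure, with sizes and entries made up: take P = Z/2 acting on R^4 by swapping the pairs (x1, x2) and (x3, x4). The isotypic components are the ±1 eigenspaces of the swap — each is (1-dim irrep) ⊗ (2-dim multiplicity space) — and a symmetric operator commuting with the swap is forced into the form [[A, B], [B, A]], acting on the multiplicity spaces as K± = A ± B.

```python
# Schur-style block structure, minimal sketch (all numbers made up):
# P = Z/2 acts on R^4 by swapping coordinates (x1,x2) <-> (x3,x4).
A = [[3.0, 1.0], [1.0, 4.0]]   # arbitrary symmetric 2x2 blocks
B = [[0.5, 2.0], [2.0, 1.0]]

# Commuting with the swap forces the equivariant symmetric H into this shape.
H = [[A[0][0], A[0][1], B[0][0], B[0][1]],
     [A[1][0], A[1][1], B[1][0], B[1][1]],
     [B[0][0], B[0][1], A[0][0], A[0][1]],
     [B[1][0], B[1][1], A[1][0], A[1][1]]]

def swap(v):                      # rho(h): (x1,x2,x3,x4) -> (x3,x4,x1,x2)
    return v[2:] + v[:2]

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# Swap-invariant vectors (the + isotypic component) stay invariant under H,
# and on that 2-dim multiplicity space H acts as K_+ = A + B.
v_plus = [1.0, 2.0, 1.0, 2.0]
w = apply(H, v_plus)
assert swap(w) == w                      # H preserved the component
assert w[:2] == apply([[A[i][j] + B[i][j] for j in range(2)]
                       for i in range(2)], [1.0, 2.0])  # H acts as A + B
```
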

Aligning the two decompositions

We now have two direct-sum decompositions of V\mathcal V:

  1. the eigendecomposition V=λEλ\mathcal V = \bigoplus_\lambda E_\lambda from the spectral theorem; and
  2. the isotypic decomposition V=αVαMα\mathcal V = \bigoplus_\alpha V_\alpha \otimes M_\alpha from rep theory.

There is a clear always-true relationship between them, plus a sharper generic statement on top. Let’s separate the two.

Always true: each eigenspace splits into whole irreps. This is just Maschke, applied one more time. Each eigenspace EλE_\lambda is PP-invariant (we proved this), so it is itself a representation of PP. Apply Maschke directly to EλE_\lambda: it decomposes as a direct sum of irreps,

Eλ  =  αVαMα,λ.E_\lambda \;=\; \bigoplus_\alpha V_\alpha \otimes M_{\alpha, \lambda}.

The output of Maschke is by construction a sum of complete irreps — there is no “partial copy” possibility. So every irrep appearing inside EλE_\lambda appears as a full copy, classified by its type α\alpha, with dimMα,λ\dim M_{\alpha,\lambda} such copies (possibly 00, if irrep α\alpha does not appear at frequency λ\lambda).

Comparing with the global isotypic decomposition V=αVαMα\mathcal V = \bigoplus_\alpha V_\alpha \otimes M_\alpha: collecting the irrep-α\alpha pieces from each eigenspace recovers the global α\alpha-component, with Mα=λMα,λM_\alpha = \bigoplus_\lambda M_{\alpha, \lambda}. From the H\mathcal H side, this is exactly the spectral decomposition of KαK_\alpha on MαM_\alpha — its λ\lambda-eigenspace is Mα,λM_{\alpha,\lambda}.

Generic: each eigenspace is one irrep. In general, an eigenspace EλE_\lambda can contain pieces of multiple irrep types — if KαK_\alpha and KβK_\beta happen to share an eigenvalue λ\lambda, both contribute to EλE_\lambda. And KαK_\alpha itself can have repeated eigenvalues, putting multiple copies of VαV_\alpha into a single EλE_\lambda. Both are accidents: there is no symmetry reason for them, and a generic potential VV avoids them.

In the generic case — no accidents of either kind — each eigenspace EλE_\lambda is exactly one copy of one irrep VαV_\alpha, so the eigendecomposition refines the isotypic decomposition, one eigenspace per copy. Each vibrational frequency has a single “symmetry type,” and its eigenspace is dimVα\dim V_\alpha-dimensional.

What do we learn? Going back to the coarse questions we posed: generically there are αnα\sum_\alpha n_\alpha distinct eigenvalues — one frequency per copy of each irrep — and the eigenspace of each has dimension dimVα\dim V_\alpha, the dimension of the irrep stamped on it.

Two different numbers in this story, easy to conflate:

  1. dimVα\dim V_\alpha — the size of one copy, hence the forced degeneracy of every frequency of type α\alpha; and
  2. nα=dimMαn_\alpha = \dim M_\alpha — the number of copies, hence (generically) the number of distinct frequencies of type α\alpha.

The list of irreps appearing in V\mathcal V, together with their multiplicities {nα}\{n_\alpha\}, is the symmetry-only data of the spectrum: which irreducible blocks appear, at what dimension, with how many independent frequencies each. This list depends on PP and on the representation ρV\rho|_{\mathcal V} — both depending on the molecule and its symmetry, but neither depending on the potential.

The actual frequencies need VV (we still have to diagonalize each KαK_\alpha); the structural skeleton — the irrep at each frequency — does not.

Computing with characters

The decomposition VαVαMα\mathcal V \cong \bigoplus_\alpha V_\alpha \otimes M_\alpha exists abstractly, but to actually pin down the multiplicities {nα}\{n_\alpha\} we need a way to read them off the representation ρV\rho|_{\mathcal V} that doesn’t require us to explicitly diagonalize anything. Characters do this for us.

Given a finite-dimensional representation ρ ⁣:PGL(V)\rho \colon P \to \mathrm{GL}(V), its character is the function

χV ⁣:PR,χV(h):=trρ(h).\chi_V \colon P \to \mathbb{R}, \qquad \chi_V(h) := \mathrm{tr}\,\rho(h).

The map ρχV\rho \mapsto \chi_V seemingly throws almost all of our original representation away. χV\chi_V is not a homomorphism — taking traces destroys multiplication — it is just a real-valued function on PP, an element of the function space RP\mathbb{R}^P. The matrices are gone; only their traces remain.

What we trade up for is enormous simplification. RP\mathbb{R}^P is a finite-dimensional vector space. The complicated category of representations gets replaced by a small linear-algebra problem.

For this trade to be worthwhile, the map had better be lossless — knowing χV\chi_V should be enough to recover VV up to isomorphism. Amazingly, it is. This is the crucial theorem of character theory:

Characters determine representations. Two finite-dimensional representations of PP with the same character are isomorphic.

(There is no single agreed-upon name for this — Serre states it just as a corollary of the orthogonality relations below; it’s the load-bearing consequence of those relations.) Once we have this theorem, studying VV up to isomorphism is exactly the same as studying χV\chi_V as an element of a small vector space. Everything we want to know — what irreps appear, with what multiplicities — is encoded in χV\chi_V, and is to be extracted by linear algebra.

The rest of this subsection is the structural setup that makes the linear algebra work cleanly.

χV\chi_V is a class function. Trace is conjugation-invariant: χV(ghg1)=χV(h)\chi_V(g h g^{-1}) = \chi_V(h). So χV\chi_V doesn’t really take values on individual group elements — it takes values on conjugacy classes. The function space it actually lives in is the space of class functions

C(P)  :=  {f ⁣:PR  :  f(ghg1)=f(h) for all g,h},\mathcal{C}(P) \;:=\; \bigl\{\, f \colon P \to \mathbb{R} \;:\; f(g h g^{-1}) = f(h) \text{ for all } g, h \,\bigr\},

whose dimension equals the number of conjugacy classes of PP. This is much smaller than P|P| — for S3S_3 (P=6|P| = 6), dimC(P)=3\dim \mathcal{C}(P) = 3.

C(P)\mathcal{C}(P) has a natural inner product, and irreducible characters are an orthonormal basis. The inner product is the group average,

χ,χ  :=  1PhPχ(h)χ(h)  =  [h][h]Pχ(h)χ(h),\langle \chi, \chi' \rangle \;:=\; \frac{1}{|P|} \sum_{h \in P} \chi(h)\, \chi'(h) \;=\; \sum_{[h]} \frac{|[h]|}{|P|}\, \chi(h)\, \chi'(h),

(the second sum runs over conjugacy classes, legitimately because χ,χ\chi, \chi' depend only on the class). And:

Orthonormality of irreducible characters. The characters {χα}α\{\chi_\alpha\}_\alpha of the (finitely many) irreducible representations of PP form an orthonormal basis of C(P)\mathcal{C}(P).

This is what does the work behind the scenes: linear independence of the χα\chi_\alpha is what makes the multiplicities nαn_\alpha in χV=αnαχα\chi_V = \sum_\alpha n_\alpha \chi_\alpha uniquely determined — which is what makes “characters determine representations” true. Two consequences worth flagging: the number of irreps of PP equals the number of conjugacy classes of PP (both equal dimC(P)\dim \mathcal{C}(P)), and every class function expands uniquely against the basis {χα}\{\chi_\alpha\}.

Apply that last point to χV\chi_V itself. If VαVαnαV \cong \bigoplus_\alpha V_\alpha^{\oplus n_\alpha}, additivity of trace gives χV=αnαχα\chi_V = \sum_\alpha n_\alpha \chi_\alpha, and orthonormality reads off the coefficients:

nα  =  χV,χα.n_\alpha \;=\; \langle \chi_V,\, \chi_\alpha \rangle.

So the entire categorical question “what is VV, up to isomorphism?” reduces to: compute the trace of ρ(h)\rho(h) on VV for one hh in each conjugacy class — that’s all of χV\chi_V — and take the inner product against each χα\chi_\alpha.
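A minimal worked instance, using a standard example not tied to our molecules: the permutation representation of S3 on R^3 (character = number of fixed points), decomposed against the standard S3 character table. The class and character data below are textbook facts, not drawn from this post.

```python
# Multiplicities from characters, illustrated on S3 (order 6).
# Conjugacy classes: identity (size 1), transpositions (3), 3-cycles (2).
class_sizes = [1, 3, 2]
order = sum(class_sizes)

# Rows of the standard S3 character table: trivial, sign, 2-dim standard.
irreps = {
    "trivial":  [1, 1, 1],
    "sign":     [1, -1, 1],
    "standard": [2, 0, -1],
}

def multiplicity(chi_V, chi_alpha):
    """n_alpha = <chi_V, chi_alpha>, the class-size-weighted average."""
    return sum(s * a * b for s, a, b in zip(class_sizes, chi_V, chi_alpha)) // order

# Character of the permutation representation of S3 on R^3:
# trace = number of fixed points of the permutation.
chi_perm = [3, 1, 0]

mults = {name: multiplicity(chi_perm, chi) for name, chi in irreps.items()}
assert mults == {"trivial": 1, "sign": 0, "standard": 1}
# R^3 = trivial + standard: the diagonal line plus the sum-zero plane.
```
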

Character tables

A character table packages the irreducible characters of a fixed group PP into a single grid: rows indexed by irreps VαV_\alpha, columns indexed by conjugacy classes [h][h], entries χα([h])\chi_\alpha([h]).

By the orthonormality theorem, this table records everything needed for representation arithmetic over PP. To find the multiplicities of irreps in any representation VV, you need only

  1. the character table of PP — a one-time lookup; and
  2. the character χV\chi_V of your representation — one number per conjugacy class.

Then nα=χV,χαn_\alpha = \langle \chi_V, \chi_\alpha \rangle is a finite sum: pair the values χV([h])\chi_V([h]) with the α\alpha-row of the table, weight each term by class size, sum, divide by P|P|.

A few features worth knowing for any character table. The first column (under EE) is always the dimension of the irrep (χα(E)=dimVα\chi_\alpha(E) = \dim V_\alpha). The first row is always the trivial representation (11’s everywhere — every element acts as the identity on R\mathbb{R}). The number of rows equals the number of columns (both equal the number of conjugacy classes of PP). And the squared dimensions of the irreps sum to the order of the group:

α(dimVα)2  =  P,\sum_\alpha (\dim V_\alpha)^2 \;=\; |P|,

a useful sanity check.

Let’s now build the character tables for the two groups we will need: water’s and ammonia’s.

Water

Place water in the xzxz plane with the oxygen at the origin and the zz-axis bisecting the H–H line. Then PP has four elements:

  1. the identity ee;
  2. the rotation rr by π\pi about the zz-axis, swapping the two hydrogens;
  3. the reflection ss through the xzxz-plane (the molecular plane); and
  4. the reflection ss' through the yzyz-plane.
These satisfy r2=s2=s2=er^2 = s^2 = s'^2 = e and r=ssr = s s'; any two generate the third. The group is abelian, and abstractly it’s Z/2×Z/2\mathbb{Z}/2 \times \mathbb{Z}/2 — the Klein four-group.

In an abelian group every conjugacy class is a singleton, so there are 44 classes and hence 44 irreps. The dimensions satisfy dα2=4\sum d_\alpha^2 = 4, which forces dα=1d_\alpha = 1 for all α\alpha — every irrep is 11-dimensional. A 11-dimensional rep of the Klein four-group is just a homomorphism into {±1}\{\pm 1\}, so it amounts to picking signs for the two generators independently. Four sign-choice combinations, four irreps. Label them Vηr,ηsV_{\eta_r, \eta_s} where (ηr,ηs){±}2(\eta_r, \eta_s) \in \{\pm\}^2 records the chosen signs: χ(r)=ηr\chi(r) = \eta_r and χ(s)=ηs\chi(s) = \eta_s.

Filling in the value at s=rss' = rs by multiplicativity (these are 11-dim reps, so χ(rs)=χ(r)χ(s)\chi(rs) = \chi(r)\chi(s)), the character table is:

          e     r     s     s'
V_{++}    1     1     1     1
V_{+-}    1     1    -1    -1
V_{-+}    1    -1     1    -1
V_{--}    1    -1    -1     1

The table is a 4×44 \times 4 matrix of ±1\pm 1’s — orthogonality of rows is visible by inspection: any two distinct rows have equal counts of +1+1 and 1-1 in their pointwise product, summing to 00. Each row’s squared norm is 4=P4 = |P|, normalizing to unit length under the 1P\frac{1}{|P|}\sum inner product.
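The by-inspection checks are also one short script. A sketch verifying orthonormality and the squared-dimension count for the Klein four-group table above (all four classes are singletons, so no class-size weights are needed):

```python
# Sanity checks on the Klein four-group character table.
# Columns: e, r, s, s'; all four classes are singletons, |P| = 4.
table = {
    "V++": [1, 1, 1, 1],
    "V+-": [1, 1, -1, -1],
    "V-+": [1, -1, 1, -1],
    "V--": [1, -1, -1, 1],
}
order = 4

def inner(chi, psi):
    return sum(a * b for a, b in zip(chi, psi)) / order

names = list(table)
for i, a in enumerate(names):
    for b in names[i:]:
        # distinct rows are orthogonal, each row has unit norm
        assert inner(table[a], table[b]) == (1.0 if a == b else 0.0)

# Squared dimensions (first column) sum to |P|.
assert sum(chi[0] ** 2 for chi in table.values()) == order
```
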

Ammonia

Place ammonia with the nitrogen on the zz-axis and the three H atoms forming an equilateral triangle in (a plane parallel to) the xyxy-plane. Then PP has six elements:

  1. the identity ee;
  2. the two rotations rr, r2r^2 by ±2π/3\pm 2\pi/3 about the zz-axis, cycling the hydrogens; and
  3. the three reflections s1,s2,s3s_1, s_2, s_3 through the vertical planes containing the zz-axis and one hydrogen apiece.
These satisfy r3=er^3 = e, si2=es_i^2 = e, and sirsi=r1s_i r s_i = r^{-1} (each reflection inverts the rotation). This is the dihedral group D3D_3 — the symmetry group of an equilateral triangle — equivalently S3S_3, the symmetric group on the three H atoms (each hPh \in P is determined by the permutation it induces on the three H labels).

Conjugacy classes: the rotations rr and r1=r2r^{-1} = r^2 are conjugate to each other (any reflection conjugates one to the other), and the three reflections are all conjugate to each other (the rotations cycle them). So we get three classes: {e}\{e\} of size 11, {r,r2}\{r, r^2\} of size 22, and {s1,s2,s3}\{s_1, s_2, s_3\} of size 33.
Three classes, hence three irreps. The dimensions satisfy dα2=6\sum d_\alpha^2 = 6, with at least one dα=1d_\alpha = 1 (the trivial rep), and the unique solution in positive integers is 1+1+4=12+12+221 + 1 + 4 = 1^2 + 1^2 + 2^2. So S3S_3 has two 11-dimensional irreps and one 22-dimensional irrep: the trivial rep 1\mathbf{1}, the sign rep sgn\mathrm{sgn} (which is +1+1 on rotations and 1-1 on reflections), and the standard rep VstdV_{\mathrm{std}} — the 22-dimensional action on the xyxy-plane.
To fill in the table we need traces. For the 11-dim irreps the entries are just the sign assignments. For VstdV_{\mathrm{std}}: the value at ee is 22 (the dimension); a rotation by 2π/32\pi/3 in R2\mathbb{R}^2 has trace 2cos(2π/3)=12\cos(2\pi/3) = -1; a reflection in R2\mathbb{R}^2 has trace 00 (one +1+1 eigenvalue along the mirror, one 1-1 perpendicular). So:

            {e}    {r, r^2}    {s_1, s_2, s_3}
1           1      1           1
sgn         1      1           -1
V_{std}     2      -1          0

Orthogonality is again checkable by inspection, now with the class-size weights: e.g. χ1,χVstd=16(112+21(1)+310)=0\langle \chi_{\mathbf{1}}, \chi_{V_{\mathrm{std}}} \rangle = \tfrac16(1 \cdot 1 \cdot 2 + 2 \cdot 1 \cdot (-1) + 3 \cdot 1 \cdot 0) = 0, and χVstd,χVstd=16(14+21+30)=1\langle \chi_{V_{\mathrm{std}}}, \chi_{V_{\mathrm{std}}} \rangle = \tfrac16(1 \cdot 4 + 2 \cdot 1 + 3 \cdot 0) = 1.

The standard rep VstdV_{\mathrm{std}} is the only 22-dimensional one in either of our tables — it is the source of the forced doublet degeneracies we’ll find in ammonia’s vibrational spectrum.

Computing χvib\chi_{\mathrm{vib}}

For our problem, V=VV = \mathcal V. We get to χvib\chi_{\mathrm{vib}} by computing on the larger space Te^Q^T_{\widehat e}\widehat Q and subtracting off the rigid-motion piece.

Total character χtotal\chi_{\mathrm{total}} on Te^Q^=iTxiXT_{\widehat e}\widehat Q = \bigoplus_i T_{x_i} X. Each ρ(h)\rho(h) is a block matrix in this decomposition. An atom xix_i that hh moves to a different atom xσh(i)xix_{\sigma_h(i)} \neq x_i contributes a zero diagonal block (its TxiXT_{x_i}X entries land in Txσh(i)XT_{x_{\sigma_h(i)}}X). An atom that hh fixes contributes tr(hTxiX)\mathrm{tr}\bigl(h|_{T_{x_i}X}\bigr). So only fixed atoms contribute:

χtotal(h)  =  i:σh(i)=itr(hTxiX).\chi_{\mathrm{total}}(h) \;=\; \sum_{i \,:\, \sigma_h(i) = i} \mathrm{tr}\bigl(h|_{T_{x_i}X}\bigr).

Rigid-motion character χN\chi_N on Ng/pN \cong \mathfrak{g}/\mathfrak{p}, with PP acting by the descended adjoint action. For X=R3X = \mathbb R^3 and a non-collinear molecule, p=0\mathfrak p = 0 and NgR3so(3)N \cong \mathfrak g \cong \mathbb R^3 \oplus \mathfrak{so}(3): three translations plus three rotations. PP acts on the translations by its inclusion PO(3)P \hookrightarrow O(3) (vector representation), and on the rotations by that same inclusion twisted by the determinant (axial-vector representation). So χN\chi_N splits as

χN(h)  =  χv(h)+det(h)χv(h)  =  (1+det(h))χv(h),\chi_N(h) \;=\; \chi_v(h) \,+\, \det(h)\, \chi_v(h) \;=\; \bigl(1 + \det(h)\bigr)\, \chi_v(h),

where χv(h)\chi_v(h) is the trace of hh in the vector representation. This is 2χv(h)2\,\chi_v(h) on proper rotations and 00 on improper ones.

Then

χvib  =  χtotalχN,\chi_{\mathrm{vib}} \;=\; \chi_{\mathrm{total}} \,-\, \chi_N,

and nα=χvib,χαn_\alpha = \langle \chi_{\mathrm{vib}}, \chi_\alpha \rangle.

Water

Coordinates as before — zz-axis along the rotation axis rr, molecule in the xzxz-plane, so ss is the molecular mirror and ss' is perpendicular. For each class: ee fixes all 33 atoms, each contributing trace 33; rr fixes only the oxygen, with rotation trace 1+2cos(π)=11 + 2\cos(\pi) = -1; ss fixes all 33 atoms, each contributing the reflection trace 11; and ss' fixes only the oxygen, contributing trace 11.
Tabulating:

            e     r     s     s'
χ_total     9    -1     3     1
χ_N         6    -2     0     0
χ_vib       3     1     3     1

Inner-producting with each row of the character table (every class has size 11, P=4|P| = 4):

nV++=14(3+1+3+1)=2,nV+=14(3+131)=0,nV+=14(31+31)=1,nV=14(313+1)=0.\begin{aligned} n_{V_{++}} &= \tfrac14\bigl(3 + 1 + 3 + 1\bigr) = 2,\\ n_{V_{+-}} &= \tfrac14\bigl(3 + 1 - 3 - 1\bigr) = 0,\\ n_{V_{-+}} &= \tfrac14\bigl(3 - 1 + 3 - 1\bigr) = 1,\\ n_{V_{--}} &= \tfrac14\bigl(3 - 1 - 3 + 1\bigr) = 0. \end{aligned}

So

Vwater    2V++V+.\mathcal V_{\mathrm{water}} \;\cong\; 2\, V_{++} \oplus V_{-+}.

Total dimension 21+11=32 \cdot 1 + 1 \cdot 1 = 3, matching 3n6=33n - 6 = 3. ✓

What this tells us: water has 33 vibrational modes, all stamped with 11-dimensional irreps, so no forced degeneracies. The two V++V_{++} modes are invariant under all of PP (these are the symmetric stretch and the bend); the V+V_{-+} mode changes sign under both rr and ss' (the antisymmetric stretch). The two V++V_{++} modes form a 22-dimensional multiplicity space M++M_{++}, on which K++K_{++} is a 2×22 \times 2 self-adjoint operator — its two eigenvalues are the actual stretch and bend frequencies, set by the potential, with no symmetry obstruction to being whatever they are.
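The whole water computation is small enough to script end to end. A sketch reproducing the numbers above (characters and table entries copied from the text; all four classes have size 11):

```python
# Water: classes e, r, s, s'; all class sizes 1, |P| = 4.
chi_total = [9, -1, 3, 1]   # per class: sum over fixed atoms of trace on T_x X
chi_N     = [6, -2, 0, 0]   # rigid motions: (1 + det h) * chi_v(h)
chi_vib   = [t - n for t, n in zip(chi_total, chi_N)]
assert chi_vib == [3, 1, 3, 1]

# Klein four-group character table, rows as in the text.
table = {
    "V++": [1, 1, 1, 1],
    "V+-": [1, 1, -1, -1],
    "V-+": [1, -1, 1, -1],
    "V--": [1, -1, -1, 1],
}

# n_alpha = <chi_vib, chi_alpha>  (class sizes all 1 here)
mults = {name: sum(v * c for v, c in zip(chi_vib, chi)) // 4
         for name, chi in table.items()}
assert mults == {"V++": 2, "V+-": 0, "V-+": 1, "V--": 0}

# Dimension check: 2*1 + 1*1 = 3 = 3n - 6 for n = 3 atoms.
assert sum(m * chi[0] for m, chi in zip(mults.values(), table.values())) == 3
```
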

Ammonia

Three conjugacy classes — {e},{r,r2},{s1,s2,s3}\{e\}, \{r, r^2\}, \{s_1, s_2, s_3\}.

            {e}    {r, r^2}    {s_1, s_2, s_3}
χ_total     12     0           2
χ_N         6      0           0
χ_vib       6      0           2

Inner-producting (class sizes 1,2,31, 2, 3; P=6|P| = 6):

n1=16(161  +  201  +  321)=2,nsgn=16(161  +  201  +  32(1))=0,nVstd=16(162  +  20(1)  +  320)=2.\begin{aligned} n_{\mathbf{1}} &= \tfrac16\bigl(1 \cdot 6 \cdot 1 \;+\; 2 \cdot 0 \cdot 1 \;+\; 3 \cdot 2 \cdot 1\bigr) = 2,\\ n_{\mathrm{sgn}} &= \tfrac16\bigl(1 \cdot 6 \cdot 1 \;+\; 2 \cdot 0 \cdot 1 \;+\; 3 \cdot 2 \cdot (-1)\bigr) = 0,\\ n_{V_{\mathrm{std}}} &= \tfrac16\bigl(1 \cdot 6 \cdot 2 \;+\; 2 \cdot 0 \cdot (-1) \;+\; 3 \cdot 2 \cdot 0\bigr) = 2. \end{aligned}

So

Vammonia    212Vstd.\mathcal V_{\mathrm{ammonia}} \;\cong\; 2\, \mathbf{1} \oplus 2\, V_{\mathrm{std}}.

Total dimension 21+22=62 \cdot 1 + 2 \cdot 2 = 6, matching 3n6=63n - 6 = 6. ✓

Now we see the forced doublets explicitly: ammonia has 66 vibrational modes, organized as 22 singlets (copies of the trivial rep 1\mathbf{1}) plus 22 doublets (copies of VstdV_{\mathrm{std}}). The 1\mathbf{1}-block K1K_{\mathbf{1}} is 2×22 \times 2 on M1M_{\mathbf{1}}, contributing two singlet frequencies; the std\mathrm{std}-block KstdK_{\mathrm{std}} is 2×22 \times 2 on MstdM_{\mathrm{std}}, and each of its eigenvalues comes with multiplicity 22 in the full spectrum, because VstdV_{\mathrm{std}} is 22-dimensional. So generically ammonia’s spectrum has 44 distinct frequencies: two nondegenerate (the two trivial-rep copies) plus two doubly degenerate (the two standard-rep copies). Physically these are commonly described as a symmetric stretch and an “umbrella” inversion (the singlets) plus an asymmetric stretch pair and an asymmetric bend pair (the doublets).
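The same sketch for ammonia, now exercising the class-size weights in the inner product (all character values copied from the text):

```python
# Ammonia: classes {e}, {r, r^2}, {s1, s2, s3}; sizes 1, 2, 3; |P| = 6.
class_sizes = [1, 2, 3]
chi_total = [12, 0, 2]
chi_N     = [6, 0, 0]
chi_vib   = [t - n for t, n in zip(chi_total, chi_N)]
assert chi_vib == [6, 0, 2]

# S3 character table, rows as in the text: trivial, sign, standard.
table = {
    "triv": [1, 1, 1],
    "sgn":  [1, 1, -1],
    "std":  [2, -1, 0],
}

# n_alpha = <chi_vib, chi_alpha>, weighted by class size.
mults = {name: sum(s * v * c for s, v, c in zip(class_sizes, chi_vib, chi)) // 6
         for name, chi in table.items()}
assert mults == {"triv": 2, "sgn": 0, "std": 2}

# Dimension check: 2*1 + 2*2 = 6 = 3n - 6 for n = 4 atoms.
assert sum(m * chi[0] for m, chi in zip(mults.values(), table.values())) == 6
```
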

What this computes — and what it doesn’t

What the character procedure gives us is the list {nα}\{n_\alpha\}, which by Schur tells us exactly how the spectrum decomposes by irrep — how many singlets, how many doublets, how many triplets, with all forced degeneracies accounted for. We learn the structural skeleton of the spectrum from PP and the molecule alone, before any eigenvalue problem is solved.

What we don’t get is the numerical values of the frequencies. Each KαK_\alpha is a small self-adjoint operator on the multiplicity space MαM_\alpha, and to compute its eigenvalues we need VV. Representation theory gives the structure; the dynamics gives the numbers.

This is the punchline. The character-table procedure that chemists use is doing exactly the symmetry-only part of the analysis — neither more nor less than what character orthogonality lets representation theory see. The remaining numerical eigenvalue problem inside each α\alpha-block is a separate, smaller calculation, doable molecule-by-molecule once a potential is chosen.

Epilogue: What the generality bought us

Working with an abstract (X,g)(X, g) throughout, rather than narrowing to R3\mathbb R^3 from the start, was free — the derivation never used anything about XX that wasn’t packaged into “Riemannian manifold with isometry group.” As a payoff, we get a few structural facts that aren’t obvious from the chemistry-textbook treatment.

Curvature is invisible to the harmonic machinery. The Riemann curvature of g^\widehat{\mathbf g} made a brief appearance in the linearization (in the R(S,T)TR(S, T)\, T term) and then vanished, because T=0T = 0 at the equilibrium. The whole construction — the operator H\mathcal H, its self-adjointness, the rep-theoretic decomposition of its spectrum — went through without ever using the curvature of (X,g)(X, g). Curvature first enters at cubic order, in the anharmonic corrections.

(The numerical eigenvalues of H\mathcal H do still depend on the ambient space, since the potential VV is typically built from pairwise geodesic distances, which differ between R3\mathbb R^3, S3S^3, H3\mathbb H^3, and so on. What’s curvature-independent is the form of the harmonic theory — same construction, same self-adjointness, same Schur structure — not the values of the frequencies.)

The vibrational dimension is ndimXdimGn \dim X - \dim G. For a non-collinear molecule the dimension of the vibrational subspace V\mathcal V is the total degrees of freedom minus the dimension of the ambient isometry group:

dimV  =  ndimX    dimG.\dim \mathcal V \;=\; n \dim X \;-\; \dim G.

For a molecule of nn atoms in R3\mathbb R^3 (dimG=6\dim G = 6) this is the chemist’s familiar 3n63n - 6. On S3S^3 and H3\mathbb H^3 it is also 3n63n - 6 (dimG=6\dim G = 6 in both cases). On Nil\mathrm{Nil} (dimG=4\dim G = 4) it is 3n43n - 4, and on Sol\mathrm{Sol} (dimG=3\dim G = 3) it is 3n33n - 3 — different ambient geometries support different numbers of “rigid motions” of a generic molecule, and the vibrational dimension absorbs the difference.
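The count is easy to tabulate across the geometries mentioned. A trivial sketch, with isometry-group dimensions as given in the text:

```python
# dim V = n * dim X - dim G for a generic (spread-out) molecule of n atoms.
# Isometry-group dimensions for each ambient 3-dimensional geometry, per the text.
geometries = {"R^3": 6, "S^3": 6, "H^3": 6, "Nil": 4, "Sol": 3}

def vib_dim(n_atoms, dim_X, dim_G):
    return n_atoms * dim_X - dim_G

assert vib_dim(3, 3, geometries["R^3"]) == 3   # water: 3n - 6
assert vib_dim(4, 3, geometries["R^3"]) == 6   # ammonia: 3n - 6
assert vib_dim(4, 3, geometries["Nil"]) == 8   # same molecule in Nil: 3n - 4
```
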

Mode organization is rep theory of PP on V\mathcal{V}. The way the spectrum is organized into blocks — which irreps appear, with what multiplicities, with what forced degeneracies — is determined entirely by the representation ρV\rho|_{\mathcal V} of the symmetry group PP on the vibrational subspace. This depends on the molecule, but not on the potential, and not on whether the ambient space is curved or flat (but it does depend on the size of the isometry group). Same character tables, same forced degeneracies, regardless of (X,g)(X, g).

← All posts