Notes to myself from early college about what divergence means.

Steve Trettel


The divergence operator in vector calculus is often defined by the following formula in introductory multivariate calculus courses: $\nabla\cdot \vec{F}=\frac{\partial F_1}{\partial x}+\frac{\partial F_2}{\partial y}+\frac{\partial F_3}{\partial z}$, but little is done to explain why we would care about a quantity defined in this way, or even how the sum of a bunch of partial derivatives somehow measures the way in which a vector field spreads out.My goal here is first off to give some motivation for why we would want to define a quantity like divergence, and how to go about it. Later then, I will show the definition we came up for divergence is actually equal to the one above, although they at first look very different.

What do we want divergence to be?

The divergence is a scalar field that we associate with a vector field, which aims to give us more information about the vector field itself. Much like the gradient of a function provides us with the direction and magnitude of the greatest increase at each point, the divergence provides us with a measure of how much the vector field is “spreading out” at each point.

Intuitively, we want the divergence to give us the following assessments of vector fields: if the vectors here all point away from the black dot, causing a positive outward flow, we want to say that the field has positive divergence there.

If the vectors all point inwards, so there is a “negative outward flow,” want to say that the field has negative divergence there.

And if the vectors seem to be flowing right through the black dot, neither spreading out or converging on it, we’d like to say that the divergence is zero.

How do we formalize this?

Note that if we want to define divergence so that it agrees with these above intuitions, the definition can’t depend on the value of the vector field at the point we are interested in. In the first two examples (where we want to assign nonzero divergence), the vector field is actually zero at the point of interest, whereas in the last example (where we want to assign zero divergence), the vector field is nonzero. Instead, we want to know what the vector field is doing “around” the point of interest, and more specifically, we want to know how much the vector field is “flowing towards or away” from that point.

Surface Integrals

One way we have at our disposal to measure the “amount of flow” is the surface integral: given a surface $S$ in a vector field $F$, the integral $$\int_S \vec{F}\cdot d\vec{S}$$ Gives the amount of flow (known as flux) through the surface. So, if we want to know how much “stuff” is flowing towards/away from a point $p$, we could encase $p$ in a surface and look at the flux through that surface. Although the actual visual should be a 3-dimensional vector field with a surface around $p$ like on the right below, we will instead keep opting for looking at 2-dimensional pictures (you can think of them as cross sections of these 3d pictures if you’d like) so that it’s easier to see whats going on.

The amount of the vector field “moving outwards through the surface” is just the surface integral about this surface. There is one slight technicality here: as with all surface integrals, we need to choose an orientation for our surface, which pretty much amounts to a choice of which side of the surface is “outside” and which is “inside”. Since we are forming closed surfaces around a point, there’s a pretty obvious option here, we are going to count the region with the point as the inside and the rest the outside. Under this choice, it turns out that our surface integrals are positive when the vector field flows from the point to the outside, and negative when it flows from the outside towards the point, just like we want divergence to do. Because the surface integral around a point seems to do exactly what we want divergence to do, we would like to make some sort of definition along the lines

$$\mathrm{div}{\vec{F}}\stackrel{?}{=}\int_S \vec{F}\cdot d\vec{S}$$

Where our surface$S$ surrounds the point of interest,$p$. Trying to do this directly leads to some problems however, the most notable being the choice of surface $S$. Let’s say we are interested in a point where the vector field is flowing outwards in all directions from it, and so we expect a positive divergence at that point.

However, zooming out a bit, we notice the vector field is doing some funny things a bit farther away; the flow does not continue outwards forever, but flows in a more complex pattern.


Now, if we want to measure the outflow from $p$ using a surface integral, we can see that the amount of flow we register will definitely depend on the shape of the surface we choose. If the surface is rather small about $p$ we will get a positive value, as expected, but if our surface is a bit larger and includes inside one of the “sinks” as well as the source at $p$, the net flow through the surface will be very different as at some locations the flow is outwards from $p$, and at other locations it is inwards towards the sink.
The two pictures below illustrate what can happen with such different sized loops.

Now obviously the first one of these cases is a better measure of what’s happening at $p$, and the reason for this is that the chosen surface is small enough that it doesn’t include any of the “weird” points of the vector field that are far away. So, it seems to get a better idea of what is going on at p, its useful to choose a smaller surface.

How small? We can’t just choose a size (like, say, 1mm in radius) and expect it to work for all vector fields, because for any given sized surface it’s possible to construct a vector field that has “weird points” inside of it, much like in the second example above. Instead, to follow the intuition that smaller surfaces lead to better estimates, what we need to do is shrink the size of our surface all the way down to zero. That is, we want to look at the limit as the surface itself converges on our point of interest, $p$.

To express this more precisely, let $V$ be a small 3-d region about our point $p$, and $\partial V$ be the surface of that region. To say that our surface shrinks and converges to $p$, we can just as well say that the region $V$ itself shrinks to a point (carrying the surface with it).We will express this (still admittedly imprecise) idea in symbols as $V\to {p}$.Recalling that given a region $V$ we can write the surface integral about its boundary as $\int_{\partial V} \vec{F}\cdot d\vec{S}$, as we let our region shrink down to the point of interest, we can write the limiting value of the surface integral as

$$\mathrm{div}\vec{F}(p)\stackrel{?}{=}\lim_{V\to{p}}\int_{\partial V} \vec{F}\cdot d\vec{S}$$


However, in trying this we run into another problem: this limit is always going to be zero! To see why, let’s think what happens to the flux of a vector field through smaller and smaller surfaces. In the pictures below, we are looking at a sequence of surfaces converging to a point which should intuitively have positive divergence.

We can picture the surface integral as “adding up” all of the contributions of all the vectors which lie on the surface, but as the surface gets smaller, less and less vectors lie on the surface, and so the value will get smaller as the surface shrinks. As the size of the surface goes to zero then, the number of vectors lying on it will also head towards zero. (“Number” here is being used very loosely, to mimic the fact that in our drawings there are less vectors on smaller curves; but the mathematics carries through and shows that our intuition that the value is zero is in fact correct).

One way around this is to consider the “flux per unit volume,”
instead of just the straight-up flux, and effectively normalize our integral by the size of the volume enclosed by the surface. This way, even though bigger surfaces have more flux through them, they also enclose more volume, and so the two effects would cancel each other out. This will leave us with a normalizing factor out in front of the integral, where we write $|V|$ for the volume of $V$.

$$\frac{1}{|V|}\int_{\partial V}\vec{F}\dot d\vec{S}$$

Now that our integral no longer tends to zero as the surface shrinks, we can again take the limit as the chosen region converges to $p$. As this encapsulates what we want a quantity like divergence to measure: the flux per unit volume around a point, we can define this as the divergence of $F$ at $p$.

$$\mathrm{div} \vec{F}(p)=\lim_{V\to {p}}\frac{1}{|V|}\int_{\partial V}\vec{F}\cdot d\vec{S}$$

To quickly recap what we’ve done so far, starting with a point in a vector field, we have found a way to attach a number, called the divergence, which measures how much the vector field is expanding or contracting at that point. The way to arrive at such a quantity is to look at the flow through small surfaces around the point, and then take the limit as these surfaces shrink away. This is all good and dandy, and sure provides a better intuitive picture than the standard definition as to why divergence measures flow, but it has a serious drawback: it looks god awful to try and compute. First off, we are taking the limit over ALL regions converging on p, and secondly for each one we have to compute a surface integral! This is a situation we run into a lot with coordinate free definitions like this, they get to the heart of the matter intuitively, but without a handy coordinate system they seem hopeless to use.

How do we compute this?

To make this definition useful (and to bring us full circle in explaining why the standard formula for divergence looks the way it does) we will evaluate the divergence of a vector field in Cartesian (xyz) coordinates. The main advantage that choosing a coordinate system gives us is that instead of considering all possible regions about p, we can hand-pick “nice” regions which have simple descriptions in our coordinate system. In Cartesian coordinates, cubes are particularly easy to work with, so we will stick to those.

We will use $C$ as our symbol for the solid cube itself, $\partial C$ for the surface of the cube, and we will replace our shorthand expression $\vec{F}\cdot d\vec{S}$ for the surface integral with $\vec{F}\cdot \hat{n} dS$, where $\hat{n}$ is the outward-facing unit normal vector to our cube. Thus, for a given cubical region, the surface integral we are interested in will be

$$\frac{1}{|C|}\int_{\partial C}]\vec{F}\cdot\hat{n}dS$$

The first nice property of cubes (and one of the reasons we are choosing them as “special” region to investigate) is the simple formula for their volume. If our cube has side length h, we immediately know that $|C|=h^3$, which lets us write the above integral as

$$\frac{1}{h^3}\int_{\partial C}]\vec{F}\cdot\hat{n}dS$$

The main property that makes cubes a good choice of region to work with in Cartesian coordinates is that you can choose them so their faces are lined up with the coordinate planes.
This gives us some very simple descriptions of the normal vectors to our surface:

The front and back faces have forwards and backwards pointing normal vectors, the left and right faces have left and right facing normal vectors, and the top and bottom faces have up and down pointing normal vectors. As front/back, left/right, and up/down are the standard coordinate directions in Cartesian coordinates (usually labeled x, y, and z respectively), we can see that the normal vectors to our cube line up nicely with the coordinate vectors.

In fact, this gives us a nice way to break up our cube: since a cube can be looked at as six squares arranged together with their edges touching, to evaluate the surface area over a cube we can just as easily evaluate the surface integral over each face and add the results together. Pictorially, this amounts to us breaking the cube apart, as below.

We can pair these six faces into three sets, the front/back, the left/right, and the up/down. We will refer to these as the $X$-Faces, the $Y$-Faces, and the $Z$-Faces respectively, and denote them in our integrals by $X$’s, $Y$’s and $Z$’s.

Performing this decomposition of the cube mathematically, we can break the surface integral into three pieces:

$$\int_{\partial C}]\vec{F}\cdot\hat{n}dS$$ $$=\int_{X’s}\vec{F}\cdot\hat{n}dS+\int_{Y’s}\vec{F}\cdot\hat{n}dS+\int_{Z’s}\vec{F}\cdot\hat{n}dS$$

To keep things simple, we will only worry about the integral over the $X$-Faces for the time being (the other two will work out identically). So for now, our task is to evaluate


This integral is taken over both the front and back face of the cube, so like we did with the entire cube, we will break it into two integrals, one for each face:


Because the normal vector to the front face is in the direction of the positive $x$-axis, the dot product just tells us the component of $\vec{F}$ which is in the direction of the positive x-axis. If we write our vector field itself in Cartesian coordinates as


This component is just $F_1$ itself. On the back face, the normal vector is now pointing in the direction of the negative $x$-axis (remember, the normal vectors were chosen to be the outward pointing normals to the cube as a whole, so if the front face points out/forward, the back face normal points out/backward). Thus, the dot product of $F$ with this outwards facing normal is just the negative of the first component.

$$\int_{X_{\mathrm{Front}}}\vec{F}\cdot\hat{n}dS=\int_{X_\mathrm{Front}}F_1dS$$ $$\int_{X_{\mathrm{Back}}}\vec{F}\cdot\hat{n}dS=-\int_{X_\mathrm{Back}}F_1dS$$

Now, all we have left to do is carry out these two (non-vector!) integrals.
To make things easy, let’s say the point p that our cube encapsulates is at the origin, and that our cube surrounds it evenly on all sides.
This means the front face of the cube is h/2 out in the direction of positive $x$, and the back face is h/2 units in the other direction. Because the faces themselves are parallel to the $yz$ plane, we can write these surface integrals as simple double-integrals over these squares.


So far, we have been working with this integral over the cubical surface as though the cube we are interested in has some fixed size; but we are after all trying to calculate the divergence of the vector field at p, which we had defined as the limit as the cube shrinks to zero size. This is a GOOD thing for us, because we can pretend the cube we are integrating over is ridiculously small, and make some approximations. Namely, we are going to assume that the cube is so small that our function F1 takes basically a constant value over the entire front (respectively back) face. This tells us that the scary double integral above is basically the (constant) value of the function over the square, times the area of the square.

$$\int_{-h/2}^{h/2}\int_{-h/2}^{h/2}F_1(h/2,y,z)dydz\simeq h^2F_1(h/2,0,0)$$

Doing this same approximation trick for the back-face (remembering the negative sign) and putting it all together, we get

$$\int_{X’s}\vec{F}\cdot\hat{n}dS\simeq h^2F_1(h/2,0,0)-h^2F_1(-h/2,0,0)$$

Or in words, the value of the surface integral over those two parallel faces is more or less a scaled version of the difference in the value of $F_1$ over those faces, since they are really small.

As we make the cube smaller and smaller (let h tend towards zero), this value here will also tend to zero (because each term is multiplied by h-squared).
This is in fact exactly the reason we introduced the normalizing factor of dividing by the region’s volume earlier; which we have not looked at yet. Bringing it back into the picture,

$$\frac{1}{h^3}\left(\int_{X’s}\vec{F}\cdot\hat{n}dS\right)$$ $$\simeq \frac{1}{h^3}\left(h^2F_1(h/2,0,0)-h^2F_1(-h/2,0,0)\right)$$

Which simple algebra lets us simplify to


If we let h tend to zero in this formula however (as the volume shrinks to a point), we have exactly the definition of the derivative on our hands. That is,

$$\lim_{h\to 0}\frac{F_1(h/2,0,0)-F_1(-h/2,0,0)}{h}=\frac{\partial F_1}{\partial x}(0,0,0)$$

And so, for very small cubes, the integral over the front and back faces tends to the x-derivative of the x-component of F. By identical reasoning, the integral over the left and right faces tends to the y-derivative of the y-component, and the top and bottom integrals result in the z-derivative of the z-component.

This tells us that as our cube shrinks to zero size,

$$\frac{1}{h^3}\int_{\partial C}\vec{F}\cdot\hat{n}dS$$ $$=\frac{\partial F_1}{\partial x}(0,0,0)+\frac{\partial F_2}{\partial y}(0,0,0)+\frac{\partial F_3}{\partial z}(0,0,0)$$

But, the shrinking of our cube to zero size is exactly the definition of divergence!
This lets us write (removing our simplifying assumption that $p=0$)

$$\mathrm{div}F(p)=\frac{\partial F_1}{\partial x}(p)+\frac{\partial F_2}{\partial y}(p)+\frac{\partial F_3}{\partial z}(p)$$

Which is exactly the usual formula taught in class!