I have to start by saying I don't know anything about the derivative method shown in this excerpt. I tried some calculations, but they don't seem to give the same result as the standard definition, so I'm guessing he is calculating something different from what we call "moments" in modern physics. Anyway, by way of explanation:
The word "moment" is used for several different purposes in physics, so it can be kind of a confusing term because you have to know what is meant by the context. But all the various meanings of moment stem from its definition in math.
In math, a moment is a way of characterizing some distribution. It could be a probability distribution, a mass distribution, a charge distribution, or anything similar; all you need is some function $f(x)$ which defines the density of the quantity (mass/charge/probability) in question. In other words, $\int_a^b f(x)\;\mathrm{d}x$ is the amount of "stuff" between $a$ and $b$.
The $n$th mathematical moment of a distribution with density function $f(x)$ around a point $c$ is computed by a very simple formula:
$$I^{(n)}(c) = \int (x - c)^n f(x)\ \mathrm{d}x$$
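If it helps to see the formula in action, here is a minimal numerical sketch (my own illustration, not part of the excerpt), using SciPy's `quad` and a uniform density on $[0,1]$ as the example:

```python
# n-th moment of a density f about the point c: I^(n)(c) = integral of (x - c)^n f(x) dx
from scipy.integrate import quad

def moment(f, n, c, a, b):
    """Numerically integrate (x - c)^n f(x) over [a, b]."""
    value, _ = quad(lambda x: (x - c) ** n * f(x), a, b)
    return value

# Example density: uniform on [0, 1], so I^(n)(0) = 1/(n + 1).
uniform = lambda x: 1.0
print(moment(uniform, 0, 0.0, 0.0, 1.0))  # 1.0
print(moment(uniform, 1, 0.0, 0.0, 1.0))  # 0.5
print(moment(uniform, 2, 0.0, 0.0, 1.0))  # 0.3333...
```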
This generalizes to higher-dimensional spaces, but then the moment becomes an $n$-index tensor:
$$I_{i_1\cdots i_n}^{(n)}(\mathbf{c}) = \idotsint \prod_{j=1}^{n}(r_{i_j} - c_{i_j})\, f(\mathbf{r})\ \mathrm{d}^d\mathbf{r}$$
In physical applications, the definitions used are a little different, but in general an $n$th moment involves the integral of some $n$th power of position multiplied by the distribution function $f(\mathbf{r})$. (The aforementioned differences show up in how you use the various components of $\mathbf{r}$ to compute that $n$th power.)
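To make the tensor version concrete, here is a small sketch (again my own example, not from the excerpt): the second moment ($n = 2$) of a uniform density on the unit square about the origin, which comes out as a $2\times 2$ matrix $I_{ij}$:

```python
# Second-moment tensor I_ij(c) = integral of (r_i - c_i)(r_j - c_j) f(r) d^2 r
# for a uniform density f = 1 on [0,1] x [0,1], taken about c = (0, 0).
import numpy as np
from scipy.integrate import dblquad

def second_moment(i, j, c=(0.0, 0.0)):
    # dblquad expects func(y, x); x runs over [0, 1], y over [0, 1]
    func = lambda y, x: ((x, y)[i] - c[i]) * ((x, y)[j] - c[j])
    value, _ = dblquad(func, 0.0, 1.0, lambda x: 0.0, lambda x: 1.0)
    return value

I2 = np.array([[second_moment(i, j) for j in range(2)] for i in range(2)])
print(I2)  # [[1/3, 1/4], [1/4, 1/3]]
```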
Many typical measures used to describe physical systems or mathematical distributions can be represented as moments; a few of these are checked numerically in the sketch after the lists below. For example:
If $f(x)$ is a 1D probability distribution:
- The normalization constant (which is 1) is $I^{(0)}$
- The mean value is $\langle x\rangle = I^{(1)}(0)$
- The variance is $I^{(2)}(\langle x\rangle)$
If $f(\mathbf{r})$ is a mass distribution:
- The total mass is $I^{(0)}$
- The center of mass is $I^{(1)}(0)/I^{(0)}$, i.e. the first moment divided by the total mass (from which comes the term "weighted average")
- The moment of inertia around any point $\mathbf{c}$ is a second moment
If $f(\mathbf{r})$ is a charge distribution:
- The total charge, or monopole moment, is $I^{(0)}$
- The dipole moment is $I^{(1)}(0)$
- The quadrupole moment is a second moment
- and so on
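Here is a quick numerical check of a few of the entries above; the Gaussian density is just my example choice:

```python
# Check normalization, mean and variance of a probability density, and the
# centre of mass of a mass density, as moments I^(n)(c) = integral of (x - c)^n f(x) dx.
import numpy as np
from scipy.integrate import quad

def moment(f, n, c, a=-20.0, b=20.0):
    value, _ = quad(lambda x: (x - c) ** n * f(x), a, b)
    return value

# Gaussian with mean 2 and standard deviation 0.5 (finite limits are plenty here).
f = lambda x: np.exp(-((x - 2.0) ** 2) / (2 * 0.5 ** 2)) / (0.5 * np.sqrt(2 * np.pi))

total = moment(f, 0, 0.0)          # normalization (or total mass)   -> 1.0
mean = moment(f, 1, 0.0) / total   # mean (or centre of mass)        -> 2.0
var = moment(f, 2, mean) / total   # variance about the mean         -> 0.25
print(total, mean, var)
```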
For charge distributions, the quantities $I^{(n)}(0),\ n=0,1,2,\ldots$ (suitably modified; for $n \ge 2$ the conventional definitions use traceless combinations) are called the electric multipole moments $Q^{(n)}$. These quantities are of particular interest because you can expand the electric potential of an arbitrary charge distribution in terms involving successive moments:
$$\Phi(\mathbf{r}) = \sum_{n=0}^{\infty} \sum_{\{i_j\}}\frac{C_n Q_{i_1\cdots i_n}^{(n)}x_{i_1}\cdots x_{i_n}}{r^{2n+1}} \sim \sum_n \frac{C_n Q^{(n)}}{r^{n+1}}$$
In many situations, $r$ is relatively large so it's sufficient to use only the first nonzero term of this series in a calculation. In a sense, higher moments incorporate more detailed features of the charge distribution, which "blur out" and thus have little effect at large distances.
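A small sketch of that point (entirely my own example, in units where $1/4\pi\epsilon_0 = 1$): for a physical dipole made of two point charges, the single dipole term reproduces the exact on-axis potential better and better as $r$ grows.

```python
# Physical dipole: +q at z = +d/2 and -q at z = -d/2.  The monopole moment is
# zero, so the first nonzero term of the multipole series is the dipole term.
q, d = 1.0, 0.1
p = q * d                       # dipole moment

def phi_exact(z):
    # exact on-axis potential of the two point charges (units with 1/(4*pi*eps0) = 1)
    return q / (z - d / 2) - q / (z + d / 2)

def phi_dipole(z):
    # leading multipole term on the axis
    return p / z ** 2

for z in (1.0, 10.0, 100.0):
    exact, approx = phi_exact(z), phi_dipole(z)
    print(z, exact, approx, abs(exact - approx) / exact)
# The relative error falls off like (d/z)^2, so far away the dipole term is enough.
```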
For the example you're looking at here, it sounds like Pearson is calculating the moments of area in the $x$ dimension around the origin - in other words, the density function $f(x)$ is the function that would trace along the tops of the rectangles.
$$f(x) = a\binom{n}{k}p^{n-k}q^k,\quad \tfrac{(2k + 1)c}{2} \le x < \tfrac{(2k + 3)c}{2},\quad k = 0, 1, \ldots, n$$
(you could think of this as calculating the moments of mass of a cardboard cutout of the binomial distribution, assuming the cardboard has uniform density).
You can plug this into the integral definition of a moment, although the resulting expression is rather complicated, and as I said, it doesn't seem to give the same results as the derivative method Pearson is using. So I believe he's calculating something different.
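For completeness, here is what the "integral definition" route looks like if you carry it out directly (my own sketch, not Pearson's derivative method): each rectangle of the step function can be integrated exactly. In the code, `N` stands for the $n$ in the binomial coefficient, to keep it separate from the moment order `m`.

```python
# m-th area moment about the origin of the step function defined above:
# on [(2k+1)c/2, (2k+3)c/2) the height is a * C(N, k) * p^(N-k) * q^k,
# and the integral of x^m over a rectangle is (x2^(m+1) - x1^(m+1)) / (m+1).
from math import comb

def area_moment(m, N, p, q, a, c):
    total = 0.0
    for k in range(N + 1):
        height = a * comb(N, k) * p ** (N - k) * q ** k
        x1 = (2 * k + 1) * c / 2
        x2 = (2 * k + 3) * c / 2
        total += height * (x2 ** (m + 1) - x1 ** (m + 1)) / (m + 1)
    return total

# Arbitrary example values: N = 5, p = q = 1/2, unit height scale and column width.
for m in range(4):
    print(m, area_moment(m, 5, 0.5, 0.5, 1.0, 1.0))
```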
Given some distribution or density $\rho(x),$ a moment is the 'expectation value' of some power of $x \in \mathbb{R}$. To be precise, the $n$-th moment $M_n$ is given by
$$M_n = \int_{\mathbb{R}} x^n \rho(x)\ \mathrm{d}x.$$
In the mechanics case, $\rho(x)$ is simply the mass density.
You can extend this to vectors in $\mathbb{R}^d$ in a straightforward way; for example, for the moment of inertia you replace $x^2$ by $\mathbf{x}^2 = x_1^2 + \ldots + x_d^2$ to obtain
$$I = M_2 = \int_{\mathbb{R}^d} \mathbf{x}^2 \rho(\mathbf{x}) \mathrm{d}^dx$$
which should match the definition given in your mechanics textbook.
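As a sanity check of that formula (my example, not from the question): for a uniform disk of radius $R$ and mass $M$, the integral reproduces the familiar $\tfrac12 M R^2$.

```python
# Moment of inertia of a uniform disk about its centre:
# I = integral of (x^2 + y^2) * rho over the disk = integral of 2*pi*rho*r^3 dr in polar coordinates.
import numpy as np
from scipy.integrate import quad

R, M = 2.0, 3.0
rho = M / (np.pi * R ** 2)              # uniform surface mass density

I, _ = quad(lambda r: 2 * np.pi * rho * r ** 3, 0.0, R)
print(I, 0.5 * M * R ** 2)              # both are M R^2 / 2 = 6.0
```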
For the first moment of mass, you need to distinguish different directions. As you indicate, you can choose your coordinates such that
$$\int_{\mathbb{R}^d} x_i \,\rho(\mathbf{x}) \mathrm{d}^d x = 0$$
where $i$ runs over the coordinates. In three dimensions, you have $x_1 = x, x_2=y$ and $x_3=z.$
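A small discrete illustration of that choice of coordinates (point masses standing in for the density, my own example): once the origin is moved to the centre of mass, every first moment vanishes.

```python
# Discrete stand-in for the integral of x_i * rho(x): a few point masses in the plane.
import numpy as np

masses = np.array([1.0, 2.0, 3.0])
positions = np.array([[0.0, 1.0],
                      [2.0, 0.0],
                      [1.0, 4.0]])

first_moment = masses @ positions       # sum of m_a * x_{a,i}, one number per axis i
print(first_moment)                     # generally nonzero

com = first_moment / masses.sum()       # centre of mass
shifted = positions - com               # coordinates with the origin at the CoM
print(masses @ shifted)                 # ~[0, 0]: the first moments vanish
```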
In statistics the third moment is used to calculate skewness. I would guess this has a physical analogy. Although I haven't thought this through, I'd guess it would be possible to take a disk and deform it asymmetrically so that the centre of mass and moment of inertia remained the same, but the third moment changed. In this case the third moment would be telling you about the asymmetry of the disk.
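A tiny numerical version of that guess, with made-up point masses standing in for the disk: the two configurations below have the same total mass, centre of mass and second moment, but different third moments.

```python
# Two 1D mass configurations with matching mass, centre of mass and second
# central moment, but different third central moments (the skewness signal).
import numpy as np

def central_moment(m, x, n):
    c = np.sum(m * x) / np.sum(m)       # centre of mass
    return np.sum(m * (x - c) ** n)

# Symmetric: equal masses at -1 and +1.
m_sym, x_sym = np.array([0.5, 0.5]), np.array([-1.0, 1.0])
# Asymmetric: weights chosen so mass, CoM and second moment match the symmetric case.
m_asym, x_asym = np.array([1/3, 1/2, 1/6]), np.array([-1.0, 0.0, 2.0])

for n in (0, 1, 2, 3):
    print(n, central_moment(m_sym, x_sym, n), central_moment(m_asym, x_asym, n))
# n = 0, 1, 2 agree (1, 0, 1); n = 3 differs (0 vs 1), exposing the asymmetry.
```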