A 2-form is a function that eats a parallelogram (technically it eats 2 vectors, which you should think of as spanning a parallelogram) and spits out a number proportional to its area. A 3-form eats a parallelepiped (the 3-dimensional analog of a parallelogram) and spits out a number proportional to its volume. A 4-form eats a 4-dimensional parallelotope and spits out a number proportional to its hypervolume. A 1-form eats a line segment (which you can think of as a 1-dimensional parallelogram) and spits out a number proportional to its length. A 0-form eats a single point (which you can think of as a 0-dimensional parallelogram) and spits out a number, though there's nothing for it to be proportional to since a point has no extension in space. I think you get the picture. In general an n-form eats n vectors, which you should think of as spanning an n-dimensional parallelotope, and spits out a number proportional to its hypervolume.
Usually books that teach differential forms obscure this. They will define an n-form as a "real-valued multilinear, skew-symmetric function of n vectors". But it means the same thing. Multilinearity and skew-symmetry = output is proportional to length/area/volume/hypervolume. The determinant, which is used to compute the volume of a parallelepiped (and its higher and lower dimensional analogs), has the same two properties.
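To make the two properties concrete, here is a minimal sketch in plain Python. It assumes the elementary 2-form $dx \wedge dy$ on the plane is implemented as the 2x2 determinant of its two input vectors; the function name `dx_wedge_dy` and the sample vectors are just illustrative choices.

```python
def dx_wedge_dy(u, v):
    """Signed area of the parallelogram spanned by the plane vectors u and v."""
    return u[0] * v[1] - u[1] * v[0]

u, v, w = (2.0, 0.0), (1.0, 3.0), (-1.0, 4.0)

area = dx_wedge_dy(u, v)                               # base 2, height 3
swapped = dx_wedge_dy(v, u)                            # skew-symmetry: sign flips
doubled = dx_wedge_dy((2 * u[0], 2 * u[1]), v)         # doubling a side doubles the area
summed = dx_wedge_dy((u[0] + w[0], u[1] + w[1]), v)    # additive in the first slot

print(area, swapped, doubled)
print(summed == dx_wedge_dy(u, v) + dx_wedge_dy(w, v))
```

Scaling one side scales the output by the same factor, which is exactly the "proportional to area" behaviour described above.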
So why do we require forms to have this property? Well it's just because it's needed for integration. Imagine a curve you want to integrate over. The first step is to approximate it with line segments. Then you apply some function to each line segment in order to get a number. You need that number to shrink as the size of the line segment shrinks otherwise the sum won't converge. Think about it, if the output of the function was independent of the length of the input, then as more segments were added to the approximation the sum would just shoot up to infinity. Now think of a surface you want to integrate over. You can approximate it with parallelograms, imagine the scales of an armadillo. Then for each parallelogram you apply some function that spits out a number. We need the numbers to shrink as the scales do so the sum actually converges. If you want to integrate over some 3-dimensional volume, approximate it with parallelepipeds and again evaluate a function for each parallelepiped. The output of this function needs to shrink with its input for the sum to converge. These functions that we integrate over curves/surfaces/volumes/hypervolumes are forms.
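Here is a rough numerical sketch of that argument for curves, in plain Python. The choice of the unit circle, the 1-form $x\,dy$, and all function names are my own assumptions for illustration. The form's sum settles down as the segments shrink (to the enclosed area, $\pi$), while a function that ignores the size of its segment just grows with the number of segments.

```python
import math

def integrate_over_segments(func, curve, n):
    """Approximate the curve by n line segments and sum func over them."""
    total = 0.0
    for i in range(n):
        p = curve(i / n)
        q = curve((i + 1) / n)
        segment = (q[0] - p[0], q[1] - p[1])
        total += func(p, segment)
    return total

def unit_circle(t):
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))

def x_dy(point, segment):
    # The 1-form x dy: linear in the segment, so it shrinks with the segment.
    return point[0] * segment[1]

def not_a_form(point, segment):
    # Ignores the size of the segment entirely.
    return 1.0

coarse = integrate_over_segments(x_dy, unit_circle, 100)
fine = integrate_over_segments(x_dy, unit_circle, 10000)
runaway = integrate_over_segments(not_a_form, unit_circle, 10000)

print(coarse, fine, runaway)
```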
Now let me explain why you write forms as linear combinations of elementary forms. It has to do with the generalized Pythagorean theorem, which I'll just call the GPT. In the same way that the squared length of a line segment is equal to the sum of the squared lengths of its projections onto the various coordinate axes, the squared area of an arbitrary parallelogram is equal to the sum of the squared areas of its projections onto the various coordinate planes. And the squared volume of a parallelepiped is equal to the sum of the squared volumes of its projections onto the various 3-dimensional coordinate subspaces. And so on. So the Pythagorean theorem applies to more than just line segments.
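If you want to check the GPT for parallelograms yourself, here is a small sketch in plain Python. The two vectors are arbitrary picks of mine. The squared area is computed once without any projections (as $|u|^2|v|^2 - (u \cdot v)^2$) and once as the sum of the squared areas of the three coordinate-plane shadows.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

u = (1.0, 2.0, 3.0)
v = (4.0, 5.0, 6.0)

# Squared area of the parallelogram, computed without projections.
area_squared = dot(u, u) * dot(v, v) - dot(u, v) ** 2

# Signed areas of the projections onto the xy, xz and yz planes.
area_xy = u[0] * v[1] - u[1] * v[0]
area_xz = u[0] * v[2] - u[2] * v[0]
area_yz = u[1] * v[2] - u[2] * v[1]

sum_of_squares = area_xy ** 2 + area_xz ** 2 + area_yz ** 2

print(area_squared, sum_of_squares)
```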
So let's look at the example of a 1-form that eats line segments embedded in 3-dimensional space. In general it's gonna look like $adx + bdy + cdz$ (if you forgot, $dx$, $dy$, and $dz$ are just functions that eat a line segment and spit out its projections on the x axis, y axis, and z axis respectively). All that's happening is you're taking the dot product of a vector $(a,b,c)$ with another vector $(dx,dy,dz)$ which equals the projection of $(a,b,c)$ onto $(dx,dy,dz)$ times the length of $(dx,dy,dz)$ (the length of $(dx,dy,dz)$ is $\sqrt{dx^2 + dy^2 + dz^2}$ ie the length of the line segment by the GPT). In other words $adx + bdy + cdz$ is literally just another way of writing: (projection of $(a,b,c)$ onto $(dx,dy,dz)$) times (length of the line segment). Since the length of the line segment is a factor in this product, the function is obviously proportional to the length of the line segment. Any 1-form can be written like this.
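For a concrete instance, with numbers picked purely for illustration, take the 1-form $3dx + 0dy + 4dz$ and the line segment from the origin to $(1,2,2)$, so $dx = 1$, $dy = 2$, $dz = 2$ and the segment has length $\sqrt{1 + 4 + 4} = 3$:

$$3dx + 0dy + 4dz = 3\cdot 1 + 0\cdot 2 + 4\cdot 2 = 11$$

$$\frac{(3,0,4)\cdot(1,2,2)}{3} \times 3 = \frac{11}{3} \times 3 = 11$$

The first line evaluates the form directly. The second is (projection of $(3,0,4)$ onto the direction of the segment) times (length of the segment). Doubling the segment doubles $dx$, $dy$, and $dz$, and so doubles the output.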
Another example: A 2-form that eats parallelograms embedded in 3-dimensional space is gonna have the form $a(dx \wedge dy) + b(dx \wedge dz) + c(dy \wedge dz)$ (if you forgot, $dx \wedge dy$, $dx \wedge dz$, and $dy \wedge dz$ are just functions that eat parallelograms and spit out the areas of their projections on the xy, xz, and yz planes respectively). So this is just another way of writing the dot product of $(a,b,c)$ and $(dx \wedge dy, dx \wedge dz, dy \wedge dz)$ which is just the projection of $(a,b,c)$ onto $(dx \wedge dy, dx \wedge dz, dy \wedge dz)$ times the length of $(dx \wedge dy, dx \wedge dz, dy \wedge dz)$ (which is $\sqrt{(dx \wedge dy)^2 + (dx \wedge dz)^2 + (dy \wedge dz)^2}$ ie the area of the parallelogram by the GPT). In other words the linear combination is just equal to: (projection of $(a,b,c)$ onto $(dx \wedge dy, dx \wedge dz, dy \wedge dz)$) times (area of the parallelogram). Which is clearly a function proportional to the area of the parallelogram.
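The same bookkeeping can be sketched in plain Python. The coefficients and vectors below are arbitrary choices of mine, and the helper names are hypothetical. The code evaluates $a(dx \wedge dy) + b(dx \wedge dz) + c(dy \wedge dz)$ on a parallelogram and checks that the result factors as (component of $(a,b,c)$ along the vector of projected areas) times (area).

```python
import math

def projected_areas(u, v):
    """(dx^dy, dx^dz, dy^dz) evaluated on the parallelogram spanned by u and v."""
    return (
        u[0] * v[1] - u[1] * v[0],
        u[0] * v[2] - u[2] * v[0],
        u[1] * v[2] - u[2] * v[1],
    )

def two_form(coeffs, u, v):
    """a(dx^dy) + b(dx^dz) + c(dy^dz) applied to the pair (u, v)."""
    return sum(c * p for c, p in zip(coeffs, projected_areas(u, v)))

coeffs = (2.0, -1.0, 0.5)
u = (1.0, 0.0, 2.0)
v = (0.0, 3.0, 1.0)

value = two_form(coeffs, u, v)
areas = projected_areas(u, v)
area = math.sqrt(sum(p * p for p in areas))                   # area by the GPT
component = sum(c * p for c, p in zip(coeffs, areas)) / area  # projection of coeffs
scaled = two_form(coeffs, (2 * u[0], 2 * u[1], 2 * u[2]), v)  # double one side

print(value, component * area, scaled)
```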
Another example: A 2-form that eats parallelograms in the plane. It has the general form $a(dx \wedge dy)$. You only need one term because $dx \wedge dy$ already gives you the area of the parallelogram. In the same way $dx$ gives you the length of your line segment if you're only in 1 dimension. It's only when you're in a dimension higher than the dimension of the line segment/parallelogram/parallelepiped/parallelotope that you're gonna have to invoke the GPT ie have a linear combination of multiple elementary forms.
So hopefully you see that differential forms are actually very simple objects. They're merely generalized integrands. Other things in exterior calculus like the exterior derivative, the generalized Stokes theorem, etc. are similarly very simple when explained properly.
edit: a slightly cleaned up version of this post with some pictures can be found here: https://simplermath.wordpress.com/2020/02/13/understanding-differential-forms/
Generally speaking, if you have a tensor $T$ on a manifold, and if you have a collection of (usually coordinate) vector fields $e_1, \cdots, e_n$, the "index notation" for $T$ is (let's assume for a moment $T$ is bilinear):
$$T_{ij} = T(e_i,e_j)$$
meaning $T_{ij}$ is a real-valued function for all $i$ and $j$. $T_{ij}$ is defined wherever the vector fields $\{ e_i : i = 1,2,\cdots n\}$ are defined. On a manifold with a metric (meaning an inner product on every tangent space), it is typical to define
$$g_{ij} = \langle e_i, e_j \rangle$$
where $\langle \cdot, \cdot \rangle$ is the inner product on the tangent spaces.
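As a small sketch of the recipe $T_{ij} = T(e_i,e_j)$, here it is in plain Python for the flat plane with polar coordinate vector fields. Both the example and the function names are my own assumptions, chosen because the answer is the familiar $g = \mathrm{diag}(1, r^2)$.

```python
import math

def polar_frame(r, theta):
    """Coordinate vector fields e_r, e_theta at the point (r, theta), in Cartesian components."""
    e_r = (math.cos(theta), math.sin(theta))
    e_theta = (-r * math.sin(theta), r * math.cos(theta))
    return [e_r, e_theta]

def inner(u, v):
    """The usual inner product on the tangent spaces of the flat plane."""
    return u[0] * v[0] + u[1] * v[1]

def components(T, frame):
    """T_ij = T(e_i, e_j) for a bilinear T and a list of frame vectors."""
    return [[T(e_i, e_j) for e_j in frame] for e_i in frame]

g = components(inner, polar_frame(2.0, 0.7))
print(g)  # approximately [[1, 0], [0, r**2]] with r = 2
```

Any other bilinear `T` can be passed to `components` in the same way; only the frame determines what the indices mean.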
If the tensor takes something other than two vectors as input, the same idea applies. For example, the Riemann curvature tensor is sometimes thought of as a bilinear function from the tangent space to the space of skew-adjoint linear transformations of that tangent space, i.e. at every point $p$ of the manifold it is bilinear $T_p N \oplus T_p N \to Hom(T_p N, T_p N)$ taking values in the skew-adjoint maps (with respect to the inner product). So given $e_i, e_j \in T_p N$, $R(e_i,e_j)$ is a linear transformation of the tangent space, which means $R(e_i,e_j)(e_k)$ is again a tangent vector and you can express it as a linear combination of the $e_l$. The coefficients are picked out by the standard basis vectors of the dual space $T^*_p N$ (corresponding to the collection $\{e_i\}$), which are typically denoted $e_1^*, \cdots, e_n^*$. So you write $R(e_i,e_j)(e_k) = \sum_l R^l_{ijk}e_l$ with $R^l_{ijk} = e^*_l(R(e_i,e_j)(e_k))$, and call $R^l_{ijk}$ the Riemann tensor "in coordinates".
In case any of this is unfamiliar, $e^*_j(e_i) = 1$ only when $i=j$ and $e^*_j(e_i) = 0$ otherwise. Or "in coordinates" $e^*_j(e_i) = \delta_{ij}$.
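To see the dual basis in action, here is a minimal sketch in plain Python for a non-orthonormal basis of the plane. The basis vectors and helper names are illustrative assumptions. Each dual vector is stored as a pair of coefficients, obtained as a row of the inverse of the matrix whose columns are $e_1, e_2$.

```python
def dual_basis(e1, e2):
    """Dual basis of (e1, e2): rows of the inverse of the matrix with columns e1, e2."""
    det = e1[0] * e2[1] - e1[1] * e2[0]
    d1 = (e2[1] / det, -e2[0] / det)
    d2 = (-e1[1] / det, e1[0] / det)
    return d1, d2

def pair(covector, vector):
    """Apply a covector (given by its coefficients) to a vector."""
    return covector[0] * vector[0] + covector[1] * vector[1]

e1, e2 = (2.0, 1.0), (1.0, 1.0)
d1, d2 = dual_basis(e1, e2)

# Table of e*_j(e_i); it should be the Kronecker delta.
delta = [[pair(d, e) for e in (e1, e2)] for d in (d1, d2)]
print(delta)
```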
I think many intro general relativity textbooks explain this fairly well nowadays. When I was an undergraduate I liked:
- A First Course in General Relativity, Second Edition. Bernard F. Schutz.
A differential $k$-form on an $n$-manifold can be visualised as a "density" of $(n - k)$-dimensional submanifolds. In $\mathbb{R}^3$ a 3-form is a point density. The form $dx \wedge dy \wedge dz$ is a uniform point density such that there is 1 full point in a unit cube. The integral $\int_S dx \wedge dy \wedge dz$ measures the number of points inside $S$, which is equal to the volume of $S$. The form $f dx \wedge dy \wedge dz$ is a point density with density $f(x,y,z)$ at the point $(x,y,z)$.
The form $dx$ is a density of planes of constant $x$ (i.e. $yz$-planes) such that there is 1 full plane in a unit of $x$. In other words, the line segment $(t,0,0)$ for $t=a$ to $t=b$ crosses $b - a$ planes (note that the orientation matters). As with a point density, it's not that we cross a plane discretely at $x=0, x=1, x=2$, rather, the $yz$-planes are distributed uniformly along the $x$-axis. For a curve $\gamma : [0,1] \rightarrow \mathbb{R^3}$ we can count the number of $dx$ planes it crosses. This is denoted $\int_\gamma dx = x(\gamma(1)) - x(\gamma(0))$. A general form $\alpha = A dx + B dy + C dz$ can be visualised as a density of surfaces with general orientation. The integral $\int_\gamma \alpha$ counts the number of times the curve $\gamma$ pierces a surface in the surface density. In other words, it counts the intersections of $\gamma$ with the surface density.
For a function $f : \mathbb{R}^3 \rightarrow \mathbb{R}$ the form $df$ represents surfaces of constant $f$. Viewing $x$ as a function that gives the $x$ coordinate for a point, you can see that $dx$ corresponds to planes of constant $x$. If $r = \sqrt{x^2 + y^2 + z^2}$ then $dr$ are the surfaces of constant $r$, i.e. spheres centered at the origin.
A 2-form $dx \wedge dy$ represents lines in the $z$-direction. Lines in the $z$-direction are formed by the intersection of a plane of constant $x$ and a plane of constant $y$, i.e. lines in the $z$-direction are lines of constant $x$ and $y$. For functions $f$ and $g$, the form $df \wedge dg$ represents curves of constant $f$ and $g$. For two general 1-forms $\alpha$ and $\beta$, which represent densities of surfaces, the form $\alpha \wedge \beta$ represents the density of curves formed by the intersection of those surfaces. The form $dx \wedge dy \wedge dz$ is the point density of intersecting the lines $dx \wedge dy$ in the $z$-direction with the $xy$-planes $dz$.
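Here is a small sketch of the "wedge = intersection" picture in plain Python, for constant 1-forms on $\mathbb{R}^3$ stored as coefficient triples $(A,B,C)$. The sample forms and helper names are my own assumptions. The planes of $\alpha$ have normal $(A,B,C)$, so the lines where the planes of $\alpha$ and $\beta$ intersect run along the cross product of the two coefficient triples, and $(\alpha \wedge \beta)(u,v) = \alpha(u)\beta(v) - \alpha(v)\beta(u)$ is that direction dotted with $u \times v$.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def wedge(alpha, beta, u, v):
    """(alpha ^ beta)(u, v) for constant 1-forms given by coefficient triples."""
    return dot(alpha, u) * dot(beta, v) - dot(alpha, v) * dot(beta, u)

dx, dy = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
z_lines = cross(dx, dy)            # direction of the lines of dx ^ dy

alpha, beta = (1.0, 2.0, 0.0), (0.0, 1.0, -1.0)
direction = cross(alpha, beta)     # direction of the lines of alpha ^ beta

u, v = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
print(z_lines, direction)
print(dot(alpha, direction), dot(beta, direction))  # lines lie inside both families of planes
print(wedge(alpha, beta, u, v), dot(direction, cross(u, v)))
```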
Given a parameterised surface $A : \mathbb{R}^2 \rightarrow \mathbb{R}^3$, the integral $\int_A \alpha$ of a 2-form $\alpha$ is the number of times the lines of $\alpha$ intersect the surface $A$.
The operation $d$ forms the boundary of the density of curves/surfaces/volumes. For example, if $\alpha$ is a 2-form representing a collection of curves, $d\alpha$ represents the collection of endpoints of those curves. We can understand the formula $d(df) = 0$; the surfaces $df$ of constant $f$ have no boundary. The boundary of a density of volumes is a density of surfaces, the boundary of a density of surfaces is a density of curves, the boundary of a density of curves is a density of points, the boundary of a density of points is zero. Let's understand the form $x dy \wedge dz$. Visualize this as small lines of constant $y,z$, i.e. lines in the $x$ direction. The lines have density $x$ near a point $(x,y,z)$. This can only be accomplished if the line density has a collection of boundary points. As we move further along $x$, the density of lines in the $x$ direction gets higher, and those lines have to start somewhere. Indeed, we see that $d(x dy \wedge dz) = dx \wedge dy \wedge dz$. The collection of lines $x dy \wedge dz$ has a uniform density of start points. On the other hand, $y dy \wedge dz$ has no net start points. This is just a collection of lines in the $x$ direction that gets denser as we move in the $y$ direction, but those lines still go on from $x = -\infty$ to $x = \infty$. Indeed, we see that $d(y dy \wedge dz) = 0$.
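The two exterior derivatives quoted above can be checked in one line each, assuming the usual rule $d(f\, dy \wedge dz) = df \wedge dy \wedge dz$ with $df = \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy + \frac{\partial f}{\partial z}dz$, together with the fact that a wedge product with a repeated factor vanishes:

$$d(x\, dy \wedge dz) = dx \wedge dy \wedge dz$$

$$d(y\, dy \wedge dz) = dy \wedge dy \wedge dz = 0$$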
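In summary, on an $n$-manifold a $k$-form represents a density of $(n - k)$-dimensional submanifolds, the wedge product $\wedge$ corresponds to intersection, $d$ corresponds to taking the boundary, and integration counts intersections.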
We can also intuitively understand the general Stokes theorem $\int_M d\alpha = \int_{\partial M} \alpha$. Let's consider a function $f$ on $\mathbb{R}^2$. The form $df$ represents the contours of constant $f$ (think of a contour plot), with a density such that there are net $b - a$ contours between the contour $f(x,y) = a$ and $f(x,y) = b$. Now consider a curve $\gamma$ with endpoints. The integral $\int_\gamma df$ calculates the number of contours crossed by $\gamma$. The integral $\int_{\partial \gamma} f$ is just $f(\gamma(1)) - f(\gamma(0))$, i.e. $f$ evaluated along the boundary of $\gamma$ (with the appropriate orientation). We can see that these two integrals are equal: the number of contours crossed by $\gamma$ is precisely the height difference between the start and endpoint.
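This one-dimensional case is easy to check numerically. Below is a minimal sketch in plain Python, where the function $f(x,y) = x^2 y$, the curve $\gamma(t) = (t, t^2)$, and all names are arbitrary choices of mine. It chops the curve into small segments, feeds each one to $df$, and compares the total with the height difference between the endpoints.

```python
def f(x, y):
    return x * x * y

def df(point, step):
    """df = 2xy dx + x^2 dy, evaluated at `point` on the small vector `step`."""
    x, y = point
    return 2 * x * y * step[0] + x * x * step[1]

def gamma(t):
    return (t, t * t)

def contours_crossed(n):
    """Sum df over n small segments of gamma, sampling df at each segment's midpoint."""
    total = 0.0
    for i in range(n):
        p, q = gamma(i / n), gamma((i + 1) / n)
        mid = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
        total += df(mid, (q[0] - p[0], q[1] - p[1]))
    return total

lhs = contours_crossed(1000)                # integral of df along gamma
rhs = f(*gamma(1.0)) - f(*gamma(0.0))       # f on the boundary of gamma
print(lhs, rhs)
```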
Now let $V$ be a volume in $\mathbb{R}^3$ with boundary $\partial V$, and let $\alpha$ be a 2-form representing a density of curves. The integral $\int_V d\alpha$ counts the number of endpoints sitting inside $V$. The integral $\int_{\partial V} \alpha$ counts the number of curves piercing through the boundary $\partial V$. You can intuitively understand that these are equal: the curve emanating from an endpoint must either have its other endpoint inside $V$, in which case this part of the curve density does not contribute as one endpoint is positive and the other negative, or the curve has its other endpoint outside $V$, in which case it must pierce its boundary.
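A quick numerical sketch of this volume case, in plain Python. The box, the 2-form $\alpha = xy\, dy \wedge dz$, and the helper names are assumptions picked for illustration. Here $d\alpha = y\, dx \wedge dy \wedge dz$, and only the two faces of constant $x$ are pierced by lines running in the $x$ direction.

```python
def midpoints(lo, hi, n):
    """Midpoints of n equal cells on [lo, hi], plus the cell width."""
    h = (hi - lo) / n
    return [lo + (i + 0.5) * h for i in range(n)], h

X, Y, Z = (1.0, 3.0), (0.0, 2.0), (0.0, 1.0)   # the box V
n = 20
xs, hx = midpoints(X[0], X[1], n)
ys, hy = midpoints(Y[0], Y[1], n)
zs, hz = midpoints(Z[0], Z[1], n)

def coeff(x, y, z):
    # alpha = coeff * dy^dz with coeff = x*y
    return x * y

def d_coeff_dx(x, y, z):
    # d(alpha) = (d coeff / dx) dx^dy^dz = y dx^dy^dz
    return y

# Net number of endpoints inside V: integral of d(alpha) over the box.
endpoints_inside = sum(
    d_coeff_dx(x, y, z) for x in xs for y in ys for z in zs
) * hx * hy * hz

# Net number of lines piercing the boundary: outward orientation counts
# the face x = 3 with a plus sign and the face x = 1 with a minus sign.
lines_piercing = sum(
    coeff(X[1], y, z) - coeff(X[0], y, z) for y in ys for z in zs
) * hy * hz

print(endpoints_inside, lines_piercing)
```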
The Stokes theorem for a surface in $\mathbb{R}^3$ is a bit tricky to describe with words, but drawing a picture will convince you.
You can also use this picture to understand the pullback, Poincaré lemma, the formula for $d(\alpha \wedge \beta)$, the degree formula, Poincaré duality, and so on.
One last point: why is there this weird inversion of dimension that a $k$-form represents a density of $(n - k)$ submanifolds? Why not use $r$-vectors to represent densities of $r$ submanifolds rather than $n - r$ covectors (which is what differential forms are)? In particular, why not use a normal vector field to represent a density of curves? The reason is that $r$ vectors do not have the best transformation properties. A point density on the plane $\mathbb{R}^2$ is something that assigns a real number to an area. The infinitesimal version of this is that for a 2-form $\alpha$ the quantity $\alpha(x,y)(u,v)$ counts the number of points in a small parallelogram spanned by the vectors $u,v$ at the point $x,y$. If we deform the plane then the parallelogram of the vectors will deform with it, and $\alpha(x,y)(u,v)$ stays constant. If we used $r$ vectors rather than $r$ covectors, we would only be invariant under isometries rather than general diffeomorphisms. Suppose we have a vector field on the plane and a curve $\gamma$ in the plane. There is no basis independent way to say how many times the curve intersects with the vector field. This question only makes sense up to some scale factor. Forms have a natural scale attached to them, because a form $df$ naturally eats a tangent vector $\gamma'(t)$ as $df(\gamma(t))(\gamma'(t))$. To do this with the vector field you'd have to choose a basis/coordinate system.
I hope that helps.