For the first correspondence, note that the inner product on $\mathbb R^3$ gives a one-to-one correspondence between linear forms $\mathbb R^3 \to \mathbb R$ and vectors in $\mathbb R^3$, where $v \in \mathbb R^3$ corresponds to $w \mapsto \left<v,w\right>$. This correspondence, applied pointwise, associates to a vector field $\sum_{i} f_i U_i$ (where the $U_i$ denote the standard constant orthonormal coordinate frame) the form $\sum_i f_i \, dx_i$.
For the second one, note that a 2-form $\omega$ gives a map from 1-forms to 3-forms via $\eta \mapsto \omega \wedge \eta$. Since the 3-forms are a module of rank one over the functions, i.e. every 3-form can be written as $f\, dx_1\,dx_2\, dx_3$, this yields a map from 1-forms to functions, that is, a functional on the 1-forms, which can be represented by a vector field (the bidual of the functions is the functions). Now consider the 2-form
$$ \omega = f_1\, dx_2 \,dx_3 - f_2\, dx_1 \, dx_3 + f_3 \, dx_1\, dx_2 $$
We have
\begin{align*}
\omega \wedge dx_1 &= f_1\; dx_2\, dx_3\, dx_1 = f_1\; dx_1\, dx_2\, dx_3\\
\omega \wedge dx_2 &= -f_2\; dx_1\, dx_3\, dx_2 = f_2\; dx_1\, dx_2\, dx_3\\
\omega \wedge dx_3 &= f_3\; dx_1\, dx_2\, dx_3
\end{align*}
so $\omega$ acts on the 1-forms in the same way as the vector field $\sum_i f_i\, U_i$ does.
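As a quick sanity check of this computation (my own sketch, not part of the original argument; the helpers `dx`, `wedge`, and `sign` are made-up names, and a $k$-form is represented as a plain Python function of $k$ vectors):

```python
import itertools
import math

import numpy as np

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based indices."""
    s = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                s = -s
    return s

def dx(*idx):
    """Elementary form dx_{i1} ^ ... ^ dx_{ik} (0-based indices): the
    determinant of the selected components of its vector arguments."""
    return lambda *vs: np.linalg.det(np.array([[v[i] for v in vs] for i in idx]))

def wedge(a, k, b, l):
    """Wedge product of a k-form a and an l-form b (alternation formula)."""
    def form(*vs):
        total = 0.0
        for p in itertools.permutations(range(k + l)):
            total += sign(p) * a(*(vs[i] for i in p[:k])) * b(*(vs[i] for i in p[k:]))
        return total / (math.factorial(k) * math.factorial(l))
    return form

f1, f2, f3 = 2.0, 3.0, 5.0  # arbitrary constant coefficients
# omega = f1 dx2^dx3 - f2 dx1^dx3 + f3 dx1^dx2  (0-based: dx(1,2), dx(0,2), dx(0,1))
omega = lambda u, v: f1 * dx(1, 2)(u, v) - f2 * dx(0, 2)(u, v) + f3 * dx(0, 1)(u, v)

e = np.eye(3)
for i, fi in enumerate((f1, f2, f3)):
    val = wedge(omega, 2, dx(i), 1)(e[0], e[1], e[2])
    print(f"omega ^ dx{i + 1} on (e1, e2, e3) = {val:.1f}, expected {fi}")
```

Evaluating each wedge on $(e_1, e_2, e_3)$ reads off the coefficient of $dx_1\, dx_2\, dx_3$, and the printed values come out to $f_1, f_2, f_3$ as claimed.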
A 2-form is a function that eats a parallelogram (technically it eats 2 vectors, which you should think of as spanning a parallelogram) and spits out a number proportional to its area. A 3-form eats a parallelepiped (the 3-dimensional analog of a parallelogram) and spits out a number proportional to its volume. A 4-form eats a 4-dimensional parallelotope and spits out a number proportional to its hypervolume. A 1-form eats a line segment (which you can think of as a 1-dimensional parallelogram) and spits out a number proportional to its length. A 0-form eats a single point (which you can think of as a 0-dimensional parallelogram) and spits out a number, though there's nothing for it to be proportional to since a point has no extension in space. I think you get the picture. In general an n-form eats n vectors, which you should think of as spanning an n-dimensional parallelotope, and spits out a number proportional to its hypervolume.
Usually books that teach differential forms obscure this. They will define an n-form as a "real-valued multilinear, skew-symmetric function of n vectors". But it means the same thing. Multilinearity and skew-symmetry = output is proportional to length/area/volume/hypervolume. The determinant, which is used to compute the volume of a parallelepiped (and its higher and lower dimensional analogs), has the same two properties.
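To see those two properties concretely, here is a small numpy check (my own illustration, not from the original post) that the determinant scales linearly in each edge and flips sign when two edges are swapped:

```python
import numpy as np

rng = np.random.default_rng(0)
u, v, w = rng.standard_normal((3, 3))  # edges of a random parallelepiped

vol = lambda a, b, c: np.linalg.det(np.column_stack([a, b, c]))  # signed volume

# Multilinearity: scaling one edge scales the volume by the same factor ...
print(np.isclose(vol(2.5 * u, v, w), 2.5 * vol(u, v, w)))         # True
# ... and the volume is additive in each slot (note vol(w, v, w) = 0).
print(np.isclose(vol(u + w, v, w), vol(u, v, w) + vol(w, v, w)))  # True
# Skew-symmetry: swapping two edges flips the sign (the orientation).
print(np.isclose(vol(v, u, w), -vol(u, v, w)))                    # True
```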
So why do we require forms to have this property? Well, it's just because it's needed for integration. Imagine a curve you want to integrate over. The first step is to approximate it with line segments. Then you apply some function to each line segment in order to get a number. You need that number to shrink as the size of the line segment shrinks, otherwise the sum won't converge. Think about it: if the output of the function were independent of the length of the input, then as more segments were added to the approximation the sum would just shoot up to infinity. Now think of a surface you want to integrate over. You can approximate it with parallelograms; imagine the scales of an armadillo. Then for each parallelogram you apply some function that spits out a number. We need the numbers to shrink as the scales do, so that the sum actually converges. If you want to integrate over some 3-dimensional volume, approximate it with parallelepipeds and again evaluate a function for each parallelepiped. The output of this function needs to shrink with its input for the sum to converge. These functions that we integrate over curves/surfaces/volumes/hypervolumes are forms.
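To make the convergence point concrete, here is a numeric sketch (mine, with a made-up example): the 1-form $x\,dy$ eats the chords of a polygon inscribed in the unit circle, and the sum of its outputs converges (to the enclosed area $\pi$, by Green's theorem) exactly because each chord's contribution shrinks along with the chord:

```python
import numpy as np

# The 1-form  x dy  eats each chord of an N-gon inscribed in the unit circle:
# evaluate x at the chord's midpoint and multiply by the chord's y-projection.
def polygon_sum(N):
    t = np.linspace(0.0, 2.0 * np.pi, N + 1)
    x, y = np.cos(t), np.sin(t)
    x_mid = 0.5 * (x[:-1] + x[1:])   # point where the form is evaluated
    dy = np.diff(y)                  # the chord's projection onto the y axis
    return np.sum(x_mid * dy)        # each term shrinks as the chords shrink

for N in (8, 64, 512):
    print(N, polygon_sum(N))   # 2.828..., 3.136..., 3.1415...: converges to pi
```

In fact each partial sum here is exactly the area of the inscribed polygon, which is a nice way to see why the shrinking-output property is what makes the limit exist.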
Now let me explain why you write forms as linear combinations of elementary forms. It has to do with the generalized Pythagorean theorem, which I'll just call the GPT. In the same way that the squared length of a line segment is equal to the sum of the squared lengths of its projections onto the various coordinate axes, the squared area of an arbitrary parallelogram is equal to the sum of the squared areas of its projections onto the various coordinate planes. And the squared volume of a parallelepiped is equal to the sum of the squared volumes of its projections onto the various 3-dimensional coordinate subspaces. And so on. So the Pythagorean theorem applies to more than just line segments.
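Here is a quick numeric check of the GPT for a random parallelogram in 3-space (my own sketch; each projected area is just a $2 \times 2$ determinant of a pair of coordinates):

```python
import numpy as np

rng = np.random.default_rng(1)
u, v = rng.standard_normal((2, 3))   # edges of a parallelogram in 3-space

# True area, from the cross product.
area = np.linalg.norm(np.cross(u, v))

# Areas of its projections onto the xy, xz, and yz coordinate planes.
proj = [u[i] * v[j] - u[j] * v[i] for i, j in ((0, 1), (0, 2), (1, 2))]

# GPT: squared area = sum of the squared projected areas.
print(np.isclose(area**2, sum(p**2 for p in proj)))  # True
```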
So let's look at the example of a 1-form that eats line segments embedded in 3-dimensional space. In general it's gonna look like $adx + bdy + cdz$ (if you forgot, $dx$, $dy$, and $dz$ are just functions that eat a line segment and spit out its projections on the x axis, y axis, and z axis respectively). All that's happening is you're taking the dot product of a vector $(a,b,c)$ with another vector $(dx,dy,dz)$, which equals the projection of $(a,b,c)$ onto $(dx,dy,dz)$ times the length of $(dx,dy,dz)$ (the length of $(dx,dy,dz)$ is $\sqrt{dx^2 + dy^2 + dz^2}$, i.e. the length of the line segment by the GPT). In other words $adx + bdy + cdz$ is literally just another way of writing: (projection of $(a,b,c)$ onto $(dx,dy,dz)$) times (length of the line segment). Since the length of the line segment is a factor in this product, the function is obviously proportional to the length of the line segment. Any 1-form can be written like this.
Another example: A 2-form that eats parallelograms embedded in 3-dimensional space is gonna have the form $a(dx \wedge dy) + b(dx \wedge dz) + c(dy \wedge dz)$ (if you forgot, $dx \wedge dy$, $dx \wedge dz$, and $dy \wedge dz$ are just functions that eat parallelograms and spit out the areas of their projections on the xy, xz, and yz planes respectively). So this is just another way of writing the dot product of $(a,b,c)$ and $(dx \wedge dy, dx \wedge dz, dy \wedge dz)$, which is just the projection of $(a,b,c)$ onto $(dx \wedge dy, dx \wedge dz, dy \wedge dz)$ times the length of $(dx \wedge dy, dx \wedge dz, dy \wedge dz)$ (which is $\sqrt{(dx \wedge dy)^2 + (dx \wedge dz)^2 + (dy \wedge dz)^2}$, i.e. the area of the parallelogram by the GPT). In other words the linear combination is just equal to: (projection of $(a,b,c)$ onto $(dx \wedge dy, dx \wedge dz, dy \wedge dz)$) times (area of the parallelogram). Which is clearly a function proportional to the area of the parallelogram.
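Numerically (my own sketch, continuing the made-up conventions from above; `minors` returns the projected areas $(dx \wedge dy,\, dx \wedge dz,\, dy \wedge dz)$):

```python
import numpy as np

rng = np.random.default_rng(2)
a = rng.standard_normal(3)           # the coefficients (a, b, c) of the 2-form
u, v = rng.standard_normal((2, 3))   # the parallelogram it eats

def minors(u, v):
    """Projected areas (dx^dy, dx^dz, dy^dz) of the parallelogram (u, v)."""
    return np.array([u[i] * v[j] - u[j] * v[i] for i, j in ((0, 1), (0, 2), (1, 2))])

m = minors(u, v)
value = a @ m                         # a(dx^dy) + b(dx^dz) + c(dy^dz)
area = np.linalg.norm(m)              # the parallelogram's area, by the GPT
print(np.isclose(area, np.linalg.norm(np.cross(u, v))))           # True

# Stretching the parallelogram scales the form's output with the area,
# while the projection factor  a . m/|m|  stays fixed.
m3 = minors(3.0 * u, v)
print(np.isclose(a @ m3, 3.0 * value))                            # True
print(np.isclose(a @ (m3 / np.linalg.norm(m3)), a @ (m / area)))  # True
```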
Another example: A 2-form that eats parallelograms in the plane. It has the general form $a(dx \wedge dy)$. You only need one term because $dx \wedge dy$ already gives you the area of the parallelogram. In the same way $dx$ gives you the length of your line segment if you're only in 1 dimension. It's only when you're in a dimension higher than the dimension of the line segment/parallelogram/parallelepiped/parallelotope that you're gonna have to invoke the GPT ie have a linear combination of multiple elementary forms.
So hopefully you see that differential forms are actually very simple objects. They're merely generalized integrands. Other things in exterior calculus, like the exterior derivative and the generalized Stokes theorem, are similarly very simple when explained properly.
edit: a slightly cleaned up version of this post with some pictures can be found here: https://simplermath.wordpress.com/2020/02/13/understanding-differential-forms/
Best Answer
In my opinion, a lot of these relationships are suggested by abuses of notation, abuses that hide what's really going on.
Don't get me wrong: some abuses of notation are harmless, or at the least, they help people get going on doing calculations. But they should still be understood to the fullest degree for those who wish to go beyond merely doing calculations.
I'll give an example: consider the relationship,
$$\frac{dx}{dy} = \frac{1}{\frac{dy}{dx}}$$
You probably know that differentials shouldn't really be divided, that this notation is really only suggestive, and while what it says is true by the inverse function theorem, it does so in a voodoo-like way that doesn't stand up to closer inspection, raising more questions than answers.
Of course, there's a totally reasonable way to phrase this notion: as I said, it's the inverse function theorem. Given a function $f$ of a vector $x$, we have the Jacobian $J_f$, and we know that
$$J_{f^{-1},f(x)} = J_{f,x}^{-1}$$
Which is a totally rigorous, though perhaps less obviously useful, statement.
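(For a concrete check of that statement, here is a numeric sketch with a made-up invertible map $f(x, y) = (x^3,\ x + y)$, comparing finite-difference Jacobians:)

```python
import numpy as np

# f(x, y) = (x^3, x + y), with inverse f_inv(u, v) = (u^(1/3), v - u^(1/3))
# (restrict to x > 0 so the cube root picks a single branch).
f = lambda p: np.array([p[0]**3, p[0] + p[1]])
f_inv = lambda q: np.array([q[0]**(1.0 / 3.0), q[1] - q[0]**(1.0 / 3.0)])

def jacobian(g, p, h=1e-6):
    """Forward-difference numerical Jacobian of g at the point p."""
    p = np.asarray(p, dtype=float)
    return np.column_stack([(g(p + h * e) - g(p)) / h for e in np.eye(len(p))])

x = np.array([1.5, -0.7])
J = jacobian(f, x)              # J_{f, x}
J_inv = jacobian(f_inv, f(x))   # J_{f^{-1}, f(x)}

print(np.allclose(J_inv, np.linalg.inv(J), atol=1e-4))  # True
```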
(You might be thinking that nonstandard analysis could be useful here. Perhaps it would be, but my point is a bit larger: to understand and feel comfortable with the statement, you need to either take for granted that it stands in for something else, or accept that you need more math to understand it the way it's written.)
So, how does this relate to differentials and differential forms?
Well, mostly through the use of $d$ to denote the exterior derivative. Changing this symbol reveals how manifestly nonsensical some apparent relationships are.
For the purposes of this answer, I'll denote the exterior derivative by $\nabla$. This is reasonably familiar to students of vector calculus in 3d, and most of the results can be used directly from there.
Let's address your point (1), the total differential. It would be written as,
$$\nabla f = (\partial_i f) \nabla x^i$$
Again, recognizing the connection between the exterior derivative and the gradient from vector calculus, you should realize that the $\nabla x^i$ are nothing more than a set of basis vectors (more exactly, basis covectors), and all this does is decompose the gradient of $f$ into some coordinate directions. There is no explicit connection here between the gradient and differentials.
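(To make that concrete, a small numeric check with a made-up $f$: feeding a vector $v$ to $(\partial_i f)\,\nabla x^i$ is the same as taking the directional derivative of $f$ along $v$:)

```python
import numpy as np

# A made-up example: f(x, y, z) = x^2 y + sin(z), with its partials.
f = lambda p: p[0]**2 * p[1] + np.sin(p[2])
grad_f = lambda p: np.array([2 * p[0] * p[1], p[0]**2, np.cos(p[2])])

p = np.array([1.0, 2.0, 0.5])
v = np.array([0.3, -1.0, 2.0])   # an arbitrary direction
h = 1e-6

# The covector (partial_i f) nabla x^i eats v component by component ...
df_v = grad_f(p) @ v
# ... which is just the directional derivative of f along v.
numeric = (f(p + h * v) - f(p)) / h
print(np.isclose(df_v, numeric, atol=1e-4))  # True
```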
Let's talk about point (2), integrals around curves.
This is a common misconception from people who work with differential forms. I'll point out that the quantity $r'(t) = (x', y', z')(t)$ is manifestly a tangent vector. It literally points tangent to the curve that is the domain of integration, and fundamentally, it obeys quite different transformation laws than any form.
Moreover, if $F$ is a one-form, then it should be written
$$F = F_x \nabla x + F_y \nabla y + F_z \nabla z$$
If all the supposed $dx$'s are coming from the form, then what's coming from the $dl$? As argued above, what comes with $dl$ is not a set of basis forms but a vector, the tangent vector to the curve. Writing this vector as $\ell'(t) = x'\, \partial_x r + y'\, \partial_y r + z'\, \partial_z r$ (where $r$ is the position vector, so the $\partial_i r$ are the coordinate basis vectors), we get for the dot product,
$$\int F \cdot dl = \int (F_x \circ l)(t) x'(t) \nabla x \cdot \partial_x r + \ldots \, dt$$
Of course, $\nabla x \cdot \partial_x r = 1$ by definition--otherwise, the basis forms would not be dual to the basis vectors. What would happen if we wrote the basis forms with the usual $dx$ notation?
$$\int F \cdot dl = \int (F_x \circ l)(t) x'(t) dx(\partial_x r) + \ldots \, dt$$
On its face, this looks like gobbledy-gook. Even if you had the presence of mind to distinguish between a basis form $dx$ and a differential denoting the variable of integration $dt$, it would be challenging to reconcile how these two notions should coexist in the same integral. I know I've met one person on this very site who suggested that no one should ever work with $dx$ and the like because you're just going to pull back anyway, so only $dt$ should be viewed as a differential form on this curve. That's...certainly one way of looking at things. To me, that comes at a high price of not being able to look at things geometrically. Let me explain:
What are you doing when you pull back a form in an integral like this? You're making it so the tangent vector in the target space has constant direction and magnitude (since you're pulling back to a 1d vector space, the image of the tangent vector is just the trivial unit vector). This is what's commonly done for form integrals, because then all your complexity is in the form, and in the Jacobian transforming that form, rather than in considering the components of the tangent vector. For this reason, the tangent vector is sometimes forgotten or neglected, since once you've pulled back, it's some trivial constant vector that will just be eaten by the form anyway. All that remains to be done is to set some convention for what direction it should be: positive or negative.
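(Concretely, the pullback recipe amounts to this; a numeric sketch of my own, with a made-up form and curve, where everything reduces to an ordinary integral in $t$:)

```python
import numpy as np

# Pull the 1-form  F = y dx + x dy + dz  back along the helix
# l(t) = (cos t, sin t, t), t in [0, pi]: the integral becomes the
# ordinary integral of  F(l(t)) . l'(t)  in the single variable t.
l = lambda t: np.array([np.cos(t), np.sin(t), t])
l_prime = lambda t: np.array([-np.sin(t), np.cos(t), 1.0])
F = lambda p: np.array([p[1], p[0], 1.0])

t = np.linspace(0.0, np.pi, 2001)
integrand = np.array([F(l(s)) @ l_prime(s) for s in t])
dt = t[1] - t[0]
integral = dt * (integrand[0] / 2 + integrand[1:-1].sum() + integrand[-1] / 2)

# F = d(xy + z) is exact, so the integral only depends on the endpoints.
g = lambda p: p[0] * p[1] + p[2]
print(integral, g(l(np.pi)) - g(l(0.0)))   # both approximately pi
```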
Anyway, you could call a basis form on that space by name, and perhaps some people would call it $dt$. If that abstract way of thinking works for you, do what you feel is best.
Finally, let's talk about point (3): this is more of a geometric interpretation question, and it's not unique to differential forms. Should a vector field be viewed as small, directed lines at every point? This is certainly behind the notion of field lines, which are commonly used for electric fields. I'm not sure I could say one (vectors) is more differential than the other (forms). Both involve orientations and magnitudes. In the end, I have to offer the same perspective as I would for vectors: does it make sense to think of a vector as a small piece of a line? If so, how would you decide that differentials are associated with forms instead of vectors? If not, how is this different from what you've done with forms?
Let me not digress for too long. There's a reason the notation for differential forms has stuck around as long as it has: it's enormously suggestive, and for dealing with unfamiliar concepts, suggestive notation is powerful. But like with the inverse function theorem, I submit that that notation is merely suggestive, full of shortcuts and sleight of hand. I do not think differential forms turn infinitesimals rigorous--far from it, I think that a far stronger relationship between forms and these differentials in integrals is suggested by the notation in ways that it shouldn't be.