I'm going to give an excursus that is much more complicated than you actually need for your case, where the dimension is simply $1$. However, I think you need the following to better understand what is going on behind the "mumbo jumbo" formalism of $\operatorname{d}x$, $\operatorname{d}y$ and so on.
Take a vector space $V$ of dimension $n\in\mathbb{N}$ and a basis $\{e_1,...,e_n\}$. You know from your linear algebra course that there exists a unique (dual) basis $\{\varphi_1,...,\varphi_n\}$ of the dual space $V'$ such that:
$$\forall i,j\in\{1,...,n\}, \varphi_i(e_j)=\delta_{i,j}.$$
Now go back to $\mathbb{R}^n$ and let $\{e_1,...,e_n\}$ be its standard basis. Then you define $\{\operatorname{d}x_1,...,\operatorname{d}x_n\}$ as the dual basis of $\{e_1,...,e_n\}$.
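To make this concrete, here is a minimal Python sketch, assuming $n=3$ and representing vectors as plain lists. The helper names `e`, `dx` and `delta` are hypothetical, chosen only for this illustration. It checks that the coordinate maps $v\mapsto v_i$ satisfy the defining relation $\operatorname{d}x_i(e_j)=\delta_{i,j}$.

```python
# Standard basis of R^n and its dual basis dx_1, ..., dx_n (here n = 3).
# Indices are 0-based in the code, 1-based in the text.
n = 3

def e(j):
    """Return the j-th standard basis vector of R^n."""
    return [1.0 if k == j else 0.0 for k in range(n)]

def dx(i):
    """Return the i-th dual basis functional, i.e. the linear map v -> v[i]."""
    return lambda v: v[i]

# Matrix of values dx_i(e_j); by definition of the dual basis it is the identity.
delta = [[dx(i)(e(j)) for j in range(n)] for i in range(n)]
```

So, in the standard basis, $\operatorname{d}x_i$ is nothing more than the map that reads off the $i$-th coordinate of a vector.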
Then you need the concept of the differential of a function: if $\Omega$ is an open subset of $\mathbb{R}^n$, $f :\Omega\rightarrow\mathbb{R}$ and $x\in\Omega$, you say that $f$ is differentiable at $x$ if there exists a linear map $L:\mathbb{R}^n\rightarrow \mathbb{R}$ such that $$f(y)=f(x)+L(y-x)+o(\|y-x\|_2), $$
for $y\rightarrow x$, where $\|\cdot\|_2$ is the Euclidean norm in $\mathbb{R}^n$. Also, you say that $f$ is differentiable if it is differentiable at $x$ for each $x\in\Omega$.
You can prove that if $f$ is differentiable, then for each $x\in\Omega$ the linear map $L$ is unique, in the sense that if $M$ is another linear map that does the same job, then $M=L$. So you are in a position to define the differential of $f$ at $x$ as the linear map $L$. In general, when you change the point $x$, the differential of $f$ at $x$ changes too, so you define a map:
$$\operatorname{d}f: \Omega\rightarrow (\mathbb{R}^n)'$$
that associates to each $x\in\Omega$ the differential of $f$ at $x$. This map is called the differential of $f$.
Now, fix a differentiable $f :\Omega \rightarrow \mathbb{R}$. Then $\forall x\in\Omega, \operatorname{d}f(x)\in (\mathbb{R}^n)'$ and so, since $\{\operatorname{d}x_1,...,\operatorname{d}x_n\}$ is a basis of $(\mathbb{R}^n)'$, there exist $a_1:\Omega\rightarrow\mathbb{R},..., a_n:\Omega\rightarrow\mathbb{R}$ such that:
$$\forall x \in \Omega, \operatorname{d}f(x)=a_1(x)\operatorname{d}x_1+...+a_n(x)\operatorname{d}x_n.$$
You can prove that
$$\frac{\partial{f}}{\partial{x_1}}=a_1,...,\frac{\partial{f}}{\partial{x_n}}=a_n$$
where $\frac{\partial{f}}{\partial{x_1}},...,\frac{\partial{f}}{\partial{x_n}}$ are the partial derivatives of $f$.
So, you have:
$$\forall x \in \Omega, \operatorname{d}f(x)= \frac{\partial{f}}{\partial{x_1}}(x)\operatorname{d}x_1+...+\frac{\partial{f}}{\partial{x_n}}(x)\operatorname{d}x_n.$$
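As a numerical sanity check of this formula, here is a small Python sketch. The function $f(x_1,x_2)=x_1^2x_2+\sin(x_2)$, the point `x`, the direction `h` and the step `t` are all assumptions picked for illustration. It compares the linear map built from the partial derivatives with a difference quotient, which is what the $o(\|y-x\|_2)$ condition predicts should agree for small steps.

```python
import math

def f(x):
    """Example function f(x1, x2) = x1^2 * x2 + sin(x2) (an arbitrary choice)."""
    return x[0] ** 2 * x[1] + math.sin(x[1])

def grad_f(x):
    """Partial derivatives of f, computed by hand."""
    return [2 * x[0] * x[1], x[0] ** 2 + math.cos(x[1])]

def df(x):
    """Differential of f at x: the linear map h -> sum_i (df/dx_i)(x) * dx_i(h)."""
    a = grad_f(x)
    return lambda h: sum(a_i * h_i for a_i, h_i in zip(a, h))

x = [1.0, 2.0]   # base point
h = [0.3, -0.7]  # direction in which we move
t = 1e-6         # small step size

x_shifted = [x_i + t * h_i for x_i, h_i in zip(x, h)]
difference_quotient = (f(x_shifted) - f(x)) / t
linear_part = df(x)(h)
```

The two numbers `difference_quotient` and `linear_part` should differ only by an amount that shrinks with `t`.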
Now, you define a differential form to be any function:
$$F :\Omega \rightarrow (\mathbb{R}^n)'$$
so, in particular, the differential of a differentiable map is a differential form.
You will learn during the course that you can integrate continuous differential forms along $C^1$ curves. Precisely, if $\gamma :[a,b] \rightarrow \Omega$ is a $C^1$ function and $F :\Omega \rightarrow(\mathbb{R}^n)'$ is a continuous differential form, then you define:
$$\int_\gamma F := \int_a ^ b F(\gamma(t))(\gamma'(t))\operatorname{d}t,$$
where the right hand side is a Riemann integral (remember that $F(\gamma(t))\in(\mathbb{R}^n)'$ and that $\gamma'(t)\in\mathbb{R}^n$, so $F(\gamma(t))(\gamma'(t))\in\mathbb{R}$).
Now, it can be proved that if $f$ is a differentiable function whose differential is continuous, then:
$$\int_\gamma\operatorname{d}f = f(\gamma(b))-f(\gamma(a)).$$
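The identity can be checked numerically. The following Python sketch assumes the same illustrative function $f(x_1,x_2)=x_1^2x_2+\sin(x_2)$ and takes $\gamma$ to be a quarter of the unit circle. Both choices are hypothetical, and the Riemann integral is approximated with a simple midpoint rule.

```python
import math

def f(x):
    """Example function f(x1, x2) = x1^2 * x2 + sin(x2) (an arbitrary choice)."""
    return x[0] ** 2 * x[1] + math.sin(x[1])

def df(x):
    """Differential of f at x, as a linear map on R^2."""
    a = [2 * x[0] * x[1], x[0] ** 2 + math.cos(x[1])]
    return lambda v: a[0] * v[0] + a[1] * v[1]

def gamma(t):
    """C^1 curve: a piece of the unit circle."""
    return [math.cos(t), math.sin(t)]

def gamma_prime(t):
    """Derivative of gamma."""
    return [-math.sin(t), math.cos(t)]

def line_integral(F, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of F(gamma(t))(gamma'(t)) over [a, b]."""
    width = (b - a) / steps
    total = 0.0
    for k in range(steps):
        t = a + (k + 0.5) * width
        total += F(gamma(t))(gamma_prime(t))
    return total * width

a, b = 0.0, math.pi / 2
lhs = line_integral(df, a, b)     # integral of df along gamma
rhs = f(gamma(b)) - f(gamma(a))   # f(gamma(b)) - f(gamma(a))
```

Up to the discretization error of the midpoint rule, `lhs` and `rhs` should coincide.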
Finally, we come back to earth. In your case, you have that $n=1$. So let's interpret the equation
$$\frac{\operatorname{d}y}{\operatorname{d}x} = f(x,y)$$
in the context of the formalism of differentials developed above:
- $\{\operatorname{d}x\}$ is the dual basis in $(\mathbb{R})'$ of the basis $\{1\}$ of $\mathbb{R}$;
- $y$ is a function defined on, say, an open interval $I\subset\mathbb{R}$, i.e. $y:I\rightarrow\mathbb{R}$;
- $\operatorname{d}y$ is the differential of the function $y$, and then $\operatorname{d}y : I \rightarrow (\mathbb{R})'$;
- Then, as we stated before (see the formula expressing $\operatorname{d}f$ through the partial derivatives), the derivative of $y$, i.e. $y'$, satisfies $\forall x\in I, \operatorname{d}y(x) = y'(x)\operatorname{d}x$. Here, the expression $\frac{\operatorname{d}y}{\operatorname{d}x}$ is just a name for $y'$, so, keeping that in mind, $\forall x\in I, \operatorname{d}y(x) = \frac{\operatorname{d}y}{\operatorname{d}x}(x)\operatorname{d}x$;
- $f : I\times \mathbb{R}\rightarrow \mathbb{R}$ is a function, and we require that $\forall x \in I, \frac{\operatorname{d}y}{\operatorname{d}x}(x) \doteq y'(x) = f(x,y(x))$;
- So you want that $\forall x \in I, \operatorname{d}y(x) \overset{(4)}{=} \frac{\operatorname{d}y}{\operatorname{d}x}(x)\operatorname{d}x \overset{(5)}{=} f(x,y(x)) \operatorname{d}x$, where $(4)$ and $(5)$ refer to the fourth and fifth items of this list (notice that this is an equation in $(\mathbb{R})'$);
- Now, take an interval $[a,b]\subset I$ and integrate the differential form $\operatorname{d}y$ along the curve $\gamma :[a,b]\rightarrow I, t\mapsto t$. On the one hand you get: $$\int_\gamma \operatorname{d}y = \int_a ^b \operatorname{d}y(\gamma(t))(\gamma'(t))\operatorname{d}t = \int_a ^b y'(t)\operatorname{d}t = y(b)-y(a),$$
and on the other hand: $$\int_\gamma \operatorname{d}y = \int_\gamma (x\mapsto f(x,y(x)))\operatorname{d}x = \int_a ^b f\left(\gamma(t),y(\gamma(t))\right)\operatorname{d}x(\gamma' (t))\operatorname{d}t = \int_a ^b f(t,y(t))\operatorname{d}t,$$
and so: $$y(b)-y(a) = \int_a ^b f(t,y(t))\operatorname{d}t.$$
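As an illustration of this last identity, here is a Python sketch for one concrete equation. It assumes $f(x,y)=xy$ with the known solution $y(x)=e^{x^2/2}$ on $[a,b]=[0,1]$. These choices, and all function names, are assumptions made for the example. The form $f(x,y(x))\operatorname{d}x$ is integrated along $\gamma(t)=t$ with a midpoint rule and compared with $y(b)-y(a)$.

```python
import math

def y(x):
    """A solution of y' = x * y (assumed example): y(x) = exp(x^2 / 2)."""
    return math.exp(x ** 2 / 2)

def f(x, y_value):
    """Right-hand side of the example equation dy/dx = f(x, y)."""
    return x * y_value

def dy(x):
    """The form f(x, y(x)) dx at the point x, where dx is the map v -> v."""
    return lambda v: f(x, y(x)) * v

def gamma(t):
    """The curve t -> t on [a, b]."""
    return t

def gamma_prime(t):
    """Derivative of gamma, constantly equal to 1."""
    return 1.0

def integrate_along_gamma(F, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of F(gamma(t))(gamma'(t)) over [a, b]."""
    width = (b - a) / steps
    total = 0.0
    for k in range(steps):
        t = a + (k + 0.5) * width
        total += F(gamma(t))(gamma_prime(t))
    return total * width

a, b = 0.0, 1.0
lhs = y(b) - y(a)                      # y(b) - y(a)
rhs = integrate_along_gamma(dy, a, b)  # integral of f(t, y(t)) dt over [a, b]
```

The two values should agree up to the quadrature error, which is exactly the integral form of the differential equation.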
Best Answer
newbie mentioned Peter Olver's "Applications of Lie Groups to Differential Equations", which is a good book. An alternative is Sattinger and Weaver's "Lie Groups and Algebras, with Applications to Physics, Geometry, and Mechanics". I didn't know anything about Lie groups before I read those two books, so it's certainly possible to learn them that way.