It's useful in several ways. When you are talking about topological (i.e. open or closed) properties of the diagonal, you are talking about separating points of $X$.
For example, consider 7). Unramifiedness says that you have no 'sheets' coming together in your map. For example, this is the sort of picture that we want to avoid. That said, we can make sense of this in terms of points of $X$ being close relative to $Y$. Namely, what goes wrong in this picture at a point of ramification (for example, the one at the top right)? We have two sheets 'coming together'. We can imagine choosing pairs of points $(x,y)$ on these sheets ($x$ in the top sheet and $y$ in the bottom sheet) that map to the same point of $Y$ under $f$ (i.e. so that in the picture $x$ lies directly above $y$). Each such $(x,y)$ is then a point of $X\times_Y X$. By choosing such pairs approaching the cusp, we get points $(x,y)\in X\times_Y X-\Delta_{X/Y}$ converging to a limit over the cusp point $p$. But this limit is just $(p,p)$, and that lies in the diagonal! So, we see that $X\times_Y X-\Delta_{X/Y}$ is not closed, and so $\Delta_{X/Y}$ is not open.
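To make the picture concrete, here is a sketch using a standard example that is not taken from the discussion above: the squaring map on the affine line over a field $k$ of characteristic $\neq 2$. The names $k$, $t$, $u$, $v$ are chosen only for illustration.
$$X=\operatorname{Spec}k[u]\to Y=\operatorname{Spec}k[t],\quad t\mapsto u^2,\qquad X\times_Y X=\operatorname{Spec}k[u,v]/(u^2-v^2)=V(u-v)\cup V(u+v),\qquad \Delta_{X/Y}=V(u-v)$$
The pairs $(a,-a)$ with $a\neq 0$ lie on $V(u+v)$ and off the diagonal, and the two lines meet exactly at $(0,0)$, which lies over the ramification point $u=0$. So the origin is in the closure of $X\times_Y X-\Delta_{X/Y}$, and the diagonal is not open. Away from $u=0$ the two lines are disjoint, and there the diagonal is both open and closed in the fiber product.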
To summarize this argument, we saw that 'sheets coming together' could be phrased in terms of 'pairs of distinct points of $X$ (in the same fiber) converging to a pair of points in the same fiber which are NOT distinct'. As soon as we see a phrasing in terms of 'closeness of points relative to $Y$' we should be keyed into the appearance of $\Delta_{X/Y}$ somewhere in this picture. Explicitly it shows up as in the last paragraph.
Let's look at another example: separatedness. Namely, when do we want to think of a map $f:X\to Y$ as being 'separated'? A space $X$ should be separated relative to $Y$ if any two distinct points of $X$ that $Y$ cannot tell apart can still be made to look different, by separating them by opens. If $f(x)\ne f(y)$, then relative to $Y$, $x$ and $y$ already look different, so the interesting case is $x\ne y$ with $f(x)=f(y)$. In that case I should be able to separate these elements by an open set. This use of 'closeness of points of $X$' should key us into the fact that the diagonal is, again, going to make an appearance. And, in fact, I think you should be easily able to convince yourself that such a pair satisfies $(x,y)\in X\times_Y X-\Delta_{X/Y}$, and that separating them by neighborhoods amounts to finding an open $(x,y)\in U\subseteq X\times_Y X-\Delta_{X/Y}$.
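A hedged illustration with the standard non-separated example, which is not part of the discussion above: let $X$ be the affine line over a field $k$ with the origin doubled (two copies $U_1,U_2$ of $\mathbb{A}^1_k$ glued along $\mathbb{A}^1_k-\{0\}$, with origins $0_1,0_2$), and $Y=\operatorname{Spec}k$. These names are assumptions made for the sketch.
$$U_1\times_k U_2\cong\mathbb{A}^2_k,\qquad \Delta_{X/Y}\cap(U_1\times_k U_2)=\{(a,a):a\neq 0\},\qquad (0_1,0_2)\in\overline{\Delta_{X/Y}}-\Delta_{X/Y}$$
Here $(0_1,0_2)$ is a pair of distinct points in the same fiber, but every open neighborhood of it meets the diagonal, so there is no open $U$ with $(0_1,0_2)\in U\subseteq X\times_Y X-\Delta_{X/Y}$. The diagonal is not closed, which is exactly the failure of separatedness.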
That said, sometimes you are not talking about topological properties of $\Delta_{X/Y}$; sometimes you are demanding things like your 4) (this is often called radicial). It's clear intuitively that being injective should be the same thing as the diagonal being surjective. Indeed, what would it mean for a pair $(x,y)\in X\times_Y X-\Delta_{X/Y}$ to exist? It would mean that $x\ne y$, but that $f(x)=f(y)$! That said, to understand why surjectivity of the diagonal is equivalent to being radicial, we need to be a little more careful. In the above, I was recklessly conflating $X\times_Y X$ with the topological fiber product. This is OK for intuition, but not OK for technical questions like 4).
Namely, let's think about what the locally closed embedding $\Delta_{X/Y}\to X\times_Y X$ does. Well, putting our Yoneda caps on, this map is just the map which to every $Y$-scheme $S$ assigns
$$\Delta_{X/Y}(S)=\{(x,y)\in X_S(S)\times X_S(S):x=y\}\subseteq (X\times_Y X)(S)=\{(x,y)\in X_S(S)\times X_S(S):f_S(x)=f_S(y)\}$$
In other words, $\Delta_{X/Y}\to X\times_Y X$ does not just see what's happening in $f:X\to Y$ but in all extensions $f_S:X_S\to S$. Thus, we see why $\Delta_{X/Y}\to X\times_Y X$ being surjective is so much stronger than just $f$ being injective--this map encapsulates all of the base-changes! In general (but not always!), the intuition for topological spaces/sets can be brought to bear on scheme theoretic things if one keeps in mind that these properties are remembered also by most base-changes.
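A small example, not from the discussion above, of a map that is injective on points but whose diagonal is not surjective: take $X=\operatorname{Spec}\mathbb{C}$ and $Y=\operatorname{Spec}\mathbb{R}$, each a single point.
$$\mathbb{C}\otimes_{\mathbb{R}}\mathbb{C}\cong\mathbb{C}[T]/(T^2+1)\cong\mathbb{C}\times\mathbb{C},\qquad X\times_Y X=\operatorname{Spec}(\mathbb{C}\times\mathbb{C})=\{\text{two points}\}$$
The diagonal $X\to X\times_Y X$ hits only one of the two points, so it is not surjective, even though $f$ itself is injective. The failure becomes visible after base change along $\operatorname{Spec}\mathbb{C}\to\operatorname{Spec}\mathbb{R}$: the map $f_S:\operatorname{Spec}(\mathbb{C}\times\mathbb{C})\to\operatorname{Spec}\mathbb{C}$ sends two points to one.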
All in all, as the above two examples show, and as you should probably figure out for yourself in the rest of the examples, the utility of $\Delta_{X/Y}$ is clear. Namely, $X\times_Y X$ gives us a way of talking about points of $X$ relative to $Y$ in scheme-theoretic language. The diagonal allows us to talk, more or less, about single points of $X$, but built into this larger framework (of the full $X\times_YX$--pairs of points). The utility of this should be clear from the above examples, whether we want to talk about separating points, or about points mapping to the same point.
You can observe that
$O_X(X)\cong A$
where $O_X(X)$ is the ring of global sections of $X=\operatorname{Spec}A$ (and likewise $O_Y(Y)\cong B$ for $Y=\operatorname{Spec}B$).
Then if you have a morphism of schemes
$(f,f^*): (Y,O_Y)\to (X, O_X)$
you get a morphism on global sections
$f^*(X): O_X(X)\cong A\to (f_*O_Y)(X)=O_Y(Y)\cong B$
This is the unique ring morphism which induces exactly $(f,f^*)$.
In fact if $\phi: A\to B$ is a morphism of rings which induces $(f,f^*)$, then for each $a\in A$, if we denote by $a^\sim\in O_X(X)$ the global section corresponding to $a$, we get
$f^*(a^\sim)=\phi(a)^\sim$
so, up to the isomorphisms $O_X(X)\cong A$ and $O_Y(Y)\cong B$, one gets
$f^*(a)=\phi(a)$
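As a sanity check on a concrete ring map (an illustration chosen here, not part of the argument above), take $\phi$ to be the inclusion of $\mathbb{Z}$ into the Gaussian integers. The induced map on spectra sends a prime $\mathfrak{q}$ to $\phi^{-1}(\mathfrak{q})$:
$$\phi:\mathbb{Z}\hookrightarrow\mathbb{Z}[i],\qquad \operatorname{Spec}\mathbb{Z}[i]\to\operatorname{Spec}\mathbb{Z},\quad \mathfrak{q}\mapsto\phi^{-1}(\mathfrak{q})=\mathfrak{q}\cap\mathbb{Z},\qquad (2+i)\cap\mathbb{Z}=(5)$$
The last equality uses $(2+i)(2-i)=5$. Taking global sections of the induced map of structure sheaves gives back the inclusion $\mathbb{Z}\hookrightarrow\mathbb{Z}[i]$, so the ring map can be read off from the morphism of schemes, as claimed.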
Since being a monomorphism is preserved by base change, we can base change along $\operatorname{Spec} k(f(x))\to S$ and assume that $S$ is the spectrum of a field $k$. Next, since open immersions are monomorphisms, we may replace $X$ by an affine open neighborhood of $x$ and assume $X$ is affine. Thus we've reduced to the case where $\operatorname{Spec} R\to \operatorname{Spec} k$ is a monomorphism, which implies that its diagonal map is an isomorphism. On rings, the diagonal corresponds to the multiplication map $R\otimes_k R\to R$, and this can only be an isomorphism if $\dim_k R \leq 1$: if some $r\in R$ were not a $k$-multiple of $1$, then $r\otimes 1-1\otimes r$ would be a nonzero element of the kernel. As $R$ admits a nonzero map to $k(x)$, it is nonzero and we have the result.
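To see the dimension condition at work, here is a sketch with the dual numbers, an example chosen for illustration and not taken from the answer. Consider the multiplication map $R\otimes_k R\to R$, which is the ring map corresponding to the diagonal of $\operatorname{Spec}R\to\operatorname{Spec}k$:
$$R=k[\varepsilon]/(\varepsilon^2),\qquad \dim_k R=2,\qquad \dim_k(R\otimes_k R)=4,\qquad 0\neq\varepsilon\otimes 1-1\otimes\varepsilon\in\ker(R\otimes_k R\to R)$$
The multiplication map is surjective with a two-dimensional kernel, so the diagonal is not an isomorphism and $\operatorname{Spec}R\to\operatorname{Spec}k$ is not a monomorphism. By contrast, for $R=k$ the map $k\otimes_k k\to k$ is an isomorphism.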