Let's momentarily ignore 3. and 4.
Hopefully you agree that any reasonable notion of convergence of distributions $\mu_n$ to a distribution $\mu$ should imply $\mu_n(A) \rightarrow \mu(A)$ for an appropriate class of sets $A$. In this light the equivalence of 1. and 5. should be natural. The requirement that $\mu(\partial A) = 0$ is what allows us to say $\delta_{1/n} \Rightarrow \delta_0$, where $\delta_x$ is the Dirac measure at $x$: $\delta_x(A) = 1$ if $x \in A$ and $0$ otherwise.
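To see the role of the boundary condition concretely, here is a minimal Python sketch. The helper `dirac` and the two test sets are illustrative choices of mine, not part of the theorem; sets are represented by membership predicates.

```python
def dirac(x):
    """Return the Dirac measure at x as a function on sets.

    A set is represented by a membership predicate A(y) -> bool.
    """
    return lambda A: 1.0 if A(x) else 0.0


def open_interval(y):
    # A = (-1, 1); its boundary {-1, 1} has delta_0-measure 0
    return -1.0 < y < 1.0


def half_open_interval(y):
    # B = (0, 1]; its boundary {0, 1} has delta_0-measure 1
    return 0.0 < y <= 1.0


mu = dirac(0.0)          # the candidate limit delta_0
ns = [2, 10, 100, 1000]  # indices n along the sequence delta_{1/n}

mass_A = [dirac(1.0 / n)(open_interval) for n in ns]
mass_B = [dirac(1.0 / n)(half_open_interval) for n in ns]

print("A:", mass_A, "limit measure gives", mu(open_interval))
print("B:", mass_B, "limit measure gives", mu(half_open_interval))
```

On the continuity set $A$ the masses agree with $\delta_0(A)$, while on $B$ they stay at $1$ although $\delta_0(B) = 0$. Demanding $\mu_n(B) \rightarrow \mu(B)$ for every Borel set would therefore rule out $\delta_{1/n} \Rightarrow \delta_0$.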
Regarding 2., you should poke this definition for yourself a little bit. Compare with the definition of weak-$*$ convergence on the Banach space $X = C(S)$ for $S = [0, 1]$ and $S = [0, \infty)$ to get a feeling of this "weak" formulation of convergence. Keep in mind that the space of probability measures $\mathcal{P}(S)$ on a Polish space $S$ is not a vector space as it is not closed under addition. An equivalent statement of 2. is $\int f d\mu_n \rightarrow \int f d\mu$ for $f$ only bounded and continuous ($f\in C_b(S)$), and one can even relax this to $f \in C_b(S)$ which additionally vanish at infinity (aka $f \in C_0(S)$) under the assumption that $\mu$ is itself a probability measure (consider $\mu_n = \delta_n$). These puzzles are, as others have noted, up to you to think about.
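The hint $\mu_n = \delta_n$ can be checked numerically. The sketch below uses two test functions of my own choosing: $e^{-x^2}$, which lies in $C_0(\mathbb{R})$, and the constant $1$, which lies in $C_b(\mathbb{R})$ but not in $C_0(\mathbb{R})$.

```python
import math


def integrate_against_dirac(f, x):
    """Integral of f with respect to the Dirac measure at x, i.e. f(x)."""
    return f(x)


def f_vanishing(x):
    # continuous and vanishing at infinity, so f is in C_0(R)
    return math.exp(-x * x)


def f_constant(x):
    # bounded and continuous, but does not vanish at infinity
    return 1.0


ns = [1, 5, 10, 50]
vanishing_values = [integrate_against_dirac(f_vanishing, float(n)) for n in ns]
constant_values = [integrate_against_dirac(f_constant, float(n)) for n in ns]

print(vanishing_values)  # tends to 0, the integral against the zero measure
print(constant_values)   # stays at 1, so no mass is lost along the sequence
```

Tested against $C_0$ functions, $\delta_n$ converges to the zero measure, whose total mass is $0$ rather than $1$. This is why the $C_0$ formulation only characterizes weak convergence once you already know that the limit $\mu$ is a probability measure.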
If Portmanteau were just 1., 2. and 5., I suspect you would not have posted this question. What is confusing is that you can get away with the apparently "weaker" inequalities 3. and 4., and I interpret your question as "why am I not actually giving anything up?".
First, note that 3. and 4. are directly equivalent, simply by taking complements. Then, note that 3. and 4. together imply that, for any Borel set $A$,
$$ \mu(A^\circ) \leq \liminf_n \mu_n(A) \leq \limsup_n \mu_n(A) \leq \mu(\bar{A}), $$
where $A^\circ$ and $\bar{A}$ are the interior and closure of $A$, respectively. In particular, if $\mu(\bar{A}) = \mu(A^\circ)$, then $\mu_n(A) \rightarrow \mu(A)$. Since $\bar{A} = A^\circ \cup \partial A$ is a disjoint union, the condition $\mu(\partial A) = 0$ gives exactly $\mu(\bar{A}) = \mu(A^\circ)$, and you recover 5.
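As a worked instance (my own illustration, not taken from the question), let $\mu_n$ be the uniform measure on the grid $\{1/n, 2/n, \dots, n/n\}$ and let $\mu$ be Lebesgue measure on $[0,1]$, so that $\mu_n \Rightarrow \mu$ by convergence of Riemann sums. For $A = \mathbb{Q} \cap [0,1]$ we have $A^\circ = \emptyset$, $\bar{A} = [0,1]$ and $\mu_n(A) = 1$ for every $n$, so the chain reads
$$ 0 = \mu(A^\circ) \leq \liminf_n \mu_n(A) = 1 = \limsup_n \mu_n(A) \leq \mu(\bar{A}) = 1. $$
Both inequalities hold, yet $\mu_n(A) = 1$ does not converge to $\mu(A) = 0$. This is consistent with 5., because here $\mu(\partial A) = \mu([0,1]) = 1 \neq 0$.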
That is the "formal" explanation, but I would say you should look deeper at semicontinuity definitions. Consider, for instance, that a lower semicontinuous function has closed sublevel sets and attains its minimum on compact sets. These definitions are strong enough to guarantee some notion of convergence, but flexible enough that they extend to general contexts. Take a look at $\Gamma$-convergence or the large deviation principle.
My point is these inequalities are useful to keep in mind in their own right, and the things to think about are how you would approximate the indicator function of an open set or a closed set. This is how the Portmanteau theorem is usually proved, and the best reference for weak convergence in general is Billingsley's book Convergence of Probability Measures.
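Here is a hedged sketch of that approximation idea for a closed set. Everything concrete in it is an assumption made for illustration: the closed set $F = [0, 1/2]$, the measures $\mu_n$ uniform on $\{1/n, \dots, n/n\}$ (which converge weakly to Lebesgue measure $\mu$ on $[0,1]$), and the functions $f_k(x) = \max(0, 1 - k\,d(x, F))$, which are continuous, dominate $1_F$ and decrease to $1_F$ as $k$ grows.

```python
def dist_to_interval(x, lo, hi):
    """Distance from x to the closed interval F = [lo, hi]."""
    return max(lo - x, 0.0, x - hi)


def f_k(x, k, lo=0.0, hi=0.5):
    """Continuous function with 1_F <= f_k <= 1 that decreases to 1_F as k grows."""
    return max(0.0, 1.0 - k * dist_to_interval(x, lo, hi))


def indicator_F(x, lo=0.0, hi=0.5):
    """Indicator function of the closed set F = [lo, hi]."""
    return 1.0 if lo <= x <= hi else 0.0


def grid_integral(g, n):
    """Integral of g against mu_n, the uniform measure on {1/n, 2/n, ..., 1}."""
    return sum(g(j / n) for j in range(1, n + 1)) / n


n = 10_000
mu_n_F = grid_integral(indicator_F, n)  # mu_n(F)

# Upper bounds mu_n(F) <= integral of f_k against mu_n, for several k.
bounds = {k: grid_integral(lambda x, k=k: f_k(x, k), n) for k in (2, 10, 100)}

for k, value in bounds.items():
    limit_value = 0.5 + 1.0 / (2 * k)  # integral of f_k against Lebesgue measure
    print(k, mu_n_F, value, limit_value)
```

For each fixed $k$, $\mu_n(F) \leq \int f_k \, d\mu_n \rightarrow \int f_k \, d\mu$ as $n \rightarrow \infty$, and letting $k \rightarrow \infty$ afterwards gives $\limsup_n \mu_n(F) \leq \mu(F)$. The inequality for open sets follows by taking complements.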
The Helly-Bray theorem also holds for $\mathbb{R}^n$.
"$\Rightarrow$": Assume that $\mu_n \to \mu$ vaguely. By the Portmanteau theorem for vague convergence, $\mu_n(B) \to \mu(B)$ for all bounded $\mu$-continuity Borel sets $B \subseteq \mathbb{R}^n$. For $i = 1, \dots, n$ denote by $D_i \subseteq \mathbb{R}$ the set of atoms of the marginal measure $\mu_i$ on $\mathbb{R}$, that is, the points $t$ with $\mu_i(\{t\}) > 0$. Then $D_i$ is countable and $C := D_1^c \times \dots \times D_n^c$ is dense in $\mathbb{R}^n$. For any point $u \in C$, the set $(-\infty, u]$ is a $\mu$-continuity set, because its boundary is contained in the union of the hyperplanes $\{x_i = u_i\}$, each of which has $\mu$-measure $\mu_i(\{u_i\}) = 0$. Therefore, $u$ is a continuity point of $F$. Likewise, any rectangular box $(a, b]$ with $a, b \in C$ is a $\mu$-continuity set, and any corner $u$ of such a box is contained in $C$.
With this in mind, let $x$ be a continuity point of $F$. We can decompose $(-\infty, x]$ into a countable collection of disjoint boxes $(a^j, b^j]$ with $a^j, b^j \in C$. Since all these boxes $(a^j, b^j]$ are $\mu$-continuity sets, we get
$F_n(x) = \sum_j \mu_n(a^j, b^j] \to \sum_j \mu(a^j, b^j] = F(x)$ by the bounded convergence theorem.
"$\Leftarrow$": Assume that $F_n(x) \to F(x)$ for all continuity points $x$ of $F$. For a box $(a, b]$ it holds $\mu(a, b] = \Delta^a_b F$ which is an alternating sum over values $F(x)$ with $x$ a corner of $(a, b]$. If $a, b \in C$ then all the corners of $(a, b]$ are contained in $C$ and since $F$ is continuous on $C$ we get $\mu_n(a, b] = \Delta^a_b F_n \to \Delta^a_b F = \mu(a, b]$. Let $g : \mathbb{R}^n \to \mathbb{R}$ be continuous with compact support. Then $\textrm{supp}(g) \subseteq (a, b]$ for some $a, b \in C$. Let $\varepsilon > 0$. Since $g$ is uniformly continuous on $(a, b]$ and $C$ is dense in $\mathbb{R}^n$ we can partition $(a, b]$ into finitely many boxes $(a^j, b^j]$, $j = 1, \dots, m$ with $a^j, b^j \in C$ such that $\sup_{x \in (a^j, b^j]} |g(x) - g(b^j)| < \varepsilon$ for all $j$. Decompose $\int g d\mu = \sum_j \int_{(a^j, b^j]} g d\mu$. We can approximate
$$\left|\int g d\mu - \sum_j g(b^j) \mu(a^j, b^j]\right| = \left|\sum_j \int_{(a^j, b^j]} (g(x) - g(b^j)) \mu(dx)\right| \\
\leq \sum_j \sup_{x \in (a^j, b^j]} |g(x) - g(b^j)| \mu(a^j, b^j] \leq \varepsilon \cdot \mu(a, b]$$
and similarly for all the $\mu_n$. It follows that
$$\left|\int g d\mu_n - \int g d\mu\right| \leq \left| \int g d\mu_n - \sum_j g(b^j) \mu_n(a^j, b^j]\right| + \left| \sum_j g(b^j)(\mu_n(a^j, b^j] - \mu(a^j, b^j])\right| \\
+ \left| \int g d\mu - \sum_j g(b^j) \mu(a^j, b^j] \right| \leq 2\varepsilon + \lVert g \rVert \sum_j |\mu_n(a^j, b^j] - \mu(a^j, b^j]|.$$
As $n \to \infty$, the sum on the right-hand side converges to $0$ (it has finitely many terms, each of which tends to $0$), and we get
$\limsup_n |\int g d\mu_n - \int g d\mu| \leq 2 \varepsilon$. Since this is true for all $\varepsilon > 0$, $\int g d\mu_n \to \int g d\mu$. Therefore, $\mu_n \to \mu$ vaguely.
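The alternating sum $\Delta^a_b F$ used in the "$\Leftarrow$" direction can be sketched in a few lines of Python. The function names `box_measure` and `uniform_cdf` are hypothetical, and the uniform distribution on the unit cube is only an illustrative choice of $F$.

```python
from itertools import product


def box_measure(F, a, b):
    """Mass of the box (a, b] as the alternating sum of F over its 2^d corners."""
    d = len(a)
    total = 0.0
    for choice in product((0, 1), repeat=d):
        # choice[i] == 1 picks the upper endpoint b[i], 0 picks the lower a[i]
        corner = [b[i] if choice[i] else a[i] for i in range(d)]
        # one factor -1 for every coordinate that uses the lower endpoint
        sign = (-1) ** (d - sum(choice))
        total += sign * F(corner)
    return total


def uniform_cdf(x):
    """Distribution function of the uniform law on [0, 1]^d (illustrative)."""
    value = 1.0
    for xi in x:
        value *= min(max(xi, 0.0), 1.0)
    return value


mass_2d = box_measure(uniform_cdf, a=(0.2, 0.1), b=(0.5, 0.4))
mass_3d = box_measure(uniform_cdf, a=(0.0, 0.25, 0.5), b=(0.5, 0.75, 1.0))
print(mass_2d, mass_3d)
```

For the uniform law the result should agree with the volume of the box, which gives a quick sanity check on the signs. In the proof, the same formula applied to $F_n$ and $F$ at corners in $C$ is what turns pointwise convergence of distribution functions into $\mu_n(a, b] \to \mu(a, b]$.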
Provided that you know that the set of discontinuities $D_f$ of $f$ is measurable, one may use the Portmanteau theorem as follows:
(i) Claim ($\mu_n\circ f^{-1}\stackrel{n}{\Longrightarrow}\mu\circ f^{-1}$): For any closed set $F\subset \mathbb{R}$, we have $$f^{-1}(F)\subset\overline{f^{-1}(F)}\subset D_f\cup f^{-1}(F).$$ If $\mu(D_f)=0$, then $\mu(f^{-1}(F))=\mu(\overline{f^{-1}(F)})$. By the Portmanteau theorem, \begin{align} \limsup_n\mu_n (f^{-1}(F))\leq\limsup_n\mu_n(\overline{f^{-1}(F)})\leq \mu(\overline{f^{-1}(F)})= \mu(f^{-1}(F)). \end{align} Since $F$ was an arbitrary closed set, this shows that $\mu_n\circ f^{-1}\stackrel{n}{\Longrightarrow}\mu\circ f^{-1}$.
Now, let $\phi(x)=((-M)\vee x)\wedge M$ where $M=\|f\|_u$. As $f=\phi\circ f$ and $\phi\in\mathcal{C}_b(\mathbb{R})$, by part (i) \begin{align*} \int f\,d\mu_n&=\int \phi\circ f\,d\mu_n=\int \phi \,d\mu_n\circ f^{-1} \xrightarrow{n\rightarrow\infty}\int \phi\,d\mu\circ f^{-1}=\int \phi\circ f\,d\mu=\int f\, d\mu. \end{align*}
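A small numerical illustration of why $\mu(D_f) = 0$ matters, again with $\mu_n = \delta_{1/n}$ and $\mu = \delta_0$. The two step functions below are my own examples, and `clip` is a hypothetical name for the truncation $\phi$.

```python
def clip(x, M):
    """phi(x) = ((-M) v x) ^ M, the truncation used in the argument above."""
    return min(max(-M, x), M)


def f_good(x):
    # discontinuous only at x = 1, and delta_0({1}) = 0
    return 2.0 if x >= 1.0 else -1.0


def f_bad(x):
    # discontinuous at x = 0, and delta_0({0}) = 1
    return 1.0 if x > 0.0 else 0.0


M = 2.0                  # sup norm of f_good
ns = [2, 10, 100, 1000]

# The integral of f against delta_{1/n} is simply f(1/n).
good_integrals = [f_good(1.0 / n) for n in ns]
bad_integrals = [f_bad(1.0 / n) for n in ns]

# phi leaves f unchanged because |f| <= M everywhere.
phi_fixes_f = all(clip(f_good(x), M) == f_good(x) for x in (-3.0, 0.0, 0.5, 1.0, 7.0))

print(good_integrals, f_good(0.0))  # integrals agree with the limit value
print(bad_integrals, f_bad(0.0))    # integrals stay at 1, limit value is 0
print(phi_fixes_f)
```

When the discontinuity sits on a $\mu$-null set the integrals converge to $\int f \, d\mu$, and when it carries mass under $\mu$ they need not.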