I thought exactly the same and I think I've found a satisfactory solution. Basically your point 1. is right, but the argument is indeed quite subtle.
I'm gonna divide this proof into 4 parts:
Part 1: If $M$ is a nonempty manifold with boundary then $\text{int }(M)\neq \emptyset$.
Proof: I'll leave this one as homework :) because it's not subtle.
Part 2: If $M$ and $N$ are nonempty diffeomorphic smooth manifolds with boundary, of dimensions $m$ and $n$ respectively, then $m=n$.
Proof: This part is subtle, so I'll write the details carefully. Let $f:M\to N$ be a diffeomorphism. Then $f|_{\text{int } M}:\text{int } M\to f(\text{int } M)$ is a diffeomorphism, where $\text{int } M$ is a nonempty smooth manifold (this is because of part 1.) and $f(\text{int } M)$, being a nonempty open subset of $N$, is a nonempty smooth manifold with boundary. Let's rename $f|_{\text{int } M}:\text{int } M\to f(\text{int } M)$ as $f_1:M_1\to N_1$. Using the same trick, $f_1|_{f_1^{-1}(\text{int } N_1)}:f_1^{-1}(\text{int } N_1)\to \text{int } N_1$ is a diffeomorphism between nonempty smooth manifolds. By Theorem 2.17 (Diffeomorphism Invariance of Dimension), $f_1^{-1}(\text{int } N_1)$ and $\text{int } N_1$ have the same dimension. Since these are nonempty open subsets of $M$ and $N$ respectively, the same is true for $M$ and $N$.
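Spelled out, the dimension count in Part 2 is the chain below. This is only a sketch, assuming the standard fact that a nonempty open subset of a smooth manifold with boundary has the same dimension as the manifold itself:

$$
m=\dim M=\dim f_1^{-1}(\text{int } N_1)=\dim \text{int } N_1=\dim N=n.
$$

The first and last equalities use that $f_1^{-1}(\text{int } N_1)$ is open in $M$ and $\text{int } N_1$ is open in $N$; the middle one is the diffeomorphism invariance of dimension for manifolds without boundary.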
Part 3: If $M$ is a smooth manifold with boundary, $(U,\phi)$ is a smooth chart for $M$ and $f:\phi(U)\to V$ is a diffeomorphism between open subsets of $\mathbb{H}^n$ or $\mathbb{R}^n$ then $(U,f\circ \phi)$ is a smooth chart for $M$.
Proof: I'll also leave this one for homework because it's also not subtle.
Part 4: (Theorem 2.18, Diffeomorphism Invariance of the Boundary): Suppose $M$ and $N$ are smooth manifolds with boundary and $F:M\to N$ is a diffeomorphism. Then $F(\partial M)=\partial N$.
Proof: Let $p\in \partial M$. This means there is a smooth chart $(U,\phi)$ for $M$ such that $p\in U$, $\phi(U)\subseteq \mathbb{H}^n$ and $\phi(p)\in \partial \mathbb{H}^n$. By restricting $U$ (this is essentially what Lee does in his proof of Theorem 2.17, Diffeomorphism Invariance of Dimension) we may find a chart $(F(U),\psi)$ for $N$. Then $\psi\circ F\circ \phi^{-1}:\phi(U)\to \psi(F(U))$ is a diffeomorphism between open subsets of the same $\mathbb{H}^n$ or $\mathbb{R}^n$ because of part 2. Define $\hat{F}=\psi\circ F\circ \phi^{-1}$. Then by part 3. $(U,\hat{F}\circ \phi)$ is a smooth chart for $M$.
By Theorem 1.46 (Smooth Invariance of the Boundary), $\hat{F}(\phi(U))\subseteq \mathbb{H}^n$ and $\hat{F}(\phi(p))\in \partial \mathbb{H}^n$. This is the same as saying $\psi(F(U))\subseteq \mathbb{H}^n$ and $\psi(F(p))\in \partial \mathbb{H}^n$, which implies $F(p)\in \partial N$. So $F(\partial M)\subseteq \partial N$. Applying the same argument to the diffeomorphism $F^{-1}:N\to M$ gives $F^{-1}(\partial N)\subseteq \partial M$, that is, $\partial N\subseteq F(\partial M)$, and we are done.
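For the record, the chart coming from part 3 can be computed explicitly. Nothing beyond the definitions of $\hat{F}$, $\phi$ and $\psi$ above is assumed here:

$$
\hat{F}\circ\phi=(\psi\circ F\circ\phi^{-1})\circ\phi=\psi\circ F|_U,\qquad (\hat{F}\circ\phi)(U)=\psi(F(U)),\qquad (\hat{F}\circ\phi)(p)=\psi(F(p)).
$$

So the statements about $\hat{F}(\phi(U))$ and $\hat{F}(\phi(p))$ are literally statements about the chart $(F(U),\psi)$ at the point $F(p)$.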
These are the kinds of details that always drive me crazy, but hopefully now I can fill most of the gaps. When I ask people about details like this I sometimes get off-topic answers from people who think they are trivial. For example, I don't see how connectedness plays a role here, care to explain @Ted Shifrin?
Best Answer
We have $U\subset Y$. Whenever $g:A\to B$ and $B\subset C$ and $h:C\to D$, when we write $h\circ g$, we really mean $h|_{B}\circ g$. And if $p:A\to C$ and $p(A) \subset B$, then we also often write $p$ for the function $p:A\to B$ which matches the other $p$ in the obvious way.
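A small concrete instance of this convention (the specific functions are my own illustration, not taken from the question), with $A=\mathbb{R}$, $B=(0,\infty)$ and $C=D=\mathbb{R}$:

$$
g:\mathbb{R}\to(0,\infty),\ g(x)=e^{x},\qquad h:\mathbb{R}\to\mathbb{R},\ h(t)=t^{2},\qquad (h\circ g)(x)=\big(h|_{(0,\infty)}\circ g\big)(x)=e^{2x}.
$$

Strictly speaking the codomain of $g$ and the domain of $h$ do not match, so $h\circ g$ only makes sense after restricting $h$ to $(0,\infty)$.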
We are assuming $y\notin \partial Y$, so that $y$ has a neighborhood $U$ diffeomorphic to an open subset of $\mathbb{R}^n$ for some $n$. And $x$ has no such neighborhood, being a boundary point. But $f^{-1}(U)$ is such a neighborhood, which is a contradiction.