Let's see. When $Z = \{x\}\subset X$ is a point, then $f$ is automatically one-to-one on $Z$, and the hypothesis that $df_x\colon T_xX\to T_{f(x)}Y$ is an isomorphism is precisely the hypothesis of the inverse function theorem. You then conclude that $f$ maps a neighborhood $U$ of $x$ diffeomorphically to a neighborhood $f(U)$ of $y=f(x)$.
You are going to apply Exercise 5 to an appropriate neighborhood $U$ of $Z$. Provided you show $f$ is one-to-one on $U$, you'll deduce that $f$ maps $U$ diffeomorphically to $f(U)$. (You will know $f$ is a local diffeomorphism if you start by taking $U$ to be a union of open sets around points $x\in Z$ on which $f$ is a local diffeomorphism. You will then likely have to shrink $U$ a bit to make sure $f$ is one-to-one on $U$.)
Finally, here's a hint for the hint. Consider $U_i = \{x\in X: d(x,Z)<1/i\}$. If $f$ fails to be one-to-one on $U_i$ for every $i\in\Bbb N$, there are distinct points $a_i,b_i\in U_i$ with $f(a_i)=f(b_i)$. All these sets are contained in the compact set $\overline U_1$, and so, passing to convergent subsequences carefully, we may assume that $a_i,b_i\to z\in Z$. (Here you use the fact that $f$ is one-to-one on $Z$.) But what do you conclude from the fact that $df_z$ is an isomorphism?
OK, let's break the exercise into steps.
First, the "then $f$ maps $Z$ diffeomorphically onto $f(Z)$" part.
Since $f$ is one-one on $Z$ and an immersion at every $x\in Z$, you only need to prove that the map $f|_Z\colon Z \rightarrow f(Z)$ is proper; then it is an embedding, and an embedding is a diffeomorphism onto its image (pages 17, 18). Take a compact $K \subset f(Z)$. Since $f(Z)$ carries the subspace topology, $K$ is also compact in $Y$, and since $Y$ is Hausdorff, $K$ is closed in $Y$. Because $f|_Z$ is continuous as a map into $Y$ (it is just the restriction of a continuous map), $(f|_Z)^{-1}(K)$ is closed in $Z$, and a closed subset of the compact set $Z$ is compact. So $f|_Z$ is proper, hence a diffeomorphism onto $f(Z)$.
Now we need to prove a little more: that $f$ is a diffeomorphism of an open neighborhood of $Z$ in $X$ onto an open neighborhood of $f(Z)$ in $Y$.
Why is it enough to prove that $f$ is one-one on an open neighborhood of $Z$ in $X$? Because Exercise 5 tells you that a local diffeomorphism that is one-one is actually a diffeomorphism onto an open subset. Suppose you find an open neighborhood $U$ of $Z$ in $X$ with $f|_U$ one-one. If necessary, intersect $U$ with a neighborhood $V$ of $Z$ such that $f|_V$ is a local diffeomorphism (you get such a $V$ by taking a finite union of neighborhoods in $X$ of points of $Z$, on each of which $f$ restricts to a diffeomorphism; being a local diffeomorphism is a local condition). So you can assume there is a neighborhood $U$ of $Z$ in $X$ such that $f|_U\colon U\rightarrow f(U)$ is a one-one local diffeo, and applying Exercise 5 we get that $f|_U$ is a diffeomorphism.
There are only two things left to prove: Exercise 5, and the injectivity in a neighborhood of $Z$ in $X$.
For the first one, observe that if $f\colon A\rightarrow B$ is a one-one local diffeo, then the inverse $f^{-1}\colon f(A)\rightarrow A$ exists, and both $f$ and $f^{-1}$ are smooth, because smoothness is a local property and in suitable local coordinates both maps are the identity. The image $f(A)$ is open in $B$, since a local diffeomorphism is an open map. Hence $f$ is a diffeomorphism onto the open set $f(A)$.
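One way to write this step out, as a sketch: for each $a\in A$ let $W_a$ denote an open neighborhood of $a$ on which $f$ restricts to a diffeomorphism onto an open subset of $B$ (the name $W_a$ is introduced here only for illustration). Injectivity of $f$ gives $f^{-1}(f(W_a))=W_a$, so

$$f(A)=\bigcup_{a\in A} f(W_a)\ \text{ is open in } B,\qquad f^{-1}\big|_{f(W_a)}=\big(f|_{W_a}\big)^{-1}\ \text{ is smooth for every } a\in A.$$

Since the open sets $f(W_a)$ cover $f(A)$ and smoothness can be checked locally, $f^{-1}$ is smooth on all of $f(A)$.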
For the second one, suppose $f$ is not one-one restricted to any neighborhood of $Z$ in $X$.
Define $U_n=\{x \in X \mid d(x,Z)<\frac{1}{n}\}$ (your manifold $X$ is sitting inside some $\mathbb{R}^k$ by definition, and if you don't follow Guillemin's convention, just use Whitney's Embedding Theorem). There are sequences $\{a_n\}$, $\{b_n\}$ such that $a_n, b_n \in U_n$, $a_n\neq b_n$ and $f(a_n)=f(b_n)$. Clearly $d(a_n, Z)$ and $d(b_n, Z)$ tend to zero. Since $Z$ is compact, after passing to subsequences there are $z_1, z_2\in Z$ with $z_1=\lim_{n\to\infty} a_n$, $z_2=\lim_{n\to\infty} b_n$. Since $f$ is continuous, $f(z_1)=f(z_2)$, and because $f$ is injective on $Z$ this forces $z_1=z_2=z$.
Now this contradicts the fact that $df_z$ is an isomorphism: by the inverse function theorem there is a neighborhood of $z$ in $X$ on which $f$ is a diffeomorphism, in particular one-one, yet for large $n$ this neighborhood contains the distinct points $a_n$, $b_n$ with $f(a_n)=f(b_n)$.
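To see why the limits exist, the subsequence argument can be made explicit. Here $z_n$ is an auxiliary point of $Z$, introduced only for this sketch, chosen with $d(a_n,z_n)<\frac1n$ (possible because $d(a_n,Z)<\frac1n$). By compactness of $Z$, some subsequence $z_{n_k}$ converges to a point $z_1\in Z$, and then

$$d(a_{n_k},z_1)\le d(a_{n_k},z_{n_k})+d(z_{n_k},z_1)<\frac1{n_k}+d(z_{n_k},z_1)\longrightarrow 0 .$$

The same argument applied to $b_{n_k}$, after passing to a further subsequence, produces $z_2\in Z$.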
Best Answer
Let $\ f:U \to V$ be such a diffeomorphism. Let $x=f^{-1}(0)$. For $h$ near $x$ we have $f(h)=0+Df_x(h-x)+e(h-x)$ where $\frac {e(h-x)}{|h-x|} \to 0$ as $|h-x| \to 0$. As $f$ is a diffeomorphism, $Df_x$ is a linear isomorphism (since $D(f^{-1})_{f(x)}=(Df_x)^{-1}$ by differentiating $ f^{-1}\circ f =I$ using the chain rule).
Let $v=(Df_x)^{-1}(-1,0,0,\dots,0)$, and $h=x+\frac1Nv$ with $N$ large enough that $h\in U$ and $$\frac{|e(\frac1Nv)|}{|\frac1Nv|}<\frac1{|v|}$$ Now $f(h)=0+(-\frac{1}N,0,0,\dots,0)+e(\frac1Nv)$. As $|e(\frac1Nv)|<\frac1N$, the first coordinate of $f(h)$ is negative, so $f(h)$ is not in $\Bbb H^k$. So no such diffeomorphism could exist.
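The last estimate can be spelled out coordinate-wise. This is only a sketch, and it assumes, as the choice of the vector $(-1,0,\dots,0)$ suggests, that $\Bbb H^k$ is the half-space where the first coordinate is nonnegative. Write $\delta=h-x$ for the displacement chosen above, so that $Df_x(\delta)=(-\frac1N,0,\dots,0)$ and $|e(\delta)|<\frac1N$, and write $f(h)_1$ and $e_1(\delta)$ for first coordinates (notation introduced here for illustration). Then

$$\begin{aligned}
f(h)_1 &= -\tfrac1N + e_1(\delta)\\
&\le -\tfrac1N + |e(\delta)|\\
&< -\tfrac1N+\tfrac1N = 0 .
\end{aligned}$$

So $f(h)$ has a negative first coordinate, which is exactly what places it outside $\Bbb H^k$ under that convention.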