Now that I understand the context better I am going to promote my own comments to an answer.
There is no general principle for artificially restricting the domain of a function when computing its convex conjugate. In fact, you should not do so unless you have a compelling and mathematically sound reason in your specific application. The domain of the conjugate function---that is, the set of $y$ for which the supremum is finite---is an important part of the conjugate itself, and should not be discarded. For example, when computing the dual of a convex optimization problem with an objective $f(x)$, the domain information imposes implicit constraints on the dual variables. If you remove that information, the dual problem is incorrect; it no longer provides bounds for the primal problem.
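To see the implicit-constraint phenomenon concretely, here is a quick numerical sketch of my own (not from any referenced text): for the scalar $f(x)=|x|$, the conjugate is the indicator of $[-1,1]$. A brute-force grid approximation of the supremum stays bounded for $y$ inside that interval and blows up with the grid radius outside it.

```python
import numpy as np

def conjugate_1d(f, y, grid):
    """Approximate f*(y) = sup_x (x*y - f(x)) by brute force over a grid."""
    return np.max(grid * y - f(grid))

f = np.abs  # f(x) = |x|; its conjugate is the indicator of [-1, 1]
grid = np.linspace(-1000.0, 1000.0, 200001)

# y inside dom f*: the supremum is bounded (equal to 0 here).
print(conjugate_1d(f, 0.5, grid))  # ~0.0
# y outside dom f*: the value grows with the grid radius, i.e. the true
# supremum is +infinity -- the implicit dual constraint |y| <= 1 in action.
print(conjugate_1d(f, 1.5, grid))  # 500.0 here, unbounded as the grid widens
```

Discarding the constraint $\|y\|_\infty \le 1$ here would be exactly the mistake described above: the "conjugate" would then be reported as identically $0$, which is wrong.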
So what's going on in Chapter 5 of Ms. Fazel's thesis, or in the case of the so-called $\ell_0$ norm? Note that the interest is not in the conjugates of these functions, but rather in their convex envelopes. It just so happens that the convex envelope of a function can be computed using the conjugate of the conjugate. Unfortunately, the convex envelopes of $f(x)=\mathop{\textbf{card}}(x)$ or $g(X)=\mathop{\textbf{rank}}(X)$ are not very interesting---in fact, I think they are identically zero. On the other hand, the modified, extended-valued functions
$$\bar{f}(x) = \begin{cases} \mathop{\textbf{card}}(x) & \|x\|_\infty \leq 1 \\ +\infty &\|x\|_\infty > 1 \end{cases}, \qquad
\bar{g}(X) = \begin{cases} \mathop{\textbf{rank}}(X) & \|X\|_2 \leq 1 \\ +\infty &\|X\|_2 > 1 \end{cases}$$
have non-trivial convex envelopes.
As for why one would want these convex envelopes: they provide some theoretical justification for why $\bar{f}^{**}(x)=\|x\|_1$ and $\bar{g}^{**}(X)=\|X\|_*$ are effective convex proxies for their non-convex counterparts. As you know, Ms. Fazel's thesis is entitled *Matrix Rank Minimization with Applications*, and it makes heavy use of trace minimization and nuclear norm minimization to find low-rank matrices that satisfy the modeling conditions. Section 5.1 is devoted to justifying the convex heuristics she uses throughout the work.
Added to clarify: in a comment above, you expressed an interest in finding the "tightest" convex envelope of a function. It's very important to note that if you truly want the tightest convex envelope of the entire function, you cannot restrict the domain in any way before taking the double conjugate. When you impose a domain restriction, you are computing the convex envelope of a different function. It will no longer serve as a lower bound for the original function everywhere. It will, however, be a tighter envelope over the domain you have selected.
You can see this for yourself: look at $f(x)=\mathop{\textbf{card}}(x)$ and $g(x)=\|x\|_1$. Clearly, there are values of $x$ for which $f(x)<g(x)$. So $g$ cannot be the convex envelope for $f$. It is, however, the convex envelope if you restrict the domain of $f$ to $\{x\,|\,\|x\|_\infty \leq 1\}$ (as we did with $\bar{f}$ above).
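You can also verify the scalar case numerically. The following is a rough brute-force sketch of my own (the finite grids stand in for $\mathbf{R}$ and for $dom~\bar f = [-1,1]$, and the small tolerance in `f_bar` is just floating-point hygiene): the double conjugate of $\bar f$, computed on a grid, reproduces $|x|$ at the test points.

```python
import numpy as np

# Scalar case: f_bar(x) = card(x) on dom f_bar = [-1, 1], +infinity outside.
def f_bar(x):
    return np.where(np.abs(x) > 1e-12, 1.0, 0.0)  # card; dom handled by the grid

xs = np.linspace(-1.0, 1.0, 2001)     # grid for dom f_bar = [-1, 1]
ys = np.linspace(-50.0, 50.0, 10001)  # wide grid standing in for all of R

# First conjugate: f_bar*(y) = sup_{|x|<=1} (x*y - card(x)) = max(0, |y| - 1)
conj = np.array([np.max(xs * y - f_bar(xs)) for y in ys])

# Second conjugate (the envelope over [-1,1]): f_bar**(x) = sup_y (x*y - f_bar*(y))
test_pts = np.array([-1.0, -0.5, 0.0, 0.3, 1.0])
env = np.array([np.max(x * ys - conj) for x in test_pts])

print(env)  # approximately |x| at each point: [1.0, 0.5, 0.0, 0.3, 1.0]
```

Running the same computation without the $\|x\|_\infty \le 1$ restriction (i.e., widening the `xs` grid) drives the envelope toward zero, which matches the claim above that the unrestricted envelope is uninteresting.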
$$\sup_{\substack{x/t\,\in\, dom~f\\ t>0}} t\Big(\dfrac{y^Tx}{t}+s-f\Big(\dfrac{x}{t}\Big)\Big) = \sup\Big\{tv \;\Big|\; v =\sup_{u\in dom~f}\big(y^Tu+s-f(u)\big)\text{ and }\color{red}{t > 0}\Big\}$$
Edit to address a question in the comments
What does $$\sup_{\substack{x/t\,\in\, dom~f\\ t>0}} t\Big(\dfrac{y^Tx}{t}+s-f\Big(\dfrac{x}{t}\Big)\Big)$$ mean?
It means we form a set $$A =\left\{t\Big(\dfrac{y^Tx}{t}+s-f\Big(\dfrac{x}{t}\Big)\Big) \,\middle|\, t > 0,\ \frac xt \in dom~f\right\}$$
Then we find its supremum.
But that is a bit ungainly. After all, everywhere that $x$ occurs in the definition of $A$, it is as a part of $x/t$. So why not just use that value directly instead of expressing it in terms of this other arbitrary value $x$? So I replaced $x/t$ with $u$ to get
$$A = \left\{t(y^Tu+s-f(u))\mid t > 0, u \in dom~f\right\}$$
For the next step, I just made use of this equivalence:
$$\sup\{\phi(r,s)\mid r\in A, s\in B\} = \sup\{\sup\{\phi(r,s)\mid s\in B\}\mid r \in A\}$$
where $\phi : A\times B \to \Bbb R$. This is easy enough to see: for any value $p$ in the LH set, $p = \phi(r,s)$ for some $r,s$, and so $p\le \sup\{\phi(r,s)\mid s\in B\} \le \sup\{\sup\{\phi(r,s)\mid s\in B\}\mid r \in A\}$. Thus the LH side is $\le$ the RH side. Conversely, for any value $h$ less than the RH side, there must be an $r \in A$ with $\sup\{\phi(r,s)\mid s\in B\} > h$, which in turn means there is an $s \in B$ with $\phi(r,s) > h$, which means that the LH side is also $> h$. Hence they must be equal.
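For finite sets, the iterated-supremum equivalence is easy to check by machine; here is a tiny sketch of my own, with $A$ and $B$ as index sets of a randomly filled matrix standing in for $\phi$:

```python
import numpy as np

# phi on finite sets A (rows) and B (columns); values chosen at random.
rng = np.random.default_rng(0)
phi = rng.normal(size=(6, 8))

lhs = phi.max()              # sup over all pairs (r, s) at once
rhs = phi.max(axis=1).max()  # inner sup over s in B, then outer sup over r in A

assert lhs == rhs  # the two iterated suprema agree
```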
Of course this leaves you with a problem, which wasn't apparent in my original answer, because I wrote down the wrong thing. You will note that originally I had $tv \in dom~f$. I had meant this to be $t > 0$ (the correction is now highlighted in red, if your browser shows colored MathJax), but in my various edits of the formula I had accidentally left in the wrong condition and deleted the correct one.
But in this form, it is quite clear. Since $t$ can be arbitrarily large, the common value of these two expressions is $\infty$ (assuming the inner supremum $v$ is positive). Once my version makes this obvious, you can check that it holds in your version as well: pick any $u \in dom~f$ with $y^Tu + s - f(u) > 0$, and for any $t > 0$ take $x = tu$; letting $t$ grow makes the supremum as large as you like.
There are two possibilities here. Either the point of this exercise is for you to discover that the convex conjugate of a perspective function is infinite whenever the inner supremum is positive, or else, by the conjugate of the perspective function, they mean the conjugate taken with $t$ held constant:
$$ g^*(y,t)=\sup_{\frac{x}{t}\in dom~f}(t(\dfrac{y^Tx}{t}+s-f(\dfrac{x}{t})))$$
I can't say which is the case for your course.
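For what it's worth, under that second reading the same substitution $u = x/t$ collapses the fixed-$t$ expression to a closed form:
$$\sup_{\frac{x}{t}\in dom~f} t\Big(\dfrac{y^Tx}{t}+s-f\Big(\dfrac{x}{t}\Big)\Big) = t\sup_{u\in dom~f}\big(y^Tu+s-f(u)\big) = t\,\big(f^*(y)+s\big),$$
which is finite precisely when $y \in dom~f^*$, with no unbounded $t$ to cause trouble.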
Best Answer
Proof. Let $C$ be a closed, convex, nonempty subset of a Hilbert space $\mathcal H$. For any $x \in \mathcal H$ one computes $$i_C^*(x) := \sup_{y \in \mathcal H}\left(x^Ty - i_C(y)\right) = \sup_{y \in C}x^Ty,$$ which is precisely the support function $\sigma_C$ of $C$ evaluated at $x$. Finally, $\sigma_C^* = i_C^{**} = i_C$, since $i_C$ is a convex, lower semi-continuous function. $\qquad\Box$
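To make the identity $i_C^* = \sigma_C$ concrete, here is a small numerical sketch (the particular set $C$ and dimension are my own choices, not from the proof): for the box $C = [-1,1]^4$, the support function is the $\ell_1$ norm, and enumerating vertices computes it exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=4)

# C = [-1, 1]^4, a closed convex set. A linear functional attains its
# supremum over a box at one of the 2^4 vertices, so enumerating them
# computes sigma_C(x) = sup_{y in C} x^T y exactly.
vertices = np.array(np.meshgrid(*[[-1.0, 1.0]] * 4)).reshape(4, -1).T
sigma_C = np.max(vertices @ x)

# For the unit box, the support function is the l1 norm.
assert np.isclose(sigma_C, np.abs(x).sum())
```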
Update
As a bonus, the OP requested a proof of the fact that
This result is well known to game theorists and, if I'm not mistaken, was first stated and proved by Nash. It says that
Proof. Let's show that if $y^* \in \Delta_n$ maximizes $x^Ty$, then so does every vertex in its support. For this, it suffices to show that $x_i = x_j$ for all $i,j \in \operatorname{supp}(y^*)$. Indeed, by way of contradiction, suppose $x_i > x_j$ for some $i,j \in \operatorname{supp}(y^*)$. Let $y^*(j \rightarrow i) \in \Delta_n$ be formed from $y^*$ by replacing the $i$th coordinate $y^*_i$ with $y^*_i + y^*_j$ and the $j$th coordinate with $0$. Then $x^Ty^*(j \rightarrow i) - x^Ty^* = y^*_j(x_i - x_j) > 0$, contradicting the optimality of $y^*$. $\qquad\Box$
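The conclusion, that $\sup_{y \in \Delta_n} x^Ty = \max_i x_i$ with the supremum attained at a vertex, is easy to spot-check numerically; the dimension and sampling scheme below are my own choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
x = rng.normal(size=n)

# sigma_{Delta_n}(x) = sup_{y in Delta_n} x^T y: random points of the
# simplex (Dirichlet samples) never beat the best vertex e_i, i = argmax x_i.
samples = rng.dirichlet(np.ones(n), size=10_000)
assert np.all(samples @ x <= x.max() + 1e-12)

# The vertex on the largest coordinate attains the supremum exactly.
best_vertex = np.eye(n)[np.argmax(x)]
assert np.isclose(best_vertex @ x, x.max())
```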