Chapter 1: The endless search
I know it's really difficult to visualise infinite-dimensional cases, but let's take a guided tour of the beautiful infinite-dimensional world.
First, let's try to understand what obstacles we will encounter. The main one is Riesz's lemma and its corollary: the unit sphere of an infinite-dimensional normed space is not compact. I'll give the proof below, because it's very instructive.
Before the proof, let's visualise the search for the value of $d(0, y+Y)$, where $Y$ is an arbitrary closed vector subspace of $X$, and $y \in X \setminus Y$ (to exclude the trivial case). Any closed subspace is the kernel of some bounded linear operator (for instance, of the quotient map $X \to X/Y$). In particular, hyperplanes are obtained from kernels of linear functionals by translations.
Denote by $S_X$ the unit sphere $\{x \in X \mid ||x||=1\}$ centered at zero, and by $S_Y$ the intersection $S_X \cap Y$. Equip both $S_X$ and $S_Y$ with the topologies induced by $||\cdot||$.
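Define a function $R : S_Y\times \mathbb{F} \to \mathbb R$ by $R(s, t) = ||ts + y||$. $R$ is obviously continuous. It's OK to write a minimum over $t$: since $||ts + y|| \geq |t| - ||y|| > ||y|| = R(s, 0)$ whenever $|t| > 2||y||$, the minimum over $\mathbb F$ coincides with the minimum over the compact set $K = \{t \in \mathbb F \mid |t| \leq 2||y||\}$. The set $K$ does not depend on $s$, and $|R(s,t) - R(s',t)| \leq |t| \, ||s - s'|| \leq 2||y|| \, ||s - s'||$ for $t \in K$. Hence $r(s) = \min_{t \in \mathbb F} R(s,t) = \min_{t \in K} R(s,t)$, $r : S_Y \to \mathbb R$, is also continuous.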
Now that we have our main tool $r$, let's look at what we've created. The function $r$ takes a unit vector $s \in Y$, regarded as a direction starting at $y$, and gives us the minimal distance between the origin and our path as we move along $s$ (this analogy is more convenient in the case $\mathbb F = \mathbb R$, but I hope everyone understands what's going on).
Thus, in order to find $d(0, y+Y)$, one has to compute $\inf_{s \in S_Y} r(s) = d(0, y+Y)$. The infimum exists, since $r$ is non-negative by definition. But it may or may not be attained: $S_Y$ is always compact when $\dim Y < \infty$, and is never compact otherwise.
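To make this search concrete, here is a minimal numerical sketch for $\mathbb F = \mathbb R$. Everything specific in it is my own assumption for illustration: $X = \mathbb R^3$ with the Euclidean norm, $Y$ the $xy$-plane, $y = (1, 2, 3)$, the helper names `norm2` and `r`, and the crude grid over the circle $S_Y$. The inner minimum over $t$ is found by ternary search, which is legitimate here because $t \mapsto ||ts + y||$ is convex.

```python
import math

def norm2(v):
    """Euclidean norm (an assumption of this sketch; any norm could be plugged in)."""
    return math.sqrt(sum(c * c for c in v))

def r(s, y, norm=norm2, iters=100):
    """Approximate min over real t of ||t*s + y|| for a unit vector s."""
    # For |t| > 2||y|| we have ||t*s + y|| > ||y||, so this interval suffices.
    bound = 2.0 * norm(y)
    lo, hi = -bound, bound

    def R(t):
        return norm([t * si + yi for si, yi in zip(s, y)])

    # Ternary search: valid because t -> R(t) is convex.
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if R(m1) <= R(m2):
            hi = m2
        else:
            lo = m1
    return R((lo + hi) / 2.0)

# Hypothetical data: Y is the xy-plane in R^3, so S_Y is a circle.
y = [1.0, 2.0, 3.0]
steps = 720
directions = [
    [math.cos(2 * math.pi * k / steps), math.sin(2 * math.pi * k / steps), 0.0]
    for k in range(steps)
]
best = min(r(s, y) for s in directions)
print(best)
```

In this toy case $d(0, y+Y) = 3$, because adding elements of $Y$ cannot change the third coordinate, and the grid search should return a value very close to that. In finite dimensions the infimum is attained, so a fine enough grid always gets there.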
What happens when the infimum is not attained? The answer is contained in the proof of Riesz's lemma and its corollary.
Let $X$ be a normed space, $Y \subset X$ be a vector subspace, $\varepsilon \geq 0$. A vector $h \in X$ is called an $\varepsilon$-perpendicular to $Y$ iff $||h||=1$ and $d(h, Y) \geq 1 - \varepsilon$. It is denoted $h \perp_\varepsilon Y$.
In a finite-dimensional space, and even in a Hilbert space, $0$-perpendiculars, or just perpendiculars, exist for any proper closed subspace (any finite-dimensional subspace is always closed, by the way). The algorithm to find them is well-known. But in an arbitrary normed space $0$-perpendiculars may not exist even for some closed subspaces, as we'll see in the third chapter. However,
For a proper closed vector subspace $Y$ of a normed space $X$ and an arbitrary $\varepsilon > 0$ there exists an $\varepsilon$-perpendicular to $Y$.
Proof
Take an arbitrary $x \in X \setminus Y$. Denote $d = d(x,Y)$; we have $d > 0$ because $Y$ is closed.
$\forall \delta > 0$ $\exists x_\delta \in Y$ s.t. $d(x, x_\delta) \leq d + \delta$ (this follows from the definition of $d(x,Y)$). Define
$$
h_\delta := \frac{x-x_\delta}{d(x,x_\delta)}.
$$
$\forall y \in Y$ we have
$$
d(h_\delta, y) = \left|\left| \frac{x-x_\delta}{d(x,x_\delta)} - y\right|\right| = \frac{||x - (x_\delta + d(x,x_\delta) y)||}{d(x,x_\delta)} \geq \frac{||x - z||}{d + \delta} \geq \frac{d}{d + \delta},
$$
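where $z = x_\delta + d(x,x_\delta) y \in Y$. Taking the infimum over $y \in Y$, we get
$$
d(h_\delta, Y) \geq \frac{d}{d + \delta} \to 1
$$
as $\delta \to 0$. Since $||h_\delta|| = 1$, it remains to choose $\delta$ so small that $\frac{d}{d + \delta} \geq 1 - \varepsilon$; then $h_\delta \perp_\varepsilon Y$. Q.E.D.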
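The construction in the proof can be replayed numerically. The sketch below is only an illustration under my own assumptions: $X = \mathbb R^2$ with the maximum norm, $Y = \mathrm{span}\{(1,0)\}$, $x = (0,1)$, and the deliberately bad approximant $x_\delta = (1+\delta)\cdot(1,0)$, which still satisfies $d(x, x_\delta) \leq d + \delta$. The helper `dist_to_line` is hypothetical and uses ternary search on the convex map $t \mapsto ||v - tu||$.

```python
def norm_inf(v):
    """Maximum norm on R^2 (an assumption of this sketch)."""
    return max(abs(c) for c in v)

def dist_to_line(v, u, norm=norm_inf, iters=100):
    """Approximate d(v, span(u)) over real scalars."""
    # For |t| > 2||v|| / ||u|| we have ||v - t*u|| > ||v||.
    bound = 2.0 * norm(v) / norm(u)
    lo, hi = -bound, bound

    def g(t):
        return norm([vi - t * ui for vi, ui in zip(v, u)])

    # Ternary search: valid because t -> g(t) is convex.
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) <= g(m2):
            hi = m2
        else:
            lo = m1
    return g((lo + hi) / 2.0)

u = [1.0, 0.0]           # Y = span(u)
x = [0.0, 1.0]           # a point outside Y
d = dist_to_line(x, u)   # d(x, Y), equal to 1 in this example

results = []
for delta in (1.0, 0.1, 0.01):
    x_delta = [(1.0 + delta) * c for c in u]      # element of Y with d(x, x_delta) = 1 + delta
    diff = [a - b for a, b in zip(x, x_delta)]
    length = norm_inf(diff)
    h = [c / length for c in diff]                # h_delta from the proof
    results.append((delta, dist_to_line(h, u), d / (d + delta)))

for delta, dist_h, lower_bound in results:
    print(delta, dist_h, lower_bound)
```

In this particular example the distance $d(h_\delta, Y)$ coincides with the lower bound $\frac{d}{d+\delta}$, so the estimate from the proof cannot be improved in general.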
Let $X$ be an infinite-dimensional normed space. Then $S_X$ is not compact.
Proof
There exists a chain of vector subspaces
$$
0 = X_0 \subset X_1 \subset X_2 \subset \dots
$$
in our space $X$ s.t. $\forall n \in \mathbb N$ $\dim X_n = n$. Otherwise $X$ is finite-dimensional.
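Notice that $\forall n \in \mathbb N$ $X_n$ is finite-dimensional and hence automatically closed, and it is a proper subspace of $X_{n+1}$. Applying Riesz's lemma inside $X_{n+1}$, we see that $\forall n \in \mathbb N$ $\exists h_n \in X_{n+1}$ s.t. $h_n \perp_{\frac 12} X_n$. Hence $\forall n \in \mathbb N$ $h_n \in S_X$, and for $i < j$ we have $h_i \in X_{i+1} \subset X_j$, so $d(h_i, h_j) \geq d(h_j, X_j) \geq \frac 12$. Thus the sequence $(h_n)$ has no convergent subsequence, and $S_X$ is not compact. Q.E.D.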
The last construction looks like an infinite staircase spiralling around the unit sphere. We may go down the stairs endlessly in search of a $0$-perpendicular, but we may never find it. On the other hand, we can get as close to it as we want. Such things happen every now and then, especially in functional analysis.
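A familiar instance of such a staircase, given here only as an illustration and not as the construction from the proof, is the sequence of standard unit vectors $e_n$ in $\ell^1$, $\ell^2$ or $\ell^\infty$. The sketch below checks their pairwise distances on a finite truncation; the truncation length `count` and all helper names are my assumptions.

```python
import itertools
import math

def unit_vector(n, dim):
    """Standard unit vector e_n, truncated to the first `dim` coordinates."""
    return [1.0 if k == n else 0.0 for k in range(dim)]

# Three classical norms; the vectors are finite truncations of sequences.
NORMS = {
    "l1": lambda v: sum(abs(c) for c in v),
    "l2": lambda v: math.sqrt(sum(c * c for c in v)),
    "linf": lambda v: max(abs(c) for c in v),
}

def min_pairwise_distance(norm, count=20):
    """Smallest distance between two distinct unit vectors among the first `count`."""
    vectors = [unit_vector(n, count) for n in range(count)]
    return min(
        norm([a - b for a, b in zip(u, v)])
        for u, v in itertools.combinations(vectors, 2)
    )

distances = {name: min_pairwise_distance(norm) for name, norm in NORMS.items()}
for name, value in distances.items():
    print(name, value)
```

The distances are $2$, $\sqrt 2$ and $1$ respectively, all at least $\frac 12$, and they do not depend on the truncation length. So no subsequence of $(e_n)$ can converge in any of these three spaces.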
Chapter 2: Between two planes
Let's go back to hyperplanes. Suppose there exists a perpendicular $h$ to $H_1$. Then $d(0, H_1) = |d|$, where $d \in \mathbb F$ is the number such that $dh \in H_1$. In other words, $d(0, H_1)$ is the usual distance on $\mathrm{span}(h)$ (which is $1$-dimensional) between the origin and the intersection point $\mathrm{span}(h) \cap H_1$. Notice also that $d(0, H_1)$ is the distance between the two parallel planes $H_1$ and $\ker f$, and that's exactly the length of the "segment" between them on any line ($1$-dimensional subspace) perpendicular to them.
However, this is not the general case, as we already know. In general, we have only $\varepsilon$-perpendiculars for $\varepsilon > 0$. But we can take $h_\varepsilon \perp_\varepsilon H_1$ and consider $d(0, K_\varepsilon)$, where $K_\varepsilon = \mathrm{span}(h_\varepsilon) \cap H_1$ is a bounded and closed, hence compact, subset of $\mathrm{span}(h_\varepsilon) \simeq \mathbb F$. So $d(0, K_\varepsilon) = \min_{x \in K_\varepsilon} ||x||$. Actually, we don't even need to take any particular point of $K_\varepsilon$, since the diameter of $K_\varepsilon$ tends to $0$ as $\varepsilon \to 0$, but I won't prove this :) Thus, sometimes we are not able to visualise $||f||$, but we can always get as close to our dream as we want.
The next result formalises the two preceding paragraphs.
Let $X$ be a normed space, let $f \in X^* \setminus \{0\}$, and let $X_0 = \ker f$. Then there exists a $0$-perpendicular to $X_0$ in $X$ if and only if $f$ is norm-attaining.
Proof
From now on we assume $||f||=1$ without loss of generality (take $||f||^{-1}f$ otherwise).
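Let $x$ be a $0$-perpendicular to $X_0$. Then $f(x) \neq 0$, because $d(x, X_0) = 1 > 0$, and multiplying $x$ by a scalar of modulus $1$ we may assume $f(x) > 0$. Then $\forall y \in X$ $\exists x_0 \in X_0$ such that $y = \frac{f(y)}{f(x)}x + x_0$. For $\lambda \neq 0$ we have $||\lambda x + x_0|| = |\lambda| \, ||x + \lambda^{-1} x_0|| \geq |\lambda| \, d(x, X_0)$, and for $\lambda = 0$ this estimate is trivial. So $\forall y \in S_X$
$$
1 = ||y|| = \left|\left|\frac{f(y)}{f(x)}x + x_0\right|\right| \geq \left|\frac{f(y)}{f(x)}\right| d(x, X_0) = \left|\frac{f(y)}{f(x)}\right|.
$$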
Hence $\forall y \in S_X$
$$
f(x) \geq |f(y)|,
$$
so
$$
1 = ||f|| = f(x).
$$
Conversely, let $f$ attain its norm at $x \in S_X$, i.e. $|f(x)| = ||f|| = 1$. Then $\forall x_0 \in X_0$
$$
||x - x_0|| = ||f|| ||x - x_0|| \geq |f(x - x_0)| = |f(x)| = 1.
$$
Since $1 = ||x|| = ||x - 0||$ ($0 \in X_0$), $x$ is a $0$-perpendicular to $X_0$. Q.E.D.
Moreover, both norm-attaining and non-norm-attaining functionals are easy to encounter. Norm-attaining functionals are the only inhabitants of duals of finite-dimensional spaces. The infinite-dimensional case provides more diversity, as usual.
There exists a functional which does not attain its norm.
Proof
Indeed, consider a sequence of numbers
$$
\alpha =(\alpha_n)_{n \in \mathbb N} = \left(1 - \dfrac{1}{n}\right)_{n \in \mathbb N}
$$
and the corresponding linear functional $f_\alpha : (c_{00}, ||\cdot||_1) \to \mathbb{C}$:
$$
f_\alpha (x) = \sum_{n = 1}^\infty \alpha_n x_n.
$$
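This definition is obviously correct: every $x \in c_{00}$ has only finitely many non-zero terms, so the sum is finite. It's also clear that $||f_\alpha|| = 1$: on the one hand $|f_\alpha(x)| \leq \sum_n \alpha_n |x_n| \leq ||x||_1$, on the other hand $f_\alpha(e_n) = 1 - \frac 1n \to 1$ for the standard unit vectors $e_n$. But $f_\alpha$ is not norm-attaining: $\alpha_n < 1$ for all $n$, hence $|f_\alpha(x)| < ||x||_1$ for every $x \neq 0$. Q.E.D.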
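Here is a small numerical sketch of this example, restricted to real sequences for simplicity. Elements of $c_{00}$ are stored as finite Python lists; the names `f_alpha`, `norm1` and `e`, and the random test vectors, are my assumptions for illustration only.

```python
import random

def f_alpha(x):
    """f_alpha(x) = sum over n of (1 - 1/n) * x_n, where x[0] stores x_1, x[1] stores x_2, ..."""
    return sum((1.0 - 1.0 / n) * xn for n, xn in enumerate(x, start=1))

def norm1(x):
    """The l^1 norm of a finitely supported sequence."""
    return sum(abs(xn) for xn in x)

def e(n):
    """The n-th standard unit vector, stored as a list of length n."""
    return [0.0] * (n - 1) + [1.0]

# Along the unit vectors the values of f_alpha approach the norm 1 ...
values = [f_alpha(e(n)) for n in (1, 10, 100, 1000)]
print(values)

# ... but |f_alpha(x)| / ||x||_1 stays strictly below 1 for non-zero x.
random.seed(0)  # arbitrary seed, only for reproducibility
ratios = []
for _ in range(1000):
    x = [random.uniform(-1.0, 1.0) for _ in range(50)]
    ratios.append(abs(f_alpha(x)) / norm1(x))
print(max(ratios))
```

The values $f_\alpha(e_n) = 1 - \frac 1n$ creep up to $1$, while every ratio stays below $1$. Of course, a random experiment proves nothing; it only shows what the proof predicts.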
Conclusion
Sometimes we don't get exactly what we want, but, I believe, we get something even better in this case. Otherwise math would be much more boring.
I hope I gave you some good insight. Anyway, I was happy to help and I'm always ready to answer your questions and to improve my post.
Best Answer
One way is the total number of leaves of a (single) rooted tree in which each leaf is minimally linked to the root by exactly $n-1$ edges, and which has the following property: the root has $2$ children, each child of the root has $3$ children, each child of each child of the root has $4$ children, and so on until the leaves are reached. A natural term for this is factorial tree, but I don't know if this phrase is in general use for this notion.
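As a quick sanity check of this description, here is a short sketch that counts the leaves by walking such a tree recursively. The function name `count_leaves` is hypothetical, and the node at depth $k$ is assumed to have $k + 2$ children, exactly as described above.

```python
import math

def count_leaves(n, depth=0):
    """Count the leaves of the tree whose leaves sit n - 1 edges below the root."""
    if depth == n - 1:
        return 1  # this node is a leaf
    # A node at depth k has k + 2 children: 2 for the root, 3 for its children, ...
    return sum(count_leaves(n, depth + 1) for _ in range(depth + 2))

leaf_counts = {n: count_leaves(n) for n in range(1, 8)}
print(leaf_counts)
```

The count agrees with $n!$, since the number of leaves is the product $2 \cdot 3 \cdots n$.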
For example, for $n = 4$: