To answer your first question: the bounded operators on a Banach space form a Banach algebra, i.e. they carry a "multiplication", namely composition of operators. In any algebra $A$ with a unit (this can be generalized to algebras without a unit, but that is not needed here), one can define the spectrum of an element as
$$\sigma(a)=\{\lambda\in\mathbb{C}: \lambda1_A-a\text{ is not invertible in } A\} $$
This can be done in any algebra. Why is this set interesting? Note, for example, that the $n\times n$ matrices form a Banach algebra as well, and elementary linear algebra (or compact operator theory) shows that the spectrum of a matrix is exactly its set of eigenvalues. Likewise, the space $C(X)$ of continuous functions on a compact Hausdorff space is a Banach algebra, and there the spectrum of a function is its range. So the spectrum unifies important characteristics of elements of different algebras in a single notion.
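For the matrix case this is easy to check numerically. Below is a small sketch (the matrix is a hypothetical example): $\lambda$ lies in the spectrum exactly when $\lambda 1_A - a$, here $\lambda I - A$, fails to be invertible, i.e. has determinant zero.

```python
import numpy as np

# Hypothetical example: in the Banach algebra of n x n matrices,
# the spectrum of A is its set of eigenvalues.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
eigenvalues = np.linalg.eigvals(A)  # spectrum of A: {2, 3}

# lam in sigma(A)  <=>  lam*I - A is not invertible (determinant 0)
for lam in eigenvalues:
    assert abs(np.linalg.det(lam * np.eye(2) - A)) < 1e-9

# a point outside the spectrum gives an invertible lam*I - A
assert abs(np.linalg.det(5.0 * np.eye(2) - A)) > 1e-9
```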
Note that no one assures us that in an arbitrary algebra the spectrum is non-empty. A very important result of Gelfand is that in a (complex, unital) Banach algebra the spectrum is always non-empty. So the quantity $r(a)=\max_{\lambda\in\sigma(a)}|\lambda|$, the spectral radius, is always well defined and indeed interesting. Moreover, the spectrum is compact (this is relatively easy) and contained in the closed disk $\overline{D}(0,\|a\|)\subset\mathbb{C}$.
Now the question of interest: how does one estimate the spectral radius of an element? As noted above, a first estimate is $r(a)\leq\|a\|$.
If $p(z)=c_0+c_1z+\dots+c_nz^n\in\mathbb{C}[z]$ is a polynomial and $a\in A$ is an element of a unital Banach algebra, set $p(a):=c_01_A+c_1a+\dots+c_na^n$. Using the fundamental theorem of algebra and the fact that, for two commuting elements, the product is invertible iff both factors are, one obtains the interesting equation $\sigma(p(a))=p(\sigma(a))$, i.e. the spectrum of $p(a)$ is the image of $\sigma(a)$ under $p$. In particular, if $\lambda\in\sigma(a)$ and $n\in\mathbb{N}$, then $\lambda^n\in\sigma(a^n)$, hence $|\lambda|^n\leq r(a^n)\leq\|a^n\|$, and thus $|\lambda|\leq \|a^n\|^{1/n}$. Taking the supremum over $\lambda\in\sigma(a)$ and then the limit inferior over $n$ yields $r(a)\leq\liminf_{n\to\infty}\|a^n\|^{1/n}$. I believe this is enough to show why one would expect this limit to exist and to equal $r(a)$: people probably could not find any counterexample to this guess (which is reasonable in view of the estimate), until Gelfand and Beurling proved the formula:
$$r(a)=\lim_{n\to\infty}\|a^n\|^{1/n}.$$
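For matrices the formula is easy to watch converge numerically. A minimal sketch, using a hypothetical $2\times 2$ example whose spectral radius is $1$ while $\|A\|_2\approx 1.618$:

```python
import numpy as np

# Gelfand-Beurling formula, checked numerically for a matrix:
# ||A^n||^(1/n) -> r(A) as n -> infinity.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
spectral_radius = max(abs(np.linalg.eigvals(A)))  # r(A) = 1

# ||A^n||^(1/n) for increasing n; ord=2 is the operator 2-norm
estimates = [np.linalg.norm(np.linalg.matrix_power(A, n), 2) ** (1.0 / n)
             for n in (1, 2, 8, 32, 128)]
# the first entry is the crude bound r(A) <= ||A||; later entries
# decrease toward r(A) = 1
```

Note how slow the convergence can be: $\|A^n\|_2$ grows like $n$ here, so the estimate behaves like $n^{1/n}$, which tends to $1$ only logarithmically.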
Hope this helps.
The statement you have made is not true. For example, consider the matrix
$$
X = \pmatrix{10 & 100\\1 & 10}.
$$
Its eigenvalues are $0$ and $20$, so $\rho(X) = 20$. On the other hand, the largest singular value of $X$ is $\|X\|_2 = 101 > \rho(X)$.
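The counterexample is easy to verify numerically; a quick sketch:

```python
import numpy as np

# The counterexample above: spectral radius 20, operator 2-norm 101.
X = np.array([[10.0, 100.0],
              [1.0, 10.0]])
rho = max(abs(np.linalg.eigvals(X)))  # spectral radius: eigenvalues 0, 20
norm2 = np.linalg.norm(X, 2)          # largest singular value of X
# rho = 20, norm2 = 101, so ||X||_2 > rho(X)
```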
Best Answer
The (operator) norm of a matrix is defined as \begin{equation} \|A\| = \sup_{\|u\| = 1} \|Au\| \end{equation} Taking the singular value decomposition of the matrix $A$, we have \begin{equation} A = VD W^T \end{equation} where $V$ and $W$ are orthogonal and $D$ is a diagonal matrix. Since $V$ is orthogonal, $\|Av\| = \|DW^Tv\|$ for every vector $v$; and since $W^T$ maps the unit sphere onto itself, maximizing $\|Av\|$ over unit vectors is the same as maximizing $\|Dv\|$ over unit vectors $v$.
By the definition of singular value decomposition, $D$ will have the singular values of $A$ on its main diagonal and will have zeros everywhere else. Let $\lambda_1, \ldots, \lambda_n$ denote these diagonal entries so that
\begin{equation} D = \left(\begin{array}{cccc} \lambda_1 & 0 & \ldots & 0 \\ 0 & \lambda_2 & \ldots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \ldots & \lambda_n \end{array}\right) \end{equation}
Taking some $v = (v_1, v_2, \ldots, v_n)^T$, the product $Dv$ takes the form \begin{equation} Dv = \left(\begin{array}{c} \lambda_1v_1 \\ \vdots \\ \lambda_nv_n \end{array}\right) \end{equation} Maximizing the norm of this is the same as maximizing the norm squared. So we are maximizing the sum \begin{equation} S = \sum_{i=1}^{n} \lambda_i^2v_i^2 \end{equation} under the constraint that $v$ is a unit vector (i.e., $\sum_i v_i^2 = 1$). The maximum is attained by taking the largest $\lambda_i^2$, setting its corresponding $v_i$ to $1$, and setting every other $v_j$ to $0$. Then the maximum of $S$ (which is the norm squared) is the square of the largest singular value of $A$. Taking the square root, $\|A\|$ equals the largest singular value of $A$ (which coincides with the absolutely largest eigenvalue only for special matrices, e.g. symmetric ones).
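The argument above can be sanity-checked numerically: the operator 2-norm agrees with the top singular value, and a brute-force search over random unit vectors never exceeds it. A minimal sketch with a randomly generated (hence hypothetical) matrix:

```python
import numpy as np

# Check: ||A||_2 equals the largest singular value of A.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))

sigma = np.linalg.svd(A, compute_uv=False)  # singular values, descending
operator_norm = np.linalg.norm(A, 2)        # sup over unit u of ||Au||

# brute-force the supremum over many random unit directions
best = max(np.linalg.norm(A @ (v / np.linalg.norm(v)))
           for v in rng.standard_normal((2000, 3)))
# best approaches operator_norm from below and never exceeds it
```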