$A$ has the same eigenvalues as its transpose, which I will denote by $B$. For $B$, the hypothesis means that $\sum_{j=1}^n|a_{ij}|\leq 1$ for all $i$. If $x\neq 0$ is an eigenvector for $\lambda$, and $i$ is such that $|x_i|=\lVert x\rVert_{\infty}$, then we have $\sum_{j=1}^na_{ij}x_j=\lambda x_i$, hence $\sum_{j\neq i}a_{ij}x_j=(\lambda-a_{ii})x_i$ and
$$\lVert x\rVert_{\infty}|\lambda-a_{ii}|\leq \sum_{j\neq i}|a_{ij}|\cdot |x_j|\leq \sum_{j\neq i}|a_{ij}|\cdot\lVert x\rVert_{\infty}\leq (1-|a_{ii}|)\cdot\lVert x\rVert_{\infty}.$$
As $x\neq 0$, we get $|\lambda-a_{ii}|\leq 1-|a_{ii}|$ hence $|\lambda|\leq 1$.
(This is essentially the proof of the Gershgorin circle theorem.)
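As a quick numerical sanity check (a sketch, not part of the proof), one can draw a random matrix, rescale its rows so that $\sum_j|a_{ij}|\le 1$, and confirm that its spectral radius does not exceed $1$; the column-sum version follows by transposing:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n))
# Rescale each row so that sum_j |a_ij| < 1 (the hypothesis above,
# stated for rows of B = A^T).
A /= 1.01 * np.abs(A).sum(axis=1, keepdims=True)
radius = np.abs(np.linalg.eigvals(A)).max()
assert radius <= 1.0  # the bound |lambda| <= 1 proved above
```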
I don't have an answer to your question, but here are some thoughts.
If your normal distribution can generate a $B$ whose spectral radius is $\ge1$, then any procedure that modifies $B$ to make its spectral radius less than one clearly breaks conformance with the normal distribution. So I assume that you want to modify $B$ so that it follows a conditional distribution derived from the normal distribution.
This already seems to me a difficult problem, but the difficulty goes further: when $p$ and $M$ are large, any such procedure necessarily draws samples from the tails of the normal distribution, i.e. it mostly uses extreme events as samples.
Here is an explanation. Let $v=(v_1^T,v_2^T,\ldots,v_p^T)^T$ be an eigenvector of $B$, partitioned into $p$ blocks of length $M$. If $v$ corresponds to the zero eigenvalue, then the equation $Bv=0$ means
$$
\begin{pmatrix}
A_1 & A_2 & \ldots & A_{p-1} & A_p\\
I & I & \ldots & I & 0
\end{pmatrix}v=0
$$
The rank of the matrix on the LHS is almost surely $2M$, so its null space is $(p-2)M$-dimensional and $B$ almost surely has $2M$ nonzero eigenvalues.
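A small numerical experiment illustrates this count (a sketch; the sizes $M$ and $p$ are arbitrary choices of mine). The matrix $B$ has the top block row $(A_1\;\ldots\;A_p)$ and $p-1$ identical block rows $(I\;\ldots\;I\;0)$, so its rank is $2M$ almost surely and only $2M$ eigenvalues are nonzero:

```python
import numpy as np

rng = np.random.default_rng(1)
M, p = 3, 5  # arbitrary small sizes for illustration
A = [rng.standard_normal((M, M)) for _ in range(p)]
I, Z = np.eye(M), np.zeros((M, M))
# Top block row (A_1 ... A_p); the other p-1 block rows are (I ... I 0).
top = np.hstack(A)
mid = np.hstack([I] * (p - 1) + [Z])
B = np.vstack([top] + [mid] * (p - 1))

assert np.linalg.matrix_rank(B) == 2 * M  # rank 2M almost surely
eigs = np.linalg.eigvals(B)
nonzero = int(np.sum(np.abs(eigs) > 1e-8))
assert nonzero == 2 * M  # generically exactly 2M nonzero eigenvalues
```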
Now, suppose $\lambda$ is a nonzero eigenvalue of $B$. From $Bv=\lambda v$, we get
\begin{align}
A_1v_1 + \sum_{i=2}^p A_iv_i &= \lambda v_1,\\
v_1 + v_2+v_3+\ldots+v_{p-1} &= \lambda v_2,\\
v_1 + v_2+v_3+\ldots+v_{p-1} &= \lambda v_3,\\
&\vdots\\
v_1 + v_2+v_3+\ldots+v_{p-1} &= \lambda v_p.
\end{align}
In other words, since $\lambda\neq0$, we have $v_2=v_3=\cdots=v_p=u$ for some vector $u$, $v_1 = [\lambda-(p-2)]u$, and
\begin{align}
A_1[\lambda-(p-2)]u + \sum_{i=2}^p A_iu = \lambda [\lambda-(p-2)]u,\\
\textrm{i.e. }\quad
C_\lambda u:=\left(\lambda^2 I - \lambda [(p-2) I + A_1] + \left[(p-2)A_1 - \sum_{i=2}^p A_i\right]\right)u = 0.
\end{align}
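This reduction can be checked numerically: every nonzero eigenvalue $\lambda$ of $B$ should make $C_\lambda$ singular. A sketch (sizes and tolerances are my choices):

```python
import numpy as np

rng = np.random.default_rng(2)
M, p = 3, 5
A = [rng.standard_normal((M, M)) for _ in range(p)]
I, Z = np.eye(M), np.zeros((M, M))
B = np.vstack([np.hstack(A)]
              + [np.hstack([I] * (p - 1) + [Z])] * (p - 1))

eigs = np.linalg.eigvals(B)
smins = []
for lam in eigs[np.abs(eigs) > 1e-8]:
    # C_lambda = lambda^2 I - lambda[(p-2)I + A_1] + [(p-2)A_1 - sum_{i>=2} A_i]
    C = (lam**2 * I - lam * ((p - 2) * I + A[0])
         + ((p - 2) * A[0] - sum(A[1:])))
    # smallest singular value of C_lambda should be numerically zero
    smins.append(np.linalg.svd(C, compute_uv=False)[-1])
assert max(smins) < 1e-8
```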
Let $q(\lambda) = \det\,C_\lambda$. Then $q(\lambda)=0$ is a degree-$2M$ polynomial equation in $\lambda$. The sum $S$ of its $2M$ roots is the negative of the coefficient of $\lambda^{2M-1}$ in the monic polynomial $q(\lambda)$, i.e. $S = \mathrm{trace}\left((p-2) I + A_1\right) = M(p-2)+\mathrm{trace}\,A_1$. For these $2M$ roots to all have moduli smaller than $1$, $|S|$ must be smaller than $2M$, i.e. $\left|(p-2)+\frac{\mathrm{trace}\,A_1}{M}\right|<2$. When $M$ is large, however, $\frac{\mathrm{trace}\,A_1}{M}$ will be distributed around the mean $\mu$ of the normal distribution, with variance of order $O(1/M)$. Since $\left|(p-2)+\mu\right|\ge2$ when $p$ is large, we conclude that, when both $M$ and $p$ are large, the condition $|S|<2M$ is satisfied only when $\frac{\mathrm{trace}\,A_1}{M}$ takes an extreme value.
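The identity for $S$ is just the trace of $B$: the zero eigenvalues contribute nothing to the sum, and the diagonal of $B$ consists of $A_1$ followed by $p-2$ identity blocks (the last diagonal block is $0$). A quick check, with sizes chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(3)
M, p = 4, 6
A = [rng.standard_normal((M, M)) for _ in range(p)]
I, Z = np.eye(M), np.zeros((M, M))
B = np.vstack([np.hstack(A)]
              + [np.hstack([I] * (p - 1) + [Z])] * (p - 1))

# Sum of all eigenvalues = trace(B); the zero eigenvalues add nothing,
# so this equals the sum S of the 2M nonzero roots of q.
S = np.linalg.eigvals(B).sum().real
expected = M * (p - 2) + np.trace(A[0])
assert np.isclose(S, expected)
```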
Instead of summing the squares of the elements in a column or row, sum the absolute values of the elements in each row. If this sum is less than $1$ for every row, you are done; the same works for columns. These quantities are the induced $\infty$-norm and $1$-norm, respectively, and the spectral radius never exceeds an induced norm.
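In NumPy terms (a sketch; these are the standard induced matrix norms), the row-sum and column-sum bounds read:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5))
row_norm = np.abs(A).sum(axis=1).max()  # induced infinity-norm (max row sum)
col_norm = np.abs(A).sum(axis=0).max()  # induced 1-norm (max column sum)
radius = np.abs(np.linalg.eigvals(A)).max()
# The spectral radius never exceeds any induced norm.
assert radius <= row_norm + 1e-12
assert radius <= col_norm + 1e-12
# The hand computations agree with np.linalg.norm.
assert np.isclose(row_norm, np.linalg.norm(A, np.inf))
assert np.isclose(col_norm, np.linalg.norm(A, 1))
```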