Let's assume the constraint functions $g_1, \ldots, g_m$ are smooth ($C^\infty$), though this can be relaxed.
The rows of the Jacobian $J_g(X)$ are the gradients of the constraints at $X$; let me write them as $\nabla g_1(X), \ldots, \nabla g_m(X)$.
If these gradients are linearly independent (that is, if the Jacobian has full row rank), then they remain linearly independent in a neighborhood of $X$. To see this, consider the determinant of $J_g(Y)\,J_g(Y)^\top$ as a function of the base point $Y$: it is nonzero at $Y=X$ and continuous, so it remains nonzero for all $Y$ in a neighborhood of $X$.
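For a quick illustration of this argument, take the single constraint $g(X)=\|X\|^2-1$ on $\mathbb{R}^n$: then $J_g(X)=2X^\top$, so $\det\big(J_g(X)\,J_g(X)^\top\big)=4\|X\|^2$, which equals $4$ at every feasible point and stays nonzero in a neighborhood.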
It is a fact from differential geometry that, under the conditions described above, the set of solutions to the constraints is, locally around $X$, a smooth manifold, that is, a smooth surface: at $X$, this surface has a well-defined tangent space (a linearization) and a normal space (the orthogonal complement of the tangent space with respect to the ambient inner product).
The normal space is spanned exactly by the gradients of the constraints. In other words: vectors of the form $\sum_{j=1}^m \lambda_j \nabla g_j(X)$ are vectors normal to the surface, and all vectors normal to the surface can be written (uniquely) in that way.
This gives a nice interpretation to the Lagrangian formalism: $X$ is stationary exactly if the gradient of $f$ at $X$ is in the normal space to the surface. Intuitively, this should make sense: if $\nabla f(X)$ had a nonzero component tangent to the surface, then moving along the surface in that direction would change the value of $f$, so $X$ could not be stationary.
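As a concrete example (mine, for illustration): minimize $f(x,y)=x+y$ subject to $g(x,y)=x^2+y^2-1=0$. The normal space at $(x,y)$ is spanned by $\nabla g(x,y)=(2x,2y)$, so stationarity means $(1,1)=\lambda\,(2x,2y)$ for some $\lambda$, which happens exactly at $(x,y)=\pm\big(\tfrac{1}{\sqrt2},\tfrac{1}{\sqrt2}\big)$: the maximizer and the minimizer of $f$ on the circle.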
When the gradients are not linearly independent (that is, when the Jacobian does not have full row rank), the discussion above does not go through: we can no longer be certain that the search space (the set of solutions of the constraints) looks like a smooth surface around $X$. It might look weird there (kinks, angles, cusps, ...), and it is then much harder to say precisely what is going on. The simplest results in typical introductory optimization courses avoid these complications by simply assuming linear independence of the gradients. That is not to say that things necessarily go wrong when you don't have linear independence, but they can; and at any rate, the situation requires more care.
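A standard example of this failure: the single constraint $g(x,y)=y^2-x^3=0$ defines a curve with a cusp at the origin, and there $\nabla g(0,0)=(0,0)$, so even this one-element family of gradients fails to be linearly independent.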
Since the image of $A^{k+1}=A^k\,A$ is contained in the image of $A^k$, the function $k\longmapsto \text{rank}(A^k)$ is non-increasing. As the only possible values are $\{0,1,\ldots,n\}$, the numbers $\text{rank}(A^k)$ have to stabilize after a finite number of steps.
If the rank of $A^{k+1}$ is equal to the rank of $A^k$, this forces the image of $A^{k+1}$ to be equal to that of $A^k$ (a subspace of the same dimension). Applying $A$ then gives $\text{im}(A^{k+2})=A\big(\text{im}(A^{k+1})\big)=A\big(\text{im}(A^{k})\big)=\text{im}(A^{k+1})$, and inductively the rank stabilizes from that point on.
So the sequence of ranks is strictly decreasing (through non-negative integers) until it stabilizes.
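For instance, if $N$ is the $3\times 3$ Jordan block with zero diagonal, then $\text{rank}(N)=2$, $\text{rank}(N^2)=1$, and $\text{rank}(N^k)=0$ for all $k\geq 3$: strictly decreasing, then constant.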
Here is an example that shows that one can start at any rank $h$ and decrease one step at a time until reaching rank $m$. Let $B$ be nonsingular $m\times m$ and $C$ an $(h-m+1)\times (h-m+1)$ Jordan block with zero diagonal. Consider $$ A=\begin{bmatrix}B&0\\0& C\end{bmatrix}. $$ Since $\text{rank}(C^k)=\max\{h-m+1-k,\,0\}$, we get $$\text{rank}(A^k)=\begin{cases}h+1-k,&\ 1\le k\leq h-m+1\\ m,&\ k>h-m+1\end{cases}$$ so $\text{rank}(A)=h$ and the rank drops by one at each step until it stabilizes at $m$.
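If you want to check this numerically, here is a small sketch in NumPy (the concrete values of $h$ and $m$, the choice $B=I$, and the loop bound are my own, not part of the construction above):

```python
import numpy as np

# Sanity check of the block construction above:
# h = rank of A itself, m = rank at which the powers stabilize.
h, m = 5, 2

B = np.eye(m)                      # any nonsingular m x m matrix works
s = h - m + 1                      # size of the nilpotent Jordan block
C = np.diag(np.ones(s - 1), k=1)   # s x s Jordan block with zero diagonal
A = np.block([[B, np.zeros((m, s))],
              [np.zeros((s, m)), C]])

Ak = A.copy()
for k in range(1, h - m + 4):
    # prints ranks h, h-1, ..., m, then m from there on
    print(k, np.linalg.matrix_rank(Ak))
    Ak = Ak @ A
```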