Hint: the basic assumption is that the Lagrangian $L(q(x),q'(x),x)$ is differentiable at $y:=(q(x),q'(x),x)$ and the variation $\delta L$ expresses the differentiability, in a suitable limit.
In other words, we are considering the increment ( $h$ is just a vector in $\mathbb R^3$, for all $x$, but the components $\delta q$ and $\delta q'$ are functions of $x$)
$$h:=(\delta q,\delta q',0)$$
and looking at
$$L(y+h)-L(y)=\langle \nabla L(y),h\rangle+O(\|h\|);~~(*)$$
the gradient $\nabla L(y)$ of $L$ at $y$ is
$$\nabla L(y)=\left(\frac{\partial L}{\partial q},\frac{\partial L}{\partial q'},\frac{\partial L}{\partial x}\right) $$
and, by definition,
$$y+h=(q(x)+\delta q,q'(x)+\delta q',x).$$
- Remark: boundary conditions
In variational problems, the vector $h$ is not completely arbitrary: in fact, if we consider small variations _(i.e. $\delta q$) of the function $q:x\mapsto q(x)$ we need to impose boundary conditions on the components of $h$: such conditions are depending on the specific variational problem under exam. They are motivated by the variational approach and usually "kill" unfriendly boundary terms (at least in classical variational problems: in quantum field theory one can accept and subsequently work with boundary terms).
For example, in presence of the variational problem
$$S[q]:=\int_a^b L(q(x),q'(x),x)dx $$
we would be interested in all small variations $\delta q$ of $q=q(x)$ s.t. $\delta q(a)=\delta q(b)=0$.
If we expand $(*)$ using the definition of gradient and scalar product, we arrive at
$$L(q(x)+\delta q,q'(x)+\delta q',x)-L(q(x),q'(x),x)=\frac{\partial L}{\partial q}\delta q+\frac{\partial L}{\partial q'}\delta q'+O(\|h\|). $$
$$\delta q:=\epsilon\varphi, $$
$$\delta q':=\epsilon\varphi' $$
where $\varphi$ is any function satisfying suitable boundary conditions, and $\epsilon$ is a parameter. In this setting we are considering the increment
$$h=(\epsilon\varphi,\epsilon\varphi',0)$$
around $y=(q(x),q'(x),x)$ and the first variation $\delta L$ of $L$ is defined as
$$\delta L:=\frac{dL}{d\epsilon}|_{\epsilon=0}:=\lim_{\epsilon\rightarrow 0}
\frac{L(q(x)+\epsilon\varphi,q'(x)+\epsilon\varphi',x)-L(q(x),q'(x),x)-\epsilon\left(\frac{\partial L}{\partial q}\varphi+\frac{\partial L}{\partial q'}\varphi'\right)+O(\epsilon^2)}{\epsilon}. $$
I personally find this notation quite good.
The misconception OP is having is the approach to the Hamiltonian method.
The Hamilton equation $\dot{p_i} = - \frac{\partial H}{\partial q_i}$ does not apply to the "ordinary momentum" $p_i$. It instead applies to the canonically conjugate momentum, which is defined as $p_i = \frac{\partial L}{\partial \dot{q_i}}$.
In simple cases, the part of the Lagrangian that depends on $\dot{q}_i$ is $\frac{1}{2} m_i \dot{q_i}^2$, so the canonically conjugate momentum is indeed $m q_i$. In this case, however, the canonically conjugate momenta are
$p_1 = M \dot{q_1} + m \tan^2\theta (\dot{q_1} + \dot{q_2})$
$p_2 = m (\dot{q_2} + \tan^2 \theta (\dot{q_1} + \dot{q_2}))$
As you can see, there is some annoying entanglement going on here between the two variables. This is reflected by the fact that energy depends on $(q_1 + q_2)^2$, in part.
Best Answer
Assume that the functional (a function/map/operator that has a function as input and a number as output) is given as
$$F[ρ]=\int_\Omega f\Bigl(x,ρ(x),∇ρ(x)\Bigr)\,dx.$$
For the variation in directional-derivative form you then get \begin{align} δF[ρ;ψ]&=\lim_{s\to0}\frac{F[ρ+sψ]-F[ρ]}{s} \\[.3em]&=\int_\Omega \left[ \frac{∂f}{∂ρ}\Bigl(x,ρ(x),\nabla ρ(x)\Bigr)ψ(x)+\frac{∂f}{∂∇ρ}\Bigl(x,ρ(x),\nabla ρ(x)\Bigr)∇ψ(x)\right]\,dx. \end{align} Here $∂ρ$ and $∂∇ρ$ in the partial-derivative symbol just indicate the place in the arguments of $f$ that the partial derivative is taken for, there is no other special magic involved. If one writes $f(x,y,v)$ as the "canonical" arguments of $f$, the partial derivatives could also be written as $\frac{∂f}{∂y}$ and $\frac{∂f}{∂v}$.
One could use $δρ$ as function symbol instead of $ψ$, in a slight abuse of notation $∇δρ$ might then be written as $δ∇ρ$. Or you might declare $δρ(x)[ψ]=ψ(x)$ as the evaluation operator for the derivative direction function $ψ$.
The variation in differential-quotient form $\frac{δF[ρ]}{δρ(x)}$ can now be defined as some limit over the support of $ψ$ shrinking to the point $x$, or more systematically over the demand that $$ δF[ρ;ψ]=\int_\Omega \frac{δF[ρ]}{δρ(x)}δρ(x)[ψ]\,dx+\int_{∂\Omega}... $$ To that end one would have to get rid of the derivative terms $∇ψ$. That is achieved using some variation of the fundamental theorem of calculus, generally called Stokes theorem for differential forms (Greene, Gauss,...). This gives in interior points $$ \frac{δF}{δρ(x)}[ρ]=\frac{∂f}{∂ρ}\Bigl(x,ρ(x),\nabla ρ(x)\Bigr)-∇^*\frac{∂f}{∂∇ρ}\Bigl(x,ρ(x),\nabla ρ(x)\Bigr). $$