This type of problem is generally referred to as constrained optimization. A general technique for solving many such problems is the method of Lagrange multipliers. Here is an example of such a problem solved with Lagrange multipliers, along with a short justification of why the technique works.
Consider the paraboloid given by $f(x,y) = x^2 + y^2$. The global minimum of this surface lies at the origin ($x=0$, $y=0$). If we are given the constraint (a required relationship between $x$ and $y$) that $3x+y=6$, then the origin can no longer be our solution (since $3\cdot 0 + 1 \cdot 0 \neq 6$). Yet there is still a lowest point on the surface among the points satisfying the constraint.
What we have so far:
Objective function: $f(x,y) = x^2 + y^2$,
subject to: $3x+y=6$.
From here we can derive the Lagrange formulation of our constrained minimization problem. This will be a function $L$ of $x$, $y$, and a single Lagrange multiplier $\lambda$ (since we have only a single constraint). It will be this new function that we minimize.
$L(x,y,\lambda) = x^2 + y^2 + \lambda(3x+y-6)$
The Lagrange formulation incorporates our original function along with our constraint(s). On the way toward minimizing $L$, we will have to minimize the objective function $x^2 + y^2$ as well as the contribution from the constraint, which is now weighted by a factor of $\lambda$. If the constraint is met, then the expression $3x+y-6$ is necessarily zero and contributes nothing to the value of $L$. This is the trick of the technique.
Minimizing the Lagrange formulation:
To minimize $L$ we simply find the $x$, $y$, and $\lambda$ values that make its gradient zero. (This is exactly analogous to setting the first derivative to zero in single-variable calculus.)
$\nabla L = 0:$
$\frac{\partial L}{\partial x} = 2x + 3 \lambda = 0$
$\frac{\partial L}{\partial y} = 2y + \lambda = 0$
$\frac{\partial L}{\partial \lambda} = 3x + y - 6 = 0.$
In our example we have arrived at a system of simultaneous linear equations, which can (and should) be solved with matrix algebra. The solution will be a vector holding values for $x$, $y$, and $\lambda$. The lowest value of the objective function, subject to the given constraint, sits at $(x,y,f(x,y))$; the Lagrange multiplier itself does not have an immediate physical interpretation here, though multipliers do acquire meaning in certain contexts (for instance, as shadow prices in economics).
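As a sketch of how this solving step might look in practice (using NumPy purely to illustrate the matrix-algebra step; the matrix below is just the coefficient matrix of the three equations above):

```python
import numpy as np

# Coefficient matrix of the linear system
#   2x + 0y + 3l = 0
#   0x + 2y + 1l = 0
#   3x + 1y + 0l = 6
A = np.array([[2.0, 0.0, 3.0],
              [0.0, 2.0, 1.0],
              [3.0, 1.0, 0.0]])
b = np.array([0.0, 0.0, 6.0])

x, y, lam = np.linalg.solve(A, b)
print(x, y, lam)       # approximately x = 1.8, y = 0.6, lambda = -1.2
print(x**2 + y**2)     # constrained minimum value, approximately 3.6
```

Note that the constrained minimum $f(1.8, 0.6) = 3.6$ is larger than the unconstrained minimum $f(0,0)=0$, as the constraint forces us away from the origin.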
The geometrical picture is the following: We are asked to find the local extrema of the (squared) distance from the point $(0,b)$ on the $y$-axis to points on the parabola $y=x^2$. From looking at a figure we can guess the following: If $b\gg1$ there are two local minima high up on the parabola, together with a local maximum at $(0,0)$. If $0<b\ll1$ there is just one local minimum, at $(0,0)$, and the same holds when $b\leq0$.
The intended computation goes as follows: Set up the Lagrangian
$$\Phi:=x^2+(y-b)^2+\lambda(y-x^2)\ ,$$
and solve the system
$$\Phi_x=2x-2\lambda x=0,\quad \Phi_y=2(y-b)+\lambda=0,\quad y=x^2\ .$$
From $x(1-\lambda)=0$ we infer (i) $x=0$ or (ii) $\lambda=1$. In case (i) we then obtain $y=0$ and a certain value of $\lambda$, and in case (ii) we obtain $y=b-{1\over2}$. The condition $y=x^2$ then implies that case (ii) only leads to real solutions if $b\geq{1\over2}$, and in this case we have $x=\pm\sqrt{b-{1\over2}}$.
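A quick numerical sanity check of this case analysis, in plain Python, with the illustrative choice $b=2$ (so $b\geq\frac12$ and case (ii) applies):

```python
import math

b = 2.0  # illustrative value with b > 1/2, so case (ii) yields real solutions

def f(x, y):
    """Squared distance from (x, y) to the point (0, b)."""
    return x**2 + (y - b)**2

# Case (i): x = 0, so the constraint gives y = 0 (lambda = 2b from Phi_y = 0)
x0, y0 = 0.0, 0.0

# Case (ii): lambda = 1, y = b - 1/2, x = +-sqrt(b - 1/2)
y1 = b - 0.5
x1 = math.sqrt(b - 0.5)

# All critical points satisfy the constraint y = x^2 ...
assert y0 == x0**2 and abs(y1 - x1**2) < 1e-12

# ... and the case-(ii) points lie lower than the case-(i) point,
# matching the geometric guess: two minima on the sides, a maximum at (0, 0).
print(f(x0, y0), f(x1, y1))   # 4.0 is larger than 1.75
```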
It follows that Lagrange's method has confirmed our geometric analysis of the problem. Note, however, that it is quite cumbersome to do a second derivative test in the framework of this method. Instead we can do the following: Consider the parametric representation $x\mapsto (x,x^2)$ of the parabola, and instead of $f$ together with the constraint, look at the pullback
$$\psi(x):=f(x,x^2)=x^2+(x^2-b)^2\qquad(-\infty< x<\infty)\ .$$
Now analyze this function $\psi$ as a function of one variable. You will get the same results (depending on $b$) as before, and in addition the second derivative test will confirm what you knew all along. The case $b={1\over2}$ is special: Here the first nonvanishing derivative is $\psi^{(4)}(0)=24$. Since $4$ is even and $24>0$ we have a local minimum there.
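The special case $b=\frac12$ can be checked numerically: since $\psi$ is a quartic, the fourth central difference recovers $\psi^{(4)}(0)$ up to rounding error, independent of the step size. A small illustration in plain Python:

```python
b = 0.5

def psi(x):
    """Pullback of the squared distance along the parabola: f(x, x^2)."""
    return x**2 + (x**2 - b)**2

# Fourth central difference; for a degree-4 polynomial like psi this
# equals the fourth derivative up to floating-point rounding.
h = 0.1
d4 = (psi(2*h) - 4*psi(h) + 6*psi(0) - 4*psi(-h) + psi(-2*h)) / h**4
print(d4)   # approximately 24: psi''''(0) = 24 > 0, so a local minimum at x = 0

# Direct check that x = 0 is a local minimum:
assert psi(0) < psi(0.1) and psi(0) < psi(-0.1)
```

(Indeed, for $b=\frac12$ one has $\psi(x)=x^4+\frac14$, so the quartic behavior at the origin is plain to see.)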
Best Answer
Let us discuss the example you were given. Generally, this optimization method uses the following strategy. Let $f(x,y,z)$ be the function whose critical points we seek, subject to the constraint equation $$g(x,y,z)=k$$ for some $k \in \mathbb{R}$. We solve the system $$\nabla f(x,y,z) = \lambda \nabla g(x,y,z), \qquad g(x,y,z)=k$$ of four equations in four unknowns (note that $\nabla$ is the gradient operator, which returns the vector of partial derivatives with respect to $x$, $y$, and $z$).

In this case, we have $f(x,y,z)=2x+y-2z$ and $g(x,y,z)=x^2+y^2+z^2=4$ (this is a sphere of radius $2$). Thus, we have the following system of equations: $$\begin{cases}2 = 2\lambda x & (f_x = \lambda g_x) \\ 1 = 2\lambda y & (f_y=\lambda g_y)\\ -2 = 2\lambda z & (f_z= \lambda g_z)\\ x^2+y^2+z^2=4\end{cases}$$

There are various ways to solve this, but we will proceed as follows. First observe that $x$, $y$, $z$, and $\lambda$ are all nonzero: each of the first three equations has a nonzero left-hand side, so neither factor on its right-hand side can vanish. This justifies the divisions below. Multiplying the first equation by $yz$, the second by $xz$, and the third by $xy$, and setting the results equal to one another, we obtain $$2\lambda xyz = \begin{cases} 2yz \\ xz \\ -2xy \end{cases}$$ So, first we have $x = 2y$ upon dividing $2yz=xz$ by $z \neq 0$. Then we also have $z=-2y$ upon dividing $xz=-2xy$ by $x \neq 0$. Finally, we have $x=-z$ upon dividing $2yz = -2xy$ by $2y \neq 0$. Substituting for $x$ and $z$ in terms of $y$ in the fourth equation gives $$x^2 +y^2 +z^2 =4 \implies 4y^2 + y^2 + 4y^2 = 9y^2 = 4 \implies y = \mp \frac{2}{3}.$$ I will let you solve for the other three unknowns (consider each case separately: assume $y = -\frac{2}{3}$ and solve for $x,z,\lambda$, then assume $y=\frac{2}{3}$ and solve for $x,z,\lambda$). Recall from before that $z = -2y$ and $x=-z$.
You will find the two solutions $$(x,y,z,\lambda)=\left(\mp \frac{4}{3},\mp \frac{2}{3}, \pm \frac{4}{3},\mp \frac{3}{4}\right) .$$ These solutions $(x,y,z)$ are the critical points of the function $f$ under the constraint $g(x,y,z)=4$, and there are several ways to classify them (as, for instance, maxima, minima, or saddle points).
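One simple classification route here: the sphere is compact, so $f$ attains a global maximum and minimum on it, and evaluating $f$ at the two critical points settles which is which. A small check in plain Python (the point values are taken from the solution above):

```python
def f(x, y, z):
    return 2*x + y - 2*z

def g(x, y, z):
    return x**2 + y**2 + z**2

p_plus  = ( 4/3,  2/3, -4/3)   # critical point with lambda = 3/4
p_minus = (-4/3, -2/3,  4/3)   # critical point with lambda = -3/4

# Both points lie on the sphere g = 4 ...
assert abs(g(*p_plus) - 4) < 1e-12 and abs(g(*p_minus) - 4) < 1e-12

# ... and f takes the values 6 and -6 there: the maximum and the minimum.
print(f(*p_plus), f(*p_minus))   # approximately 6.0 and -6.0
```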