Typically you'll use the set $X$ to represent black-box constraints, i.e., constraints for which you don't have an analytical representation. They could consist of the output of a computer code that returns True if the constraints are satisfied and False otherwise. In general, if you have analytical descriptions of the constraints, it is to your advantage to use them. There is research on mixed black-box optimization, where the problem has a mixture of black-box constraints and explicit constraints, but you would lose efficiency by misclassifying explicit constraints as black-box constraints.
As to the transformation of an equality into two inequalities, it will cause most algorithms for smooth optimization to break down (assuming $h$ is smooth). It's easy to see why: most methods aim to satisfy the KKT conditions (first-order optimality). However, the KKT conditions are only necessary for optimality if a constraint qualification is satisfied. A constraint qualification is a certificate that the analytical expression of the feasible set is, in some sense, not redundant in the description of its geometry.
Consider for instance the constraints $x^3 \leq 1$ and $x^3 \geq 1$. We clearly see some redundancy here. Why not just write $x^3 = 1$? Or simply $x = 1$?
The redundancy manifests itself in the constraint qualification. The most widely used condition is the linear independence constraint qualification (LICQ). It requires that the gradients of all constraints that are satisfied as equalities at a solution be linearly independent. This of course concerns all equality constraints, but also all inequalities that are "active" (i.e., $g_i(x) = 0$ in your notation). Since $h$ was an equality constraint, we'll necessarily have $h(x) = 0$ at a solution, but the gradients of the functions $h$ and $-h$ can never be linearly independent. Try solving a problem with $x^3 \leq 1$ and $x^3 \geq 1$ on one of the NEOS solvers and you'll observe trouble: http://neos-server.org/neos
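You can see the LICQ failure numerically. The sketch below (function names are my own, for illustration) stacks the gradients of the two active constraints $g_1(x) = x^3 - 1 \leq 0$ and $g_2(x) = 1 - x^3 \leq 0$ at the solution $x = 1$ and checks the rank of the resulting Jacobian:

```python
import numpy as np

def grad_g1(x):
    # gradient of g1(x) = x**3 - 1, active at x = 1
    return np.array([3 * x**2])

def grad_g2(x):
    # gradient of g2(x) = 1 - x**3, also active at x = 1
    return np.array([-3 * x**2])

x_star = 1.0
# Jacobian of the active constraints: one row per active constraint
J = np.vstack([grad_g1(x_star), grad_g2(x_star)])

# LICQ requires rank == number of active constraints (2 here), but the
# rows are exact negatives of each other, so the rank is only 1.
print(np.linalg.matrix_rank(J))  # 1 < 2: LICQ fails
```

The same rank test works for any equality split into two inequalities: the two gradient rows are always negatives of each other, so the Jacobian of the active constraints is always rank-deficient.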
You'll run into the same kind of trouble if one of the active constraint gradients vanishes at a solution. Consider for instance the constraint $x^2 = 0$. When problems are automatically generated by some procedure, it's very hard to check for such redundancy, but when you model a problem by hand from scratch, it's important to look out for it.
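As a quick numerical illustration of the vanishing-gradient case (a minimal sketch with an illustrative function name): the feasible set of $h(x) = x^2 = 0$ is just $\{0\}$, and at $x = 0$ the constraint gradient is the zero vector, which cannot belong to any linearly independent set:

```python
import numpy as np

def grad_h(x):
    # gradient of h(x) = x**2; the feasible set of h(x) = 0 is just {0}
    return np.array([2 * x])

x_star = 0.0
g = grad_h(x_star)

# The active-constraint gradient is the zero vector, so no set of
# gradients containing it is linearly independent: LICQ fails at x = 0.
print(np.linalg.norm(g))  # 0.0
```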
Coming back to the first topic, there is a (theoretical) advantage to describing any feasible set simply as $X \subseteq \mathbb{R}^n$: the first-order optimality condition is just that $-\nabla f(x)$ lie in the normal cone to $X$ at $x$. And this doesn't depend on any constraint qualification, because it is a statement that only concerns the geometry of the feasible set, not its analytical description.
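A one-dimensional illustration (my own example): minimize $f(x) = (x-2)^2$ over $X = [0, 1]$. The minimizer is $x = 1$, the right endpoint, where the normal cone to $X$ is $[0, \infty)$, and indeed $-\nabla f(1)$ lies in it:

```python
def grad_f(x):
    # f(x) = (x - 2)**2, so f'(x) = 2*(x - 2)
    return 2 * (x - 2)

# X = [0, 1]; at the right endpoint x = 1 the normal cone is [0, +inf)
x_star = 1.0
steepest_descent = -grad_f(x_star)

# First-order optimality over X: -grad f(x*) must lie in the normal
# cone at x*, i.e. be nonnegative here. No constraint qualification needed.
print(steepest_descent)  # 2.0, which lies in [0, +inf)
```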
For more information, take a look at the book by Bazaraa, Sherali and Shetty (Nonlinear Programming: Theory and Algorithms), or "Numerical Optimization" by Nocedal and Wright (Springer).
Convenient conditions for checking that a certain stationary point is optimal generally require that the objective function is convex. Nonconvex optimization is a much more difficult subject with a lot of special cases for different problems. Two of the many reasons for this are that there can be stationary points that are not local minima and that there can be multiple local minima.
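Both phenomena already show up in one dimension (the functions below are just illustrative examples): $f(x) = x^3$ has a stationary point at $0$ that is neither a minimum nor a maximum, and $g(x) = (x^2 - 1)^2$ has two distinct local minima at $x = \pm 1$ separated by a local maximum at $x = 0$:

```python
def df(x):
    # derivative of f(x) = x**3; stationary at 0, but f has no extremum there
    return 3 * x**2

def g(x):
    # g(x) = (x**2 - 1)**2 has two distinct global minima
    return (x**2 - 1)**2

def dg(x):
    # derivative: 4*x*(x**2 - 1), zero at x = -1, 0, 1
    return 4 * x * (x**2 - 1)

# A stationary point need not be a local extremum:
assert df(0) == 0                       # f'(0) = 0 ...
assert (-0.5)**3 < 0.0**3 < (0.5)**3    # ... yet f crosses f(0) there

# And there can be several local minima:
assert dg(-1) == dg(0) == dg(1) == 0
print(g(-1), g(0), g(1))  # 0 1 0: minima at -1 and 1, local max at 0
```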
On a simpler level, notice that the Lagrange multipliers condition is a first order condition. You may recall the second order condition from calculus, which checks the sign of the second derivative at the critical point. This condition is sufficient, but not necessary: when the second derivative is positive you have a minimum, when it is negative you have a maximum, and when it is zero you get no information.
This condition generalizes to multivariable problems through the Hessian. Cases:
- The Hessian is positive definite (all eigenvalues are strictly positive) $\Rightarrow$ minimum.
- The Hessian is negative definite (all eigenvalues are strictly negative) $\Rightarrow$ maximum.
- The Hessian is indefinite (there are some strictly positive eigenvalues and some strictly negative eigenvalues) $\Rightarrow$ saddle point. This is neither a maximum nor a minimum, because we can follow an eigenvector with a negative eigenvalue to go down and an eigenvector with a positive eigenvalue to go up.
- The Hessian is positive (resp. negative) semidefinite but not definite (all eigenvalues are nonnegative (resp. nonpositive) and at least one is zero). Here you may have a minimum (resp. maximum), but you may also have nothing: the test is inconclusive.
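A small NumPy sketch of this classification at the origin, using three textbook examples (my choice of functions): $x^2 + y^2$ (positive definite Hessian), $x^2 - y^2$ (indefinite), and $x^4 + y^4$ (zero Hessian at the origin, so the test is inconclusive even though the origin really is a minimum):

```python
import numpy as np

def classify(H, tol=1e-12):
    """Classify a critical point from the eigenvalues of its Hessian."""
    w = np.linalg.eigvalsh(H)  # eigenvalues of a symmetric matrix
    if np.all(w > tol):
        return "minimum"
    if np.all(w < -tol):
        return "maximum"
    if np.any(w > tol) and np.any(w < -tol):
        return "saddle point"
    return "inconclusive"  # semidefinite with zero eigenvalues

# Hessians at the origin:
H_bowl = np.array([[2.0, 0.0], [0.0, 2.0]])     # f = x^2 + y^2
H_saddle = np.array([[2.0, 0.0], [0.0, -2.0]])  # f = x^2 - y^2
H_flat = np.zeros((2, 2))                       # f = x^4 + y^4

print(classify(H_bowl))    # minimum
print(classify(H_saddle))  # saddle point
print(classify(H_flat))    # inconclusive (yet the origin IS a minimum)
```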
Of course, everything from the previous paragraph assumes you have enough smoothness that all of this makes sense, and there are certainly nonsmooth optimization problems. These problems have their own special issues. The first place to look for dealing with these kinds of problems is convex analysis. My knowledge of convex analysis is quite limited, so I will not try to discuss it here.
If you remember back to Calculus I, when you first learned about optimization, the procedure was:
1) Compute the derivative.
2) Find the points where the derivative is 0 (critical points).
3) Evaluate the function at these points and at the endpoints of the interval.
In most cases (continuously differentiable functions) this process was guaranteed to work, meaning one of those points was the minimum and one was the maximum. In this case checking the endpoints was the way of dealing with the fact that the optimization problem was constrained.
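The three steps can be carried out directly for, say, $f(x) = x^3 - 3x$ on $[-2, 3]$ (my example; $f'(x) = 3x^2 - 3$ vanishes at $x = \pm 1$):

```python
def f(x):
    return x**3 - 3 * x

# Steps 1-2: f'(x) = 3x^2 - 3 = 0 gives the critical points x = -1, x = 1
critical_points = [-1.0, 1.0]

# Step 3: evaluate f at the critical points and at the endpoints of [-2, 3]
candidates = critical_points + [-2.0, 3.0]
values = {x: f(x) for x in candidates}

minimum = min(values.values())  # -2.0, attained at both x = -2 and x = 1
maximum = max(values.values())  # 18.0, attained at x = 3
print(minimum, maximum)
```

Note that the constrained minimum is attained both at an interior critical point and at an endpoint, which is exactly why the endpoint check is part of the procedure.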
With higher dimensional functions and more complex boundaries, this problem becomes harder. Generally speaking, we still need to identify points satisfying first order conditions inside the region, and points satisfying modified (see KKT conditions) first order conditions on the boundary of the region.