Projections onto Convex Sets (POCS) / Alternating Projections does exactly what you want in case your sets $ \left\{ \mathcal{C}_{i} \right\}_{i = 1}^{m} $ are subspaces.
Namely, if $ \mathcal{C} = \bigcap_{i = 1}^{m} \mathcal{C}_{i} $, where each $ \mathcal{C}_{i} $ is a subspace, and the projection onto the set is given by:
$$ \mathcal{P}_{\mathcal{C}} \left( y \right) = \arg \min_{x \in \mathcal{C}} \frac{1}{2} {\left\| x - y \right\|}_{2}^{2} $$
Then:
$$ \lim_{n \to \infty} {\left( \mathcal{P}_{\mathcal{C}_{1}} \circ \mathcal{P}_{\mathcal{C}_{2}} \circ \cdots \circ \mathcal{P}_{\mathcal{C}_{m}} \right)}^{n} \left( y \right) = \mathcal{P}_{\mathcal{C}} \left( y \right) $$
Where $ \mathcal{P}_{\mathcal{C}_{i}} \left( y \right) = \arg \min_{x \in \mathcal{C}_{i}} \frac{1}{2} {\left\| x - y \right\|}_{2}^{2} $.
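As an illustration, here is a minimal MATLAB sketch (not the code from my repository) of alternating projections between two lines through the origin in $ \mathbb{R}^{2} $; their intersection is $ \left\{ 0 \right\} $, so the limit must be the zero vector:

```matlab
% Alternating Projections (POCS) for two subspaces: two lines through
% the origin in R^2. Their intersection is {0}, hence the orthogonal
% projection of any y onto the intersection is the zero vector.
u1 = [1; 0];                          % unit direction of the first line
u2 = [1; 1] / sqrt(2);                % unit direction of the second line
projC1 = @(x) (u1' * x) * u1;         % orthogonal projection onto span(u1)
projC2 = @(x) (u2' * x) * u2;         % orthogonal projection onto span(u2)

x = [2; -1];                          % the point y to be projected
for n = 1:50
    x = projC1(projC2(x));            % one POCS sweep
end
disp(norm(x));                        % -> ~0, i.e., the projection onto {0}
```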
If any of the sets is not a subspace but a general convex set, you need to use Dykstra's Projection Algorithm.
In the paper by Ryan J. Tibshirani, Dykstra's Algorithm, ADMM, and Coordinate Descent: Connections, Insights, and Extensions, you may find an extension of the algorithm to $ m \geq 2 $ sets.
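For reference, here is a minimal sketch of the basic two-set Dykstra iteration (the $ m $-set extension in the paper follows the same pattern); the two sets, the point and the iteration count are arbitrary choices for illustration:

```matlab
% Dykstra's algorithm for two convex sets: C1 = unit Euclidean ball,
% C2 = half-space {x : x1 >= 0.5}. Unlike plain alternating projections,
% which only finds some point of the intersection, Dykstra converges to
% the orthogonal projection onto the intersection.
projC1 = @(x) x / max(1, norm(x));         % projection onto the unit ball
projC2 = @(x) [max(x(1), 0.5); x(2)];      % projection onto the half-space

y = [-1; 2];
x = y;  p = zeros(2, 1);  q = zeros(2, 1); % Dykstra's correction terms
for k = 1:200
    z = projC1(x + p);  p = x + p - z;     % project, update the increment
    x = projC2(z + q);  q = z + q - x;
end
disp(x);                                   % -> projection of y onto the intersection
```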
I wrote MATLAB code, which is available in my StackExchange Mathematics Q1492095 GitHub repository.
I implemented both methods and a simple test case which shows when the Alternating Projections method doesn't converge to the orthogonal projection onto the intersection of the sets.
I also implemented two more methods:
- The quadratic form of the Hybrid Steepest Descent method (see Quadratic Optimization of Fixed Points of Nonexpansive Mappings in Hilbert Space). I implemented two variations of this.
- The Consensus ADMM method, which is related to Dykstra's algorithm above.
While the Dykstra and ADMM methods were the fastest, they also require much more memory than the Hybrid Steepest Descent method. So if one can afford the memory consumption, the ADMM / Dykstra methods are the ones to choose; if memory is a constraint, the Hybrid Steepest Descent method is the way to go.
$$ \DeclareMathOperator{\sign}{sign} $$
The Lagrangian of the problem (the projection onto the unit $ {\ell}_{1} $ ball, i.e., minimizing $ \frac{1}{2} {\left\| x - y \right\|}_{2}^{2} $ subject to $ {\left\| x \right\|}_{1} \leq 1 $) can be written as:
$$ \begin{align}
L \left( x, \lambda \right) & = \frac{1}{2} {\left\| x - y \right\|}^{2} + \lambda \left( {\left\| x \right\|}_{1} - 1 \right) \\
& = \sum_{i = 1}^{n} \left( \frac{1}{2} { \left( {x}_{i} - {y}_{i} \right) }^{2} + \lambda \left| {x}_{i} \right| \right) - \lambda && \text{Component-wise form}
\end{align} $$
The Dual Function is given by:
$$ \begin{align}
g \left( \lambda \right) = \inf_{x} L \left( x, \lambda \right)
\end{align} $$
The above can be minimized component-wise over each term $ \left( \frac{1}{2} { \left( {x}_{i} - {y}_{i} \right) }^{2} + \lambda \left| {x}_{i} \right| \right) $, which is solved by the Soft Thresholding operator:
$$ \begin{align}
{x}_{i}^{\ast} = \sign \left( {y}_{i} \right) { \left( \left| {y}_{i} \right| - \lambda \right) }_{+}
\end{align} $$
Where $ {\left( t \right)}_{+} = \max \left( t, 0 \right) $.
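For completeness, this is the standard subgradient argument: the first-order optimality condition of each component-wise term reads

$$ \begin{align}
0 \in {x}_{i} - {y}_{i} + \lambda \, \partial \left| {x}_{i} \right|
\end{align} $$

and checking the cases $ {x}_{i} > 0 $, $ {x}_{i} < 0 $ and $ {x}_{i} = 0 $ yields the Soft Thresholding operator above.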
Now, all that is needed is to find the optimal $ \lambda \geq 0 $, which is given by the root of the following function (the primal feasibility constraint of the KKT system):
$$ \begin{align}
h \left( \lambda \right) & = \sum_{i = 1}^{n} \left| {x}_{i}^{\ast} \left( \lambda \right) \right| - 1 \\
& = \sum_{i = 1}^{n} { \left( \left| {y}_{i} \right| - \lambda \right) }_{+} - 1
\end{align} $$
The above is a piecewise-linear function of $ \lambda $, and its derivative is given by:
$$ \begin{align}
\frac{\mathrm{d} }{\mathrm{d} \lambda} h \left( \lambda \right) & = \frac{\mathrm{d} }{\mathrm{d} \lambda} \sum_{i = 1}^{n} { \left( \left| {y}_{i} \right| - \lambda \right) }_{+} \\
& = \sum_{i = 1}^{n} -{ \mathbf{1} }_{\left\{ \left| {y}_{i} \right| - \lambda > 0 \right\}}
\end{align} $$
Hence its root can be found using Newton iteration.
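A minimal MATLAB sketch of this procedure (not the repository code; the data vector and the tolerance are illustration choices):

```matlab
% Projection onto the unit L1 ball via the dual: Newton iteration on
% h(lambda), followed by soft thresholding with the optimal lambda.
y = [0.5; -2.0; 1.5; 0.2];
if norm(y, 1) <= 1
    x = y;                                        % y is already feasible
else
    lambda = 0;                                   % h(0) = norm(y, 1) - 1 > 0
    for k = 1:100
        h = sum(max(abs(y) - lambda, 0)) - 1;     % h(lambda)
        if abs(h) < 1e-12, break; end
        dh = -sum(abs(y) - lambda > 0);           % h'(lambda), nonzero while h > 0
        lambda = lambda - h / dh;                 % Newton step
    end
    x = sign(y) .* max(abs(y) - lambda, 0);       % soft thresholding
end
disp(norm(x, 1));                                 % -> 1
```

Since $ h $ is convex and piecewise linear, the Newton iteration started at $ \lambda = 0 $ approaches the root monotonically from the left and terminates after finitely many steps.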
In a similar manner, the projection onto the simplex (see @Ashkan's answer) can be calculated.
The Lagrangian in that case is given by:
$$ \begin{align}
L \left( x, \mu \right) & = \frac{1}{2} {\left\| x - y \right\|}^{2} + \mu \left( \boldsymbol{1}^{T} x - 1 \right)
\end{align} $$
The trick is to leave the non-negativity constraint implicit.
Hence the Dual Function is given by:
$$ \begin{align}
g \left( \mu \right) & = \inf_{x \succeq 0} L \left( x, \mu \right) \\
& = \inf_{x \succeq 0} \sum_{i = 1}^{n} \left( \frac{1}{2} { \left( {x}_{i} - {y}_{i} \right) }^{2} + \mu {x}_{i} \right) - \mu && \text{Component-wise form}
\end{align} $$
Again, taking advantage of the component-wise form, the solution is given by:
$$ \begin{align}
{x}_{i}^{\ast} = { \left( {y}_{i} - \mu \right) }_{+}
\end{align} $$
where the solution enforces the non-negativity constraint by projecting onto $ {\mathbb{R}}_{+} $.
Again, the solution is given by finding the $ \mu $ that satisfies the constraint (pay attention: since the above is an equality constraint, $ \mu $ can take any value and is not limited to being non-negative like $ \lambda $ above).
The objective function (from the KKT system) is given by:
$$ \begin{align}
h \left( \mu \right) = \sum_{i = 1}^{n} {x}_{i}^{\ast} - 1 & = \sum_{i = 1}^{n} { \left( {y}_{i} - \mu \right) }_{+} - 1
\end{align} $$
The above is a piecewise-linear function of $ \mu $, and its derivative is given by:
$$ \begin{align}
\frac{\mathrm{d} }{\mathrm{d} \mu} h \left( \mu \right) & = \frac{\mathrm{d} }{\mathrm{d} \mu} \sum_{i = 1}^{n} { \left( {y}_{i} - \mu \right) }_{+} \\
& = \sum_{i = 1}^{n} -{ \mathbf{1} }_{\left\{ {y}_{i} - \mu > 0 \right\}}
\end{align} $$
Hence its root can be found using Newton iteration.
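Again, a minimal MATLAB sketch (not the repository code; the data and the starting value are illustration choices):

```matlab
% Projection onto the unit simplex via the dual: Newton iteration on
% h(mu), followed by clipping at zero with the optimal mu.
y = [0.5; -2.0; 1.5; 0.2];
mu = min(y) - 1;                         % start left of the root, so h(mu) > 0
for k = 1:100
    h = sum(max(y - mu, 0)) - 1;         % h(mu)
    if abs(h) < 1e-12, break; end
    dh = -sum(y - mu > 0);               % h'(mu)
    mu = mu - h / dh;                    % Newton step
end
x = max(y - mu, 0);                      % projection onto the simplex
disp([sum(x), min(x)]);                  % -> sums to 1, all entries >= 0
```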
I wrote MATLAB code which implements both of them; it is available at Mathematics StackExchange Question 2327504 - GitHub.
There is a test which compares the result to a reference calculated by CVX.
Best Answer
If we consider the closed square in $\mathbb{R}^2$ defined by the four corners $(\pm 1, \pm 1)$ and take $x_0 = (2,2)$, we see that a tangent plane may not be well defined. However, convexity ensures that a supporting hyperplane will exist. Since $v$ is necessarily on the boundary of $A$, there will be a supporting hyperplane through it.
Moreover, a supporting hyperplane can be chosen orthogonal to $x_0 - v$. Appealing to the example of the square again, we see that $v = (1,1)$, and here the orthogonal supporting line is $(1,1) + (-1,1)t$. Note that this is orthogonal to $v - x_0 = (-1,-1)$, since $\langle (-1,-1), (-1,1) \rangle = 0$. So among the supporting lines that bound the square through the corner point, there is one orthogonal to $v - x_0$.
This holds in generality and I'll leave it to you to prove it. You may also want to prove that if there is a well-defined tangent plane at $v$ then the tangent plane is a supporting hyperplane. The convexity of $A$ makes short work of the problem once you're familiar with the supporting hyperplane theorem.