Let me resolve a special case of the question, by leveraging the examples mentioned above. Suppose $(K,\rho)$ is a compact metric space and $X$ is the set of Borel probability measures on $K$ endowed with its weak* topology. I claim that $X$ has the convex function property if and only if $K$ is finite.
Theorem 10.2 in "Convex Analysis" by Rockafellar implies that any convex function defined on a finite-dimensional simplex is upper semicontinuous. This gives one direction.
Conversely, suppose $K$ is infinite. Letting $k_\infty$ be an accumulation point of $K$ (which exists because $K$ is an infinite compact metrizable space), define the affine continuous function $\varphi:X\to\mathbb R^2$ given by $\varphi(x):=\int_K \left(\rho(k,k_\infty), \rho(k,k_\infty)^2\right)\text{ d}x(k)$. Then the convex function $f:X\to\mathbb R$ which takes $f(\delta_{k_\infty}):=0$ and $f(x):=\tfrac{\varphi_1(x)^2}{\varphi_2(x)}$ for every $x\neq\delta_{k_\infty}$ should work. A failure of continuity is witnessed along the sequence $(\delta_{k_n})_{n=1}^\infty$, where $\{k_n\}_{n=1}^\infty\subseteq K\setminus\{k_\infty\}$ is a sequence converging to $k_\infty$: we have $f(\delta_{k_n})=\rho(k_n,k_\infty)^2/\rho(k_n,k_\infty)^2=1$ for every $n$, whereas $\delta_{k_n}\to\delta_{k_\infty}$ and $f(\delta_{k_\infty})=0$.
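As a quick numerical sanity check of this counterexample, here is a minimal sketch under assumptions that are not part of the argument: I take $K=[0,1]$ with $\rho(a,b)=|a-b|$ and $k_\infty=0$, and I represent a finitely supported probability measure as a dictionary from points to weights. The names `phi`, `f`, and `dirac` are hypothetical helpers chosen for illustration.

```python
# Assumed setup (not from the post): K = [0, 1], rho(a, b) = |a - b|, k_inf = 0.
# A finitely supported probability measure is a dict {point: weight}.

def phi(measure, k_inf=0.0):
    """Return (phi_1, phi_2): the integrals of rho(k, k_inf) and rho(k, k_inf)^2."""
    phi1 = sum(w * abs(k - k_inf) for k, w in measure.items())
    phi2 = sum(w * abs(k - k_inf) ** 2 for k, w in measure.items())
    return phi1, phi2

def f(measure, k_inf=0.0):
    """f = phi_1^2 / phi_2 away from the point mass at k_inf, and 0 at that point mass."""
    phi1, phi2 = phi(measure, k_inf)
    if phi2 == 0.0:  # phi_2 vanishes only for the point mass at k_inf
        return 0.0
    return phi1 ** 2 / phi2

def dirac(k):
    """Point mass at k."""
    return {k: 1.0}

# Values of f along delta_{1/n}, which converges weak* to delta_0.
values = [f(dirac(1.0 / n)) for n in range(1, 8)]
limit_value = f(dirac(0.0))
```

In this toy setting, every entry of `values` should equal $1$ while `limit_value` is $0$, which matches the jump described above.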
A natural conjecture is now that a general $X$ will have the convex function property if and only if $X$ has finitely many extreme points. Is this true?
Given a nonempty closed set $C\subseteq\mathbb{R}^n$, let $S_C\subseteq\mathbb{R}^n$ denote the set of points for which the closest point in $C$ is not unique. Suppose $C$ satisfies each of the following:
- $S_C$ is nonempty (i.e., $C$ is not convex),
- there exists $\rho>0$ such that $C^\rho\cap S_C=\emptyset$, and
- the closest point mapping $c\colon(\mathbb{R}^n\setminus S_C)\to C$ is continuous.
(Every nonempty closed nonconvex subset $C\subseteq\mathbb{R}^n$ with smooth boundary that I can think of satisfies these properties.)
Let $r_C<\infty$ denote the supremum of such $\rho$. Below, we show that for every $\epsilon\in(0,r_C)$, it holds that $u_C$ is not Lipschitz on $\mathbb{R}^n\setminus (C^\epsilon\cup Z)$ for any null set $Z$.
Select $p\in\mathbb{R}^n$ of distance $r_C$ from $C$ for which there exist $x,y\in C$ with $x\neq y$ such that
$$
\|x-p\|=\|y-p\|=r_C.
$$
Then for each $t\in(0,1)$, the points $x(t):=tx+(1-t)p$ and $y(t):=ty+(1-t)p$ are of distance less than $r_C$ from $C$, and so $c(x(t))=x$ and $c(y(t))=y$. Furthermore,
$$
u_C(x(t))=\frac{x(t)-c(x(t))}{\|x(t)-c(x(t))\|}=\frac{x(t)-x}{\|x(t)-x\|}=\frac{p-x}{\|p-x\|},
$$
and similarly
$$
u_C(y(t))=\frac{p-y}{\|p-y\|}.
$$
Notably, for small $t$, $x(t)$ and $y(t)$ are arbitrarily close to each other, while $u_C(x(t))$ and $u_C(y(t))$ remain a fixed distance apart.
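To make this concrete, here is a small sketch for one specific set, chosen purely for illustration: $C$ is the unit circle in $\mathbb{R}^2$, so that $S_C=\{0\}$, $p=0$, and $r_C=1$ (assuming $C^\rho$ denotes the $\rho$-neighborhood of $C$). The points $x=(1,0)$ and $y=(0,1)$ and the helper names are my own choices, not part of the argument.

```python
import math

def closest_point_on_circle(z):
    """Closest point of the unit circle to z; unique whenever z is not the origin."""
    norm = math.hypot(z[0], z[1])
    return (z[0] / norm, z[1] / norm)

def u_C(z):
    """Unit vector pointing from c(z) to z, for z off the circle and z != 0."""
    c = closest_point_on_circle(z)
    d = (z[0] - c[0], z[1] - c[1])
    norm = math.hypot(d[0], d[1])
    return (d[0] / norm, d[1] / norm)

def dist(a, b):
    """Euclidean distance in the plane."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Assumed example data: p is the center, x and y are two of its closest points in C.
p, x, y = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)

def difference_quotient(t):
    """Ratio ||u_C(x(t)) - u_C(y(t))|| / ||x(t) - y(t)|| for t in (0, 1)."""
    xt = (t * x[0] + (1 - t) * p[0], t * x[1] + (1 - t) * p[1])
    yt = (t * y[0] + (1 - t) * p[0], t * y[1] + (1 - t) * p[1])
    return dist(u_C(xt), u_C(yt)) / dist(xt, yt)

quotients = {t: difference_quotient(t) for t in (0.1, 0.01, 0.001)}
```

For this example the quotient works out to $1/t$, so it is unbounded as $t\to0$, in line with the observation above.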
Given $\epsilon\in(0,r_C)$, a null set $Z$, and a constant $L>0$, we now show that $u_C$ is not $L$-Lipschitz on $\mathbb{R}^n\setminus (C^\epsilon\cup Z)$. Fix $\delta>0$ to be selected later. Select $t_0\in(0,1/2)$ small enough so that $x(t_0)$ and $y(t_0)$ avoid $C^\epsilon$ and $\|x(t_0)-y(t_0)\|<\delta$. Select a continuous path $q\colon[0,1]\to \mathbb{R}^n\setminus S_C$ such that $q(0)=x(t_0)$, $q(1)=y(t_0)$, and $q(s)\not\in Z$ for almost every $s\in(0,1)$. By the continuity of $c$, there exist $s_0,s_1\in[0,1]$ such that $q(s_0),q(s_1)\not\in Z$ and
$$
\max\Big\{\|q(s_0)-x(t_0)\|,\|q(s_1)-y(t_0)\|,\|c(q(s_0))-x\|,\|c(q(s_1))-y\|\Big\}<\delta.
$$
As we will see, $q(s_0)$ and $q(s_1)$ witness that $u_C$ is not $L$-Lipschitz on $\mathbb{R}^n\setminus (C^\epsilon\cup Z)$.
After some straightforward manipulations, we have
\begin{align*}
\|u_C(q(s_0))-u_C(x(t_0))\|
&=\bigg\|\frac{q(s_0)-c(q(s_0))}{\|q(s_0)-c(q(s_0))\|}-\frac{x(t_0)-x}{\|x(t_0)-x\|}\bigg\|\\
&\leq\frac{4\delta}{\|x(t_0)-x\|}=\frac{4\delta}{(1-t_0)r_C}\leq\frac{8\delta}{r_C},
\end{align*}
and similarly
$$
\|u_C(q(s_1))-u_C(y(t_0))\|\leq\frac{8\delta}{r_C}.
$$
Then
$$
\|u_C(q(s_0))-u_C(q(s_1))\|
\geq\|u_C(x(t_0))-u_C(y(t_0))\|-\frac{16\delta}{r_C}
=\frac{\|x-y\|-16\delta}{r_C},
$$
but
$$
\|q(s_0)-q(s_1)\|
\leq\|x(t_0)-y(t_0)\|+2\delta
\leq 3\delta.
$$
Selecting $\delta<\min\{\frac{1}{32},\frac{1}{6Lr_C}\}\cdot\|x-y\|$ then gives
$$
\|u_C(q(s_0))-u_C(q(s_1))\|>L\|q(s_0)-q(s_1)\|,
$$
as claimed.
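The arithmetic behind the choice of $\delta$ can be spot-checked numerically. The sketch below only compares the lower bound $(\|x-y\|-16\delta)/r_C$ with the upper bound $3L\delta$ for a few arbitrary parameter values; the values are illustrative assumptions and are not tied to any particular set $C$, and the function name `bounds` is hypothetical.

```python
def bounds(L, r_C, dist_xy, shrink=0.99):
    """Return (lower bound on the output gap, upper bound on L times the input gap)."""
    # shrink < 1 keeps delta strictly below the threshold used in the text.
    delta = shrink * min(1 / 32, 1 / (6 * L * r_C)) * dist_xy
    output_gap_lower = (dist_xy - 16 * delta) / r_C
    scaled_input_gap_upper = L * 3 * delta
    return output_gap_lower, scaled_input_gap_upper

# Arbitrary illustrative values for L, r_C, and ||x - y||.
checks = [
    bounds(L, r_C, d)
    for L in (0.01, 1.0, 100.0)
    for r_C in (0.1, 1.0, 10.0)
    for d in (0.05, 1.0, 2.0)
]
all_strict = all(lower > upper for lower, upper in checks)
```

The reason it works: $\delta<\|x-y\|/32$ forces the lower bound above $\|x-y\|/(2r_C)$, while $\delta<\|x-y\|/(6Lr_C)$ forces $3L\delta$ below the same quantity.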
Best Answer
(1) A necessary condition: For every $x\in\partial C$, it holds that the polar cone of $C-x$ is one-dimensional.
(The polar cone is always at least one-dimensional by part 4 of Prop 4 of these lecture notes.)
Suppose there exists $x\in\partial C$ such that the polar cone of $C-x$ has dimension at least $2$, and select distinct unit vectors $v$ and $w$ from this polar cone. Then $u(x+\epsilon v)=v$ and $u(x+\epsilon w)=w$ for every $\epsilon>0$. (This is part 2 of the same Prop 4.) Sending $\epsilon\to0$ gives arbitrarily close inputs whose outputs are a fixed distance apart, thereby breaking Lipschitz.
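For a concrete instance of this obstruction, consider the square $C=[-1,0]^2$ with the corner $x=(0,0)$, where the polar cone of $C-x$ is the nonnegative quadrant. This example, the vectors $v=(1,0)$ and $w=(0,1)$, and the helper names below are my own illustrative assumptions.

```python
import math

def project_to_square(z):
    """Nearest point of the square [-1, 0]^2 to z (coordinatewise clamping)."""
    return (min(max(z[0], -1.0), 0.0), min(max(z[1], -1.0), 0.0))

def u(z):
    """Unit vector pointing from the nearest point of the square to z, for z outside it."""
    c = project_to_square(z)
    d = (z[0] - c[0], z[1] - c[1])
    norm = math.hypot(d[0], d[1])
    return (d[0] / norm, d[1] / norm)

def dist(a, b):
    """Euclidean distance in the plane."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Two distinct unit vectors in the polar cone at the corner (0, 0).
v, w = (1.0, 0.0), (0.0, 1.0)

def difference_quotient(eps):
    """Ratio ||u(eps v) - u(eps w)|| / ||eps v - eps w|| for eps > 0."""
    a = (eps * v[0], eps * v[1])
    b = (eps * w[0], eps * w[1])
    return dist(u(a), u(b)) / dist(a, b)

quotients = {eps: difference_quotient(eps) for eps in (0.1, 0.01, 0.001)}
```

Here $u(\epsilon v)=v$ and $u(\epsilon w)=w$, so the quotient equals $1/\epsilon$ and no Lipschitz constant can work.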
(2) A sufficient condition: $C$ is a compact sublevel set $\{x\in\mathbb{R}^n:f(x)\leq y_0\}$ of some twice continuously differentiable convex function $f\colon\mathbb{R}^n\to\mathbb{R}$ and $y_0>\min(f)$.
Indeed, since $y_0>\min(f)$, it holds that $\nabla f(x)\neq0$ for every $x\in\partial C$, and by compactness, there exists $\delta>0$ such that $\|\nabla f(x)\|>\delta$ for all $x\in\partial C$. Since $x\mapsto \nabla f(x)$ is Lipschitz on $\partial C$ (again, by compactness), it follows that the mapping $n\colon\partial C\to S^{n-1}$ defined by $n(x):=\nabla f(x)/\|\nabla f(x)\|$ is also Lipschitz. Finally, the nearest point mapping $c\colon(\mathbb{R}^n\setminus C)\to\partial C$ is $1$-Lipschitz, and so the composition $u=n\circ c$ is Lipschitz.
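The simplest instance of this sufficient condition is the closed unit disk in $\mathbb{R}^2$, obtained from $f(x)=\|x\|^2$ and $y_0=1$. The sketch below, which is only an illustration under that assumed choice, compares $u$ computed directly from the nearest point with the composition $n\circ c$, and estimates the Lipschitz quotient on a small grid of sample points outside the disk.

```python
import math

def c(z):
    """Nearest point of the unit disk to a point z outside it."""
    norm = math.hypot(z[0], z[1])
    return (z[0] / norm, z[1] / norm)

def n(x):
    """Normalized gradient of f(x) = ||x||^2, i.e. 2x / ||2x||."""
    g = (2.0 * x[0], 2.0 * x[1])
    norm = math.hypot(g[0], g[1])
    return (g[0] / norm, g[1] / norm)

def u_direct(z):
    """Unit vector pointing from c(z) to z."""
    cz = c(z)
    d = (z[0] - cz[0], z[1] - cz[1])
    norm = math.hypot(d[0], d[1])
    return (d[0] / norm, d[1] / norm)

def dist(a, b):
    """Euclidean distance in the plane."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Sample points strictly outside the disk: three radii, twelve angles each.
points = [
    (r * math.cos(2 * math.pi * k / 12), r * math.sin(2 * math.pi * k / 12))
    for r in (1.25, 2.0, 4.0)
    for k in range(12)
]

# u agrees with the composition n(c(z)) on the samples.
max_mismatch = max(dist(u_direct(z), n(c(z))) for z in points)

# Largest observed difference quotient of u over distinct sample pairs.
max_quotient = max(
    dist(u_direct(a), u_direct(b)) / dist(a, b)
    for i, a in enumerate(points)
    for b in points[i + 1:]
)
```

In this example $u(z)=z/\|z\|$, which is $1$-Lipschitz outside the disk, so `max_quotient` should not exceed $1$.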