The only assumption on $f$ which is identifiable in your question seems to be that it has a directional derivative at some point $w_0 \in l$ in the direction of a unit tangent $u$ to $l$ at $w_0$. Let us call this the minimal assumption. We shall see that it is too weak to get the desired result.
For a general function $f$ the level set $l = f^{-1}(k)$ may be any subset of $\mathbb R^2$. Even if $f$ is assumed to be continuous, $l$ may be any closed subset of $\mathbb R^2$, thus it still may be really weird. We must not think it is a $1$-dimensional $C^1$-submanifold of $\mathbb R^2$. If it is, this would be a strong additional assumption on $f$. A convenient condition assuring this is that $f$ is smooth and $k$ is a regular value of $f$. However, this would be far too restrictive.
Let us first clarify what it means to consider a tangent vector to $l$. In the general case the only reasonable interpretation seems to be the following:
Let $w_0 = (x_0,y_0) \in l$ and $\phi : (-a,a) \to \mathbb R^2$ be a $C^1$-curve such that $\phi(0) = w_0$ and $\phi((-a,a)) \subset l$. Then $\phi'(0)$ is a tangent vector to $l$ at $w_0$. If $\phi'(0) \ne 0$ we may re-parameterize $\phi$ to get a unit tangent vector. There may be many unit tangent vectors to $l$ at $w_0$. But if $l$ is a $1$-dimensional $C^1$-submanifold of $\mathbb R^2$, there is only one up to sign. It may also happen that zero is the only tangent vector, for example if $w_0$ is an isolated point of $l$.
The following two examples show that the minimal assumption is too weak.
Example 1: $$f(x,y) = \begin{cases}
0 & y = x^2 \\
x & y = 0 \\
1 & \text{else}
\end{cases}$$
This is a non-continuous function whose level set $l = f^{-1}(0)$ is the parabola $y = x^2$ (in particular it is a $1$-dimensional $C^1$-submanifold of $\mathbb R^2$). A unit tangent vector to $l$ at $(0,0)$ is $(1,0)$, and $f$ has a directional derivative at $(0,0)$ in direction $(1,0)$. We have $f(x,0) = x$, thus this directional derivative has the value $1 \ne 0$.
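A quick numerical sanity check of Example 1, as a minimal sketch: the helper names `f` and `difference_quotient` and the sample step sizes are my own choices for illustration. It evaluates the difference quotient along $(1,0)$ at the origin and confirms that $f$ vanishes at a few points of the parabola.

```python
def f(x, y):
    """Example 1: 0 on the parabola y = x^2, x on the x-axis, 1 elsewhere."""
    if y == x * x:
        return 0.0
    if y == 0:
        return float(x)
    return 1.0


def difference_quotient(h):
    """(f(h, 0) - f(0, 0)) / h, the quotient defining the derivative along (1, 0)."""
    return (f(h, 0.0) - f(0.0, 0.0)) / h


# Step sizes are arbitrary illustrative values, positive and negative.
quotients = [difference_quotient(h) for h in (0.1, -0.1, 1e-3, -1e-3, 1e-6)]

# Sample points on the level set l = {y = x^2}.
on_parabola = [f(t, t * t) for t in (-2.0, -0.5, 0.0, 0.5, 2.0)]

print(quotients)
print(on_parabola)
```

Every quotient equals $1$ because $f(h,0) = h$ for $h \ne 0$, while the sampled values on the parabola are all $0$.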
Example 2: Let $\Delta$ be the set of all $(x,y)$ such that $\lvert y \rvert = x^2$. This "double" parabola is not a $1$-dimensional $C^1$-submanifold of $\mathbb R^2$.
$$f(x,y) = \begin{cases}
x - \frac{\lvert y \rvert}{x} & x \ne 0, \lvert y \rvert \le x^2 \\
0 & x = y = 0 \\
d(\Delta,(x,y)) & \text{else}
\end{cases}$$
Here $d(\Delta,(x,y))$ denotes the Euclidean distance of $(x,y)$ to $\Delta$. Clearly $f$ is a continuous function whose level set $f^{-1}(0)$ is $\Delta$. It has two unit tangent vectors at $(0,0)$. Taking $(1,0)$, we see that $f$ has a directional derivative at $(0,0)$ in this direction. We have $f(x,0) = x$, thus this directional derivative has the value $1 \ne 0$.
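The same kind of check works for Example 2. This is only a sketch under a stated simplification: the hypothetical helper `f_between` implements $f$ on the closed region $\lvert y \rvert \le x^2$ (including the origin) and deliberately omits the distance branch, since the check only needs points on the $x$-axis and on $\Delta$.

```python
def f_between(x, y):
    """Example 2 restricted to the region |y| <= x^2 (origin included).

    The distance branch of f is left out on purpose: the checks below
    never evaluate f outside this region.
    """
    if x == 0.0 and y == 0.0:
        return 0.0
    if x != 0.0 and abs(y) <= x * x:
        return x - abs(y) / x
    raise ValueError("point lies outside the region |y| <= x^2")


# Difference quotients at the origin in direction (1, 0); steps are illustrative.
steps = (0.1, -0.1, 1e-3, -1e-3, 1e-6)
quotients = [(f_between(h, 0.0) - f_between(0.0, 0.0)) / h for h in steps]

# Points on both branches y = x^2 and y = -x^2 of Delta.
on_delta = [
    f_between(t, sign * t * t)
    for t in (-2.0, -0.5, 0.5, 2.0)
    for sign in (1.0, -1.0)
]

print(quotients)
print(on_delta)
```

The quotients are $1$ although $(1,0)$ is tangent to $\Delta$ at the origin, and $f$ vanishes at the sampled points of $\Delta$.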
I think these examples show that some differentiability assumption is necessary to obtain the desired result.
So let us assume that $f$ is differentiable at $w_0$ with derivative $df(w_0)$ (which is a linear map).
Then the directional derivative of $f$ at $w_0$ in any direction $\omega \in \mathbb R^2$ exists and has the value $df(w_0)(\omega)$. This is of course much stronger than assuming that some directional derivative of $f$ exists.
Now let $\phi : (-a,a) \to \mathbb R^2$ be a $C^1$-curve such that $\phi(0) = w_0$ and $\phi((-a,a)) \subset l$. Its tangent vector at $w_0$ is $\omega = \phi'(0)$. We claim that $df(w_0)(\omega) = 0$, which is the desired result. It is trivial for $\omega = 0$. If $\omega \ne 0$, then $\frac{\phi(t) - \phi(0)}{t} \to \omega \ne 0$ as $t \to 0$, so there is an $\epsilon > 0$ such that $\lVert \frac{\phi(t) - \phi(0)}{t} \rVert > 0$ for $0 < \lvert t \rvert < \epsilon$. Thus $\phi(t) \ne \phi(0)$ for $0 < \lvert t \rvert < \epsilon$. We know that
$$\lim\limits_{w \to w_0} \frac {f(w) - f(w_0) - df(w_0)(w-w_0)} {\lVert w - w_0 \rVert} = 0 .$$
Since $\phi(t) \to w_0$ as $t \to 0$ and $f(\phi(t)) = f(\phi(0)) = k$ for all $t$, this implies
$$\lim\limits_{t \to 0} df(w_0)\left(\frac{\phi(t) -\phi(0)}{\lVert \phi(t) -\phi(0) \rVert} \right) = -\lim\limits_{t \to 0} \frac {f(\phi(t)) - f(\phi(0)) - df(w_0)(\phi(t) -\phi(0))} {\lVert \phi(t) -\phi(0) \rVert} = 0 .$$
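We know that $\lim\limits_{t \to 0} \frac{\phi(t) -\phi(0)}{t} = \omega$, thus $\lim\limits_{t \to 0} \lVert \frac{\phi(t) -\phi(0) }{t} \rVert = \lVert \omega \rVert$. For $t > 0$ we have $\frac{\phi(t) -\phi(0)}{\lVert \phi(t) -\phi(0) \rVert} = \frac{(\phi(t) -\phi(0))/t}{\lVert (\phi(t) -\phi(0))/t \rVert}$, so $\lim\limits_{t \to 0^+} \frac{\phi(t) -\phi(0)}{\lVert \phi(t) -\phi(0) \rVert} = \frac{\omega}{\lVert \omega \rVert}$. (For $t \to 0^-$ the limit is $-\frac{\omega}{\lVert \omega \rVert}$, which is why we pass to the right-hand limit.) Since the linear map $df(w_0)$ is continuous, we get
$$0 = \lim\limits_{t \to 0^+} df(w_0)\left(\frac{\phi(t) -\phi(0)}{\lVert \phi(t) -\phi(0) \rVert} \right) = df(w_0)\left(\lim\limits_{t \to 0^+}\frac{\phi(t) -\phi(0)}{\lVert \phi(t) -\phi(0) \rVert} \right) = df(w_0)\left(\frac{\omega}{\lVert \omega \rVert}\right) = \frac{1}{\lVert \omega \rVert}df(w_0)(\omega) .$$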
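To see the result in a concrete case, here is a small numerical sketch. Everything in it is an illustrative assumption of mine, not taken from the question: the differentiable function $f(x,y) = xy$, the level $k = 1$, the curve $\phi(t) = (e^t, e^{-t})$ inside $f^{-1}(1)$, the base point $t_0 = 0.3$, and the finite-difference step.

```python
import math


def f(x, y):
    """Illustrative differentiable function; its level set f = 1 is a hyperbola."""
    return x * y


def phi(t):
    """C^1 curve lying in the level set f = 1, since exp(t) * exp(-t) = 1."""
    return (math.exp(t), math.exp(-t))


def df_at(w, v, h=1e-6):
    """Central-difference approximation of df(w)(v)."""
    forward = f(w[0] + h * v[0], w[1] + h * v[1])
    backward = f(w[0] - h * v[0], w[1] - h * v[1])
    return (forward - backward) / (2.0 * h)


def tangent(t0, h=1e-6):
    """Central-difference approximation of phi'(t0)."""
    a, b = phi(t0 + h), phi(t0 - h)
    return ((a[0] - b[0]) / (2.0 * h), (a[1] - b[1]) / (2.0 * h))


t0 = 0.3                       # arbitrary base point on the curve
w0 = phi(t0)
omega = tangent(t0)            # tangent vector to the level set at w0

along_tangent = df_at(w0, omega)          # expected to be close to 0
across_level = df_at(w0, (1.0, 1.0))      # a non-tangent direction, for contrast

print(along_tangent, across_level)
```

The derivative along the tangent direction is zero up to discretization and rounding error, while the derivative in the non-tangent direction $(1,1)$ is clearly nonzero.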
Best Answer
$\def\R{\mathbb{R}}\def\d{\mathrm{d}}$For your first question, differentiating the relation $g(x, y) = 0$ along $C$ gives $$ \frac{∂g}{∂x}(x, y) \,\d x + \frac{∂g}{∂y}(x, y) \,\d y = 0, $$ so if $∇g(x_0, y_0) ≠ (0, 0)$, then the tangent line of $C$ at $P_0$ is $$ \frac{∂g}{∂x}(x_0, y_0) (x - x_0) + \frac{∂g}{∂y}(x_0, y_0) (y - y_0) = 0. $$
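As a worked instance of this formula (the circle and the point are my own illustrative choice, not taken from the question), take $g(x, y) = x^2 + y^2 - 25$ and $P_0 = (3, 4)$. Then $g(3, 4) = 0$ and $\nabla g(3, 4) = (6, 8) \ne (0, 0)$, so the tangent line of $C$ at $P_0$ is
$$ 6 (x - 3) + 8 (y - 4) = 0, \quad\text{i.e.}\quad 3x + 4y = 25. $$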
For your second question, it is indeed not necessary to require $∇g(x_0, y_0) ≠ (0, 0)$ to ensure that $C$ has a tangent line at $(x_0, y_0)$. For example, if $g(x, y) = (x^2 - y)^2$ for $(x, y) \in \R^2$ and $P_0 = (0, 0)$, then $C = \{(t, t^2) \mid t \in \R\}$ has the tangent line $y = 0$ at $P_0$, but $∇g(0, 0) = (0, 0)$.
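A short check of this example, as a sketch with hypothetical helper names `g` and `grad_g` (the gradient is written out by hand as $2(x^2 - y)\,(2x, -1)$):

```python
def g(x, y):
    """g(x, y) = (x^2 - y)^2, whose zero set C is the parabola y = x^2."""
    return (x * x - y) ** 2


def grad_g(x, y):
    """Hand-computed gradient: 2 (x^2 - y) * (2x, -1)."""
    c = 2.0 * (x * x - y)
    return (c * 2.0 * x, -c)


# g vanishes at sample points (t, t^2) of the parabola.
curve_values = [g(t, t * t) for t in (-1.5, -0.5, 0.0, 0.5, 1.5)]

# The gradient vanishes at P_0 = (0, 0), yet C has the tangent line y = 0 there.
gradient_at_origin = grad_g(0.0, 0.0)

print(curve_values, gradient_at_origin)
```

In fact the factor $x^2 - y$ makes the gradient vanish at every point of $C$, not only at the origin.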
However, the condition $∇g(x_0, y_0) ≠ (0, 0)$ is usually imposed to ensure that $C$ does not bifurcate at $P_0$. When this condition is satisfied, at least one of the following propositions holds by the implicit function theorem:

1. There exist a neighborhood $U$ of $x_0$ and a $C^1$ function $φ: U \to \R$ with $φ(x_0) = y_0$ such that, near $P_0$, $C$ coincides with the graph $\{(x, φ(x)) \mid x \in U\}$.
2. There exist a neighborhood $V$ of $y_0$ and a $C^1$ function $ψ: V \to \R$ with $ψ(y_0) = x_0$ such that, near $P_0$, $C$ coincides with the graph $\{(ψ(y), y) \mid y \in V\}$.

which implies that $P_0$ is not a self-intersection point of $C$. For example, if$$ g(x, y) = (x^2 + y^2)^2 - (x^2 - y^2),\quad \forall (x, y) \in \R^2 $$ and $P_0 = (0, 0)$, then $∇g(0, 0) = (0, 0)$ and $C$ self-intersects at $P_0$.
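The self-intersection can be checked numerically. This is a sketch under my own assumptions: the polar description $r^2 = \cos 2\theta$ of this curve (which follows from substituting $x = r\cos\theta$, $y = r\sin\theta$ into $g = 0$), the helper names, and the small angle offset `eps`.

```python
import math


def g(x, y):
    """g(x, y) = (x^2 + y^2)^2 - (x^2 - y^2)."""
    return (x * x + y * y) ** 2 - (x * x - y * y)


def grad_g(x, y):
    """Hand-computed gradient of g."""
    r2 = x * x + y * y
    return (4.0 * x * r2 - 2.0 * x, 4.0 * y * r2 + 2.0 * y)


def curve_point(theta):
    """Point of C in polar form r^2 = cos(2 theta); needs cos(2 theta) >= 0."""
    r = math.sqrt(math.cos(2.0 * theta))
    return (r * math.cos(theta), r * math.sin(theta))


eps = 1e-3  # illustrative offset from the angles +pi/4 and -pi/4
p_up = curve_point(math.pi / 4 - eps)      # close to the origin on one branch
p_down = curve_point(-math.pi / 4 + eps)   # close to the origin on the other

slope_up = p_up[1] / p_up[0]
slope_down = p_down[1] / p_down[0]

print(grad_g(0.0, 0.0), slope_up, slope_down)
```

Both points lie on $C$ close to the origin, but they approach it along directions with slopes near $+1$ and $-1$, so two branches of $C$ cross at $P_0$, where the gradient vanishes.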