Suppose $f \in W^{1,1}_{loc}(U)$. Then no, since for such an $f$, we have that $Df$ exists and the approximate limit
$ap\lim_{y\to x} \frac{f(x)-f(y)-Df(x)(x-y)}{|x-y|} = 0$
exists for almost every $x$, while from assuming classical differentiability we have
$\lim_{y\to x} \frac{f(x)-f(y)-\nabla f(x)(x-y)}{|x-y|} = 0$
exists for every $x \in U$. In particular, the classical differential is a candidate for the approximate differential, and so $Df=\nabla f$ wherever the two exist, and hence in $U$ up to a set of measure zero.
http://www.encyclopediaofmath.org/index.php/Approximate_differentiability
No. Take a dense countable set $\{x_1,x_2,\dots\}$ in $\mathbb{R}^d$ and a sequence $(r_i)\subseteq\mathbb{R}^+$ such that $\sum_i r_i^{d-1}<\infty$. Then the function
$$f=1_{\bigcup_{i=1}^\infty B_{r_i}(x_i)}$$
is in $BV(\mathbb{R}^d)$ (since $|\bigcup B_{r_i}(x_i)|\le C\sum_i r_i^d$ and $f$ is the limit in $L^1$ of the functions $1_{\bigcup_{i=1}^k B_{r_i}(x_i)}$, whose gradients have total variation bounded by $C\sum_i r_i^{d-1}<\infty$).
Now, for any Lebesgue point $x_0$ of $f$ in the closed set $\{f=0\}$, no representative $g$ is continuous at $x_0$ (representative means a function which coincides a.e. with $f$). Indeed, $x_0$ lies in the closure of the open set $\bigcup B_{r_i}(x_i)$, so it belongs to the closure of $\{g=1\}$. On the other hand, since $x_0$ is a Lebesgue point for $f$, it must also belong to the closure of $\{g=0\}$. This shows that $g$ is not even a.e. continuous (since the set $\{f=0\}$ has positive measure).
Addendum.
The answer is still no even assuming $f$ continuous. Below I construct an example where the differentiability of $f$ fails on a Borel set of positive measure.
Choose a countable dense set $\{x_i\}$ in $B_1(0)$ and a sequence $r_i>0$ such that $\sum_i r_i^{d-1}<\infty$ and $\sum_i|B_{r_i}(x_i)|<|B_1(0)|$. In particular, $r_i\to 0$. Using Besicovitch covering theorem, up to a subsequence we can assume that the balls $B_{r_i}(x_i)$ have bounded overlapping (i.e. any point lies in at most $N$ such balls); in doing this, we could lose the density of $\{x_i\}$ but we still have $B_1(0)\subseteq\overline{\cup_i B_{r_i}(x_i)}$.
Let $S:=B_1(0)\setminus\cup_i\overline{B_{r_i}(x_i)}$. We remark that $|S|>0$ and that, for any $N\ge 1$, $S\subseteq\overline{\{x_i\mid i>N\}}$. It follows that we can find a sequence of positive radii $R_i\to 0$ such that
$S\subseteq\cup_{i\ge j}B_{R_i}(x_i)$ for all $j\ge 1$: by compactness, we can find $n_1>0$ such that
$$S\subseteq\overline{\{x_i\mid i>0\}}\subseteq\cup_{i=1}^{n_1}B_1(x_i),$$
then we use the above remark with $N=n_1$ and we find $n_2>n_1$ such that
$$S\subseteq\overline{\{x_i\mid i>n_1\}}\subseteq\cup_{i=n_1+1}^{n_2}B_{1/2}(x_i),$$
and so on.
Now the function $f(x):=\sum_i R_i(1-r_i^{-1}|x-x_i|)^+$ (a superposition of 'traffic cones' with heights $R_i$ placed on our balls $B_{r_i}(x_i)$) is continuous, as it is a uniform limit of continuous functions, thanks to the bounded overlapping. It also lies in $BV(\mathbb{R}^d)$ thanks to the assumption $\sum_i r_i^{d-1}<\infty$.
I claim that $f$ cannot be differentiable at $x$, for any $x\in S$. Indeed, as $x$ is a minimum point for $f$, we would have $\nabla f(x)=0$. But there is a sequence
$i_k\to\infty$ with $x\in B_{R_{i_k}}(x_{i_k})$, so $f(x_{i_k})\ge R_{i_k}\ge|x-x_{i_k}|$, which contradicts $\nabla f(x)=0$ (since $x_{i_k}\to x$).
Best Answer
Your formula can be wrong even if $f$ and $g$ are both Lipschitz. For criteria when such a result holds (and related results) see e.g. Leoni, Giovanni, Morini, Massimiliano: Necessary and sufficient conditions for the chain rule in $W^{1,1}_{loc}(ℝ^N;ℝ^d)$ and $BV_{loc}(ℝ^N;ℝ^d)$. J. Eur. Math. Soc. (JEMS) 9 (2007). https://mathscinet.ams.org/mathscinet-getitem?mr=2293955
They give the counterexample $g(y_1,y_2)=\max(y_1,y_2)$, $f(x)=(x,x)$, where the right hand side is nowhere defined as $f'(x)=(1,1)$ everywhere but $g$ is not differentiable at $y_1=y_2$.