The total derivative is only a piece of notation to overcome some difficulties when dealing with Leibniz notation.
Consider the function $f(x,y)=x^2+y$. If you agree that $x$ and $y$ in the definition of $f$ are just placeholders you should agree that $f(y,x)=y^2+x$ and that $f(t,t^2)=2t^2$.
Now the question is: what does $\partial f / \partial x$ mean? Written like that, one would interpret $\partial/\partial x$ as the derivative of $f$ with respect to the first variable. Notice that the variable $x$ is then no longer a simple placeholder but has a conventional meaning.
For example consider the following:
$$
\frac{\partial f(y,x)}{\partial x}.
$$
Now the interpretation is not clear: do you mean the derivative with respect to the first or the second variable?
Mixing variables like that is not good, but sometimes one should be prepared to resolve the ambiguity. Consider a function $f(x,t)$ which represents a quantity that depends on space $x$ and time $t$. It is then understood that $\partial f/\partial t$ is the derivative with respect to the second component (which is time). Suppose now that you have a particle which moves according to the law $x=t^2$. If you evaluate the function $f$ on the particle you get
$$
f(x(t),t)
$$
and if you want to compute the derivative of this function you can use the chain rule and obtain:
$$
\frac{d}{dt} f(x(t),t) = \frac{\partial}{\partial x} f(x(t),t)\cdot x'(t) + \frac{\partial }{\partial t} f(x(t),t).
$$
Now the point is that it is often useful to shorten the notation by writing $x$ in place of $x(t)$ and $f$ in place of $f(x,t)$, so that the previous formula can be written as
$$
\frac{d}{dt} f(x,t) = \frac{\partial}{\partial x} f(x,t) \cdot x'(t) + \frac{\partial }{\partial t} f(x,t)
$$
or
$$
\frac{df}{dt} = \frac{\partial f}{\partial x} \cdot x' + \frac{\partial f}{\partial t}.
$$
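Now you see that $d/dt$ and $\partial/\partial t$ assume different meanings: $\partial/\partial t$ differentiates $f$ with respect to its second argument while the first is held fixed, whereas $d/dt$ differentiates the composite function $t \mapsto f(x(t),t)$.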
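To make the distinction concrete, here is a small numerical sketch. It assumes the illustrative choice $f(x,t) = x^2 + t$ (the earlier example with $y$ renamed to $t$) together with the law $x = t^2$ from above; the helper names `central_diff`, `partial_x`, `partial_t`, `total_derivative` and `chain_rule_rhs` are made up for this illustration. All derivatives are approximated by central differences, so the printed values are only approximate.

```python
def f(x, t):
    # Illustrative field (an assumption for this sketch): f(x, t) = x^2 + t.
    return x**2 + t


def x_of_t(t):
    # Law of motion of the particle, x = t^2, as in the text.
    return t**2


def central_diff(g, s, h=1e-6):
    # Central-difference approximation of g'(s) for a one-variable function g.
    return (g(s + h) - g(s - h)) / (2 * h)


def partial_x(x, t):
    # Derivative with respect to the first slot, second slot held fixed.
    return central_diff(lambda s: f(s, t), x)


def partial_t(x, t):
    # Derivative with respect to the second slot, first slot held fixed.
    return central_diff(lambda s: f(x, s), t)


def total_derivative(t):
    # Derivative of the composite function t -> f(x(t), t).
    return central_diff(lambda s: f(x_of_t(s), s), t)


def chain_rule_rhs(t):
    # Right-hand side of the chain rule: f_x * x'(t) + f_t at (x(t), t).
    x = x_of_t(t)
    return partial_x(x, t) * central_diff(x_of_t, t) + partial_t(x, t)


t0 = 1.5
print("partial f / partial t:", partial_t(x_of_t(t0), t0))  # close to 1
print("d f / d t            :", total_derivative(t0))       # close to 4*t0**3 + 1
print("chain rule           :", chain_rule_rhs(t0))         # agrees with d f / d t
```

Here $f(x(t),t) = t^4 + t$, so $df/dt = 4t^3 + 1$, while $\partial f/\partial t = 1$ everywhere: the two symbols give different numbers at the same point.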
In the standard derivation of the Jacobian, there is no need to consider the particular basis of the domain and codomain under consideration. So, the literal answer to "where do we use the fact that these bases are standard" is "nowhere".
I think that your implicit question is the following: given bases $\mathcal B_1 = \{v_1,\dots,v_m\}$ of $\Bbb R^m$ and $\mathcal B_2 = \{w_1,\dots,w_n\}$ of $\Bbb R^n$, how would we compute the matrix of $df(x)$ relative to these bases? With that, it is easy to say why it is that the standard basis yields the Jacobian matrix.
When discussing the entries of a matrix, it is convenient to introduce dual bases. We say that a set $\mathcal B_2^* = \{\beta_1,\dots,\beta_n\}$ of linear functions $\beta_i:\Bbb R^n \to \Bbb R$ is the dual basis to $\mathcal B_2$ if we have
$$
\beta_i(w_j) = \begin{cases}
1 & i=j\\
0 & i \neq j.
\end{cases}
$$
The role of the dual basis is to extract the individual components of a vector in $\Bbb R^n$ corresponding to the elements of $\mathcal B_2$. In other words, if $w = x_1 w_1 + \cdots + x_n w_n$, then $\beta_i(w) = x_i$ (for $i = 1,\dots,n$).
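As a small worked example (the basis here is chosen only for illustration and does not come from the question), take $n = 2$ with $w_1 = (1,0)$ and $w_2 = (1,1)$. Every vector decomposes as
$$
(a,b) = (a-b)\,w_1 + b\,w_2,
$$
so the dual basis is $\beta_1(a,b) = a - b$ and $\beta_2(a,b) = b$. One checks directly that $\beta_1(w_1) = 1$, $\beta_1(w_2) = 0$, $\beta_2(w_1) = 0$ and $\beta_2(w_2) = 1$.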
For a linear map $L:\Bbb R^m \to \Bbb R^n$, the $i,j$ entry of the matrix $[L]^{\mathcal B_1}_{\mathcal B_2}$ (of $L$ relative to the bases $\mathcal B_1, \mathcal B_2$) is given by $\beta_i(L(v_j)).$
Note that the directional derivative of $f$ along the vector $v$ is given by the limit
$$
D_vf(x) = \lim_{t \to 0} \frac{f(x + tv) - f(x)}{t}
$$
Usually, when this concept is introduced, $f$ is assumed to be scalar-valued and $v$ to be a unit vector, but neither assumption is needed in our case. From your definition, we are given that
$$
f(x+h) - f(x) = df(x)h + o(\|h\|),
$$
so the above limit becomes
$$
D_vf(x) = \lim_{t \to 0} \frac{f(x + tv) - f(x)}{t} = \lim_{t \to 0}\frac {t\,df(x)(v) + o(t)}{t} = df(x)(v).
$$
With that stated, the $i,j$ entry $m_{ij}$ of $M = [df(x)]^{\mathcal B_1}_{\mathcal B_2}$ is given by
$$
m_{ij} = \beta_i(df(x)(v_j)) = \beta_i[D_{v_j}f(x)]
$$
So, what's special about the standard bases? If we take $\mathcal B_1$ to be the standard basis, then the directional derivative $D_{v_j}f(x)$ is simply the partial derivative of $f$ with respect to $x_j$. That is, we have
$$
m_{ij} = \beta_i[D_{v_j}f(x)] = \beta_i[\partial_j f].
$$
If we then take $\mathcal B_2$ to be the standard basis, then the dual basis $\mathcal B_2^*$ is simply the set of component functions $\beta_i(x_1,\dots,x_n) = x_i$. That is, we have
$$
m_{ij} = \beta_i[\partial_j f] = \partial_j f^i.
$$
So, when $\mathcal B_1,\mathcal B_2$ are the standard bases, then $[df(x)]_{\mathcal B_2}^{\mathcal B_1}$ is the Jacobian matrix of $f$.
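The recipe $m_{ij} = \beta_i[D_{v_j}f(x)]$ can be sketched numerically. The map `f`, the point `p`, the bases and every helper name below are assumptions made up for this illustration (a map $\Bbb R^2 \to \Bbb R^2$, so that the dual basis can be written down with the $2 \times 2$ inverse formula). Directional derivatives are approximated by central differences, so the entries are only approximate.

```python
def f(p):
    # Illustrative map R^2 -> R^2 (an assumption): f(x, y) = (x^2 + y, x*y).
    x, y = p
    return (x**2 + y, x * y)


def directional_derivative(func, p, v, h=1e-6):
    # Central-difference approximation of D_v func(p); v need not be a unit vector.
    fwd = func((p[0] + h * v[0], p[1] + h * v[1]))
    bwd = func((p[0] - h * v[0], p[1] - h * v[1]))
    return tuple((a - b) / (2 * h) for a, b in zip(fwd, bwd))


def dual_basis_2d(w1, w2):
    # Dual basis of {w1, w2} in R^2: the rows of the inverse of the matrix [w1 w2].
    det = w1[0] * w2[1] - w2[0] * w1[1]

    def beta1(w):
        return (w2[1] * w[0] - w2[0] * w[1]) / det

    def beta2(w):
        return (-w1[1] * w[0] + w1[0] * w[1]) / det

    return beta1, beta2


def matrix_of_derivative(func, p, domain_basis, codomain_basis):
    # Entry (i, j) is beta_i applied to the directional derivative along v_j.
    betas = dual_basis_2d(*codomain_basis)
    return [
        [beta(directional_derivative(func, p, v)) for v in domain_basis]
        for beta in betas
    ]


p = (1.0, 3.0)
standard = [(1.0, 0.0), (0.0, 1.0)]
B1 = [(1.0, 1.0), (1.0, -1.0)]  # hypothetical basis of the domain
B2 = [(1.0, 0.0), (1.0, 1.0)]   # hypothetical basis of the codomain

# With standard bases this approximates the Jacobian [[2x, 1], [y, x]] at p.
print(matrix_of_derivative(f, p, standard, standard))
# With the other bases the same linear map df(p) gets a different matrix.
print(matrix_of_derivative(f, p, B1, B2))
```

At $p = (1,3)$ the Jacobian is $\begin{pmatrix} 2 & 1 \\ 3 & 1 \end{pmatrix}$, and relative to the two hypothetical bases the matrix works out by hand to $\begin{pmatrix} -1 & -1 \\ 4 & 2 \end{pmatrix}$; the numerical output should agree with both up to discretisation error.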
An alternative approach: as we established above, the $i,j$ entry of the matrix $M = [df(x)]^{\mathcal B_1}_{\mathcal B_2}$ is given by
$$
m_{ij} = \beta_i(df(x)(v_j)) = (\beta_i \circ df(x))(v_j).
$$
In the case where $\mathcal B_2$ is the standard basis, the dual basis elements are simply the standard projection functions $\beta_i = \pi^i$ (as you correctly note in your comment below). I claim that
$$
\pi^i \circ df(x) = df^i(x).
$$
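Since the derivative is unique, it suffices to show that $\pi^i \circ df(x)$ satisfies the requirement from your definition for the function $f^i = \pi^i \circ f$. Applying the linear map $\pi^i$ to $f(x+h) - f(x) = df(x)h + o(\|h\|)$ gives
$$
f^i(x + h) - f^i(x) = (\pi^i \circ f)(x + h) - (\pi^i \circ f)(x) =
(\pi^i \circ df(x))(h) + o(\|h\|).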
$$
From there, we have
$$
m_{ij} = (\beta_i \circ df(x))(v_j) = df^i(x)(v_j).
$$
Now, we can note that for any scalar-valued function $g$, $dg(x)(v)$ is simply the directional derivative of $g$ along $v$. Indeed, we note that
$$
D_vg(x) = \lim_{t \to 0} \frac{g(x + tv) - g(x)}{t} = \lim_{t \to 0}\frac {tdg(x)(v) + o(t)}{t} = dg(x)(v).
$$
Now, if $\mathcal B_1 = \{e_1,\dots,e_m\}$ is the standard basis of $\Bbb R^m$, then we have
\begin{align}
m_{ij} &= df^i(x)(e_j) = D_{e_j}f^i = \partial_j f^i.
\end{align}
By the definition of the derivative we have $$ g(p+u) = g(p)+ Dg_pu + o(\|u\|), $$ where I am using little-o notation for the remainder term.
Now take $u = e_j t$ for $t \neq 0$ and some $j \in \{1, \dots, n\}$. Then (using the above and the linearity of $Dg_p$) $$\lim_{t\to 0} \frac{g(p+e_jt)-g(p)}{t} = Dg_p e_j.$$ Let $i \in\{1, \dots, m\}$, where $f^*_i$ is understood to be the $i$-th coordinate functional on the codomain. Since $f^*_i$ is continuous and linear, we obtain $$ \partial_j g_i (p)= f^*_i Dg_p e_j, $$ and so the matrix representation of $Dg_p$ with respect to the standard bases is the matrix $(\partial_j g_i (p))_{i,j}$, which is the Jacobian.