First step: we need to define the trace of an endomorphism $f:V\to V$ where $\dim V=n$. One way is to take a basis $\beta=\{e_1,\dots, e_n\}$ of $V$, consider the associated matrix $[f]_{\beta}$, and define
\begin{align}
\text{trace}(f):=\text{trace}([f]_{\beta}):=\sum_{i=1}^n([f]_{\beta})_{ii},
\end{align}
i.e. the sum of the diagonal entries of the matrix representation of $f$. The result doesn't depend on the choice of basis: if you use a different basis $\gamma$, then $[f]_{\gamma}=P[f]_{\beta}P^{-1}$ for some invertible matrix $P$, i.e. the two matrices are similar, and the cyclic property of the trace ($\text{trace}(AB)=\text{trace}(BA)$) gives $\text{trace}(P[f]_{\beta}P^{-1})=\text{trace}([f]_{\beta}P^{-1}P)=\text{trace}([f]_{\beta})$, so the trace is well defined. Using the isomorphism $\text{End}(V)\cong T^1_1(V)$, we see that $\text{trace}:\text{End}(V)\to\Bbb{R}$ induces a mapping (which by slight abuse of language we still refer to as 'trace') $\text{trace}:T^1_1(V)\to\Bbb{R}$. If you carry out this isomorphism, you'll see that it amounts to taking a basis $\{e_1,\dots, e_n\}$ of $V$, the dual basis $\{\epsilon^1,\dots, \epsilon^n\}$ of $V^*$, and then the trace of a $(1,1)$ tensor $F$ is
\begin{align}
\text{trace}(F)&=\sum_{i=1}^nF(\epsilon^i,e_i).
\end{align}
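As a quick sanity check of the basis independence, here is a minimal sketch in plain Python with exact arithmetic. The matrices `A` and `P` and the helper names `matmul`, `trace`, `inverse_2x2` are made up for illustration; they are not from any library.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))

def inverse_2x2(P):
    """Inverse of an invertible 2x2 matrix (illustrative helper)."""
    (a, b), (c, d) = P
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# [f]_beta: matrix of f in some basis (arbitrary example values)
A = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]
# change-of-basis matrix (any invertible matrix would do)
P = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(1)]]

# [f]_gamma = P [f]_beta P^{-1}
A_gamma = matmul(matmul(P, A), inverse_2x2(P))

print(trace(A), trace(A_gamma))  # prints: 5 5
```

The entries of `A_gamma` are quite different from those of `A`, but the diagonal sums agree, as the cyclic property predicts.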
So, in the above paragraph, we've defined the trace of a $(1,1)$ tensor. A natural question arises as to whether we can define an analogous operation for higher order tensors. Let $F$ be a $(k+1,l+1)$ tensor in your notation, where $k,l\geq 0$. This means $F$ is a multilinear map $(V^*)^{l+1}\times V^{k+1}\to\Bbb{R}$. Let us fix two integers $i$ and $j$ such that $1\leq i\leq l+1$ and $1\leq j\leq k+1$. We can now define a map $C_{ij}:T^{k+1}_{l+1}(V)\to T^k_l(V)$, which we shall call the '$i,j$ contraction map', whose definition is: you take $\omega^1,\dots, \omega^l\in V^*$ and $v_1,\dots, v_k\in V$, and define $C_{ij}(F)\in T^k_l(V)$ such that its numerical value on these guys is
\begin{align}
\text{trace}\bigg( F(\omega^1,\dots, \omega^{i-1}, \underbrace{\star}_{\text{$i^{th}$ covector slot}}, \omega^i,\dots, \omega^l,
v_1,\dots, v_{j-1},\underbrace{\star}_{\text{$j^{th}$ vector slot}},v_j,\cdots, v_k)\bigg).
\end{align}
In words, we take $k$ vectors $v_1,\dots, v_k$ and $l$ covectors $\omega^1,\dots, \omega^l$, and we feed them into $F$ (which has $l+1$ open slots for covectors and $k+1$ open slots for vectors), leaving the $i^{th}$ covector slot and the $j^{th}$ vector slot empty. With these two slots left open, we now have a $(1,1)$ tensor, so by my first paragraph, you can take the trace and get a number.
So that's the definition. Here I've given this mapping the name $C_{ij}$ to mean the '$i,j$ contraction', but it's also common to call it $\text{tr}_{ij}$ to mean the trace over the $i,j$ slots. Often, we may dispense with notation like $C_{ij}$ or $\text{tr}_{ij}$, and simply say in words "take the trace/contraction of the tensor $F$ over its $i^{th}$ covector and $j^{th}$ vector slots".
For concreteness, let's say $F$ is a $(3,2)$ tensor, meaning a multilinear map $F:V^*\times V^*\times V\times V\times V\to\Bbb{R}$. And say I want to take the trace over the first covector slot and the second vector slot (i.e. $C_{12}$ or $\text{tr}_{12}$). Then, $\text{tr}_{12}(F):V^*\times V\times V\to\Bbb{R}$ is the map such that for all $\omega\in V^*,u,v\in V$,
\begin{align}
(\text{tr}_{12}F)(\omega, u,v)&:=\text{trace}\bigg(F(\star,\omega, u,\star, v)\bigg)=\sum_{i=1}^nF(\epsilon^i,\omega,u,e_i,v)
\end{align}
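In components this is just a sum over a repeated index. Here is a small sketch in plain Python, assuming $\dim V=2$ and storing $F$ as a nested list with `F[a][b][c][d][e]` $=F(\epsilon^a,\epsilon^b,e_c,e_d,e_e)$. The function name `tr12` and all numerical values are made up for illustration. It checks the result on a pure tensor, where the contraction can be computed by hand.

```python
from itertools import product

n = 2  # dim V (small, purely for illustration)

def tr12(F):
    """Contract the 1st covector slot with the 2nd vector slot of a (3,2) tensor.

    F[a][b][c][d][e] = F(eps^a, eps^b, e_c, e_d, e_e); the result G satisfies
    G[b][c][e] = sum_i F[i][b][c][i][e].
    """
    return [[[sum(F[i][b][c][i][e] for i in range(n))
              for e in range(n)]
             for c in range(n)]
            for b in range(n)]

# A pure tensor F = p (x) q (x) A (x) B (x) C with p, q in V and A, B, C in V^*,
# so F(theta1, theta2, x, y, z) = theta1(p) theta2(q) A(x) B(y) C(z).
p, q = [1, 2], [3, 5]
A, B, C = [2, -1], [4, 7], [1, 6]
F = [[[[[p[a] * q[b] * A[c] * B[d] * C[e]
         for e in range(n)] for d in range(n)] for c in range(n)]
      for b in range(n)] for a in range(n)]

G = tr12(F)
pairing = sum(B[i] * p[i] for i in range(n))  # B(p) = 4*1 + 7*2 = 18

# For a pure tensor the contraction is B(p) times q (x) A (x) C.
ok = all(G[b][c][e] == pairing * q[b] * A[c] * C[e]
         for b, c, e in product(range(n), repeat=3))
print(pairing, ok)  # prints: 18 True
```

So on pure tensors, $\text{tr}_{12}$ pairs the vector sitting in the first covector slot with the covector sitting in the second vector slot and leaves the remaining factors alone.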
For a slightly more abstract perspective on traces, see this answer of mine. The point is we can take any number of vector spaces $V_1,\dots, V_p$, and form the tensor product space $V_1\otimes\cdots\otimes V_p$. As long as we have one copy of $V$ and one copy of $V^*$ in the tensor product (i.e. there exist distinct indices $i,j\in\{1,\dots, p\}$ such that $V_i=V_j^*$), we can define a trace/contraction mapping over those spaces, thereby obtaining a linear map $V_1\otimes\cdots\otimes V_p\to V_1\otimes\cdots\otimes\widehat{V_i}\otimes\cdots\otimes \widehat{V_j}\otimes\cdots\otimes V_p$, where the hat means that the space is omitted from the tensor product.
We can generalize this idea further. Suppose $V_1,\dots, V_p$ are any vector spaces. Suppose we fix distinct indices $i,j$, and that we have a bilinear map $\mu:V_i\times V_j\to\Bbb{R}$. Then, we can define a 'contraction with respect to $\mu$' to be the unique linear map $\tilde{\mu}:V_1\otimes\cdots\otimes V_p\to V_1\otimes\cdots\otimes\widehat{V_i}\otimes\cdots\otimes \widehat{V_j}\otimes\cdots\otimes V_p$ such that for all pure tensors, we have
\begin{align}
\tilde{\mu}(v_1\otimes\cdots\otimes v_p)&=\mu(v_i,v_j)\cdot v_1\otimes\cdots
\otimes\widehat{v_i}\otimes\cdots\otimes\widehat{v_j}\otimes\cdots\otimes v_p.
\end{align}
The previous paragraph was the special case where $\mu:V\times V^*\to\Bbb{R}$ is the evaluation map $\mu(v,\omega)=\omega(v)$.
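Here is a minimal sketch of a contraction with respect to a bilinear map, in plain Python, for $p=3$ and $(i,j)=(1,3)$. The matrix `M` (with `M[a][c]` $=\mu$ evaluated on the $a$-th basis vector of $V_1$ and the $c$-th basis vector of $V_3$), the function name `contract`, and the vectors are all invented for illustration.

```python
def contract(T, M):
    """Contract slots 1 and 3 of a 3-index array T with the bilinear form M.

    T[a][b][c] are the components of an element of V1 (x) V2 (x) V3 and
    M[a][c] is mu on the corresponding basis vectors; the result lies in V2.
    """
    return [sum(M[a][c] * T[a][b][c]
                for a in range(len(T)) for c in range(len(T[0][0])))
            for b in range(len(T[0]))]

# mu: V1 x V3 -> R given by an arbitrary (made-up) matrix
M = [[1, 2], [0, 3]]
v1, v2, v3 = [1, -1], [2, 5, 7], [4, 1]

# components of the pure tensor v1 (x) v2 (x) v3
T = [[[x * y * z for z in v3] for y in v2] for x in v1]

mu_value = sum(M[a][c] * v1[a] * v3[c] for a in range(2) for c in range(2))
result = contract(T, M)
print(mu_value, result)  # prints: 3 [6, 15, 21]
```

The output is $\mu(v_1,v_3)\cdot v_2$, matching the defining formula on pure tensors; linearity then determines $\tilde{\mu}$ on everything else.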
Let $(e_i)$ be a local orthonormal frame (so that $\left< e_i, e_j \right> = \delta_{ij}$). Write $\nabla_{e_i} e_j = \Gamma_{ij}^k e_k$ and $\nabla^{*}_{e_i} e_j = \tilde{\Gamma}_{ij}^k e_k$. Plugging $X = e_i, Y = e_j, Z = e_k$ into the defining equation of the conjugate connection, we get that
$$ 0 = e_i \left< e_j, e_k \right> = \left< \nabla_{e_i} e_j, e_k \right>
+ \left< e_j, \nabla^{*}_{e_i} e_k \right> = \Gamma_{ij}^k + \tilde{\Gamma}_{ik}^j. $$
Hence, if you know the coefficients of $\nabla$ with respect to an orthonormal frame you can easily compute the coefficients of $\nabla^{*}$ with respect to the same frame by the formula $\tilde{\Gamma}_{ik}^j = -\Gamma_{ij}^k$. In particular, you also see that $\nabla^{**} = \nabla$ as expected.
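The index shuffle in $\tilde{\Gamma}_{ik}^j = -\Gamma_{ij}^k$ is easy to get wrong, so here is a small sketch in plain Python. It assumes the coefficients at a single point are stored as `G[i][j][k]` $=\Gamma_{ij}^k$ in an orthonormal frame; the function name `conjugate` and the numbers are made up for illustration.

```python
from itertools import product

n = 2

def conjugate(G):
    """Coefficients of the conjugate connection in an orthonormal frame.

    G[i][j][k] stores Gamma_{ij}^k; the result stores
    tilde Gamma_{ik}^j = -Gamma_{ij}^k at position [i][k][j].
    """
    Gt = [[[0] * n for _ in range(n)] for _ in range(n)]
    for i, j, k in product(range(n), repeat=3):
        Gt[i][k][j] = -G[i][j][k]
    return Gt

# arbitrary (made-up) coefficients of a connection that is not metric
Gamma = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
Gamma_star = conjugate(Gamma)

# coefficients skew in (j, k), as for a metric connection in an orthonormal frame
Skew = [[[0, 1], [-1, 0]], [[0, 2], [-2, 0]]]

# conjugating twice returns the original; a skew Gamma is its own conjugate
print(conjugate(Gamma_star) == Gamma, conjugate(Skew) == Skew)  # prints: True True
```

The second check reflects the fact that a metric-compatible connection (for which $\Gamma_{ij}^k=-\Gamma_{ik}^j$ in an orthonormal frame) coincides with its conjugate.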
Working with an orthonormal frame also lets you see that the conjugate connection $\nabla^{*}$ is in fact equal to the dual connection $\nabla^{\sharp}$ once you identify vectors and covectors using the metric. More explicitly, let $(e^i)$ be the coframe dual to the frame $(e_i)$. Since $(e_i)$ is orthonormal, the dual coframe $(e^i)$ is identified with the frame $(e_i)$ under the musical isomorphisms. The dual connection $\nabla^{\sharp}$ is defined by the similar equation
$$ X \{ \omega, Z \} = \{\nabla^{\sharp}_X(\omega), Z \} + \{ \omega, \nabla_X Z \} $$
where $X,Z$ are vector fields, $\omega$ is a covector field and $\{ \cdot, \cdot \}$ is the natural pairing (i.e. $\{ \omega, Z \} = \omega(Z)$).
Plugging in $X = e_i, Z = e_j$ and $\omega = e^k$ and using the fact that $\{ \omega, Z \} = e^k(e_j) = \delta^k_j$ is constant, one immediately gets that
$$ 0 = \{ \nabla^{\sharp}_{e_i}(e^k), e_j \} + \{ e^k, \nabla_{e_i} e_j \} =
\{ \nabla^{\sharp}_{e_i}(e^k), e_j \} + \Gamma_{ij}^k. $$
Hence
$$\nabla^{\sharp}_{e_i}(e^k) = \{ \nabla^{\sharp}_{e_i}(e^k), e_j \}e^j = -\Gamma_{ij}^k e^j $$
so the coefficients of the dual connection with respect to the coframe $e^i$ are also given by the same formula.
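To spell out the comparison, here is a short worked step. It uses only the two coefficient formulas derived above and the fact that $\flat(e_j)=e^j$ for an orthonormal frame:
$$ \flat\big(\nabla^{*}_{e_i} e_k\big) = \flat\big(\tilde{\Gamma}_{ik}^j e_j\big) = -\Gamma_{ij}^k e^j = \nabla^{\sharp}_{e_i}(e^k) = \nabla^{\sharp}_{e_i}\big(\flat(e_k)\big). $$
So $\flat\circ\nabla^{*}_{e_i}$ and $\nabla^{\sharp}_{e_i}\circ\flat$ agree on the frame fields $e_k$. Since both satisfy the same Leibniz rule in the field being differentiated and both are $C^{\infty}$-linear in the direction, they should then agree on all vector fields.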
Best Answer
First of all, Lee is a pure mathematician, while Nielsen is an applied mathematician. As frequently happens with applied mathematicians, Nielsen is loose with terminology. He also made one typo in the quoted passage. His notation is also different from Lee's (whose notation is more standard).
In more detail: Nielsen does not explain which space the objects called $e^{*i}$ belong to, but from the context they belong to the vector space $V$ (the tangent space) and not to $V^*$ (the cotangent space). I assume he also uses (elsewhere in the book?) genuine covectors $e^i\in V^*$. In his picture, however, he forgot to put $*$ in $e^i$ (this would be a typo). The relation between $e^{*i}\in V$ and $e^i\in V^*$ then would be that $$ \langle e^{*i}, v\rangle= e^i(v)$$ for every $v\in V$. However, it is also possible that he simply refuses to use actual covectors (differential 1-forms) in the book and leaves it unexplained what the symbols $dx^i$ stand for (if he ever uses differential forms in the book).
He notes (correctly) that when $$ v=\sum_i v^i e_i= \sum_i v_ie^{*i}, $$ one can recover the coefficients $v^i$ and $v_i$ using the inner product in $V$: $$ v^i= \langle v, e^{*i}\rangle $$ $$ v_i= \langle v, e_i\rangle $$
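A small numerical sketch of these two formulas, in plain Python with exact fractions. It assumes $V=\Bbb{R}^2$ with the standard dot product standing in for the inner product, and builds the reciprocal basis as $e^{*i}=g^{ij}e_j$ from the inverse Gram matrix; the basis, the vector `v`, and all variable names are made up for illustration and are not taken from Nielsen or Lee.

```python
from fractions import Fraction

def dot(u, v):
    """Standard inner product on R^2 (stand-in for the metric)."""
    return sum(a * b for a, b in zip(u, v))

# a non-orthonormal basis e_1, e_2 of R^2 (arbitrary example values)
e = [[Fraction(1), Fraction(0)], [Fraction(1), Fraction(2)]]

# Gram matrix g_ij = <e_i, e_j> and its inverse g^ij (2x2 case done by hand)
g = [[dot(e[i], e[j]) for j in range(2)] for i in range(2)]
det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
g_inv = [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

# reciprocal basis e^{*i} = g^{ij} e_j, so that <e^{*i}, e_j> = delta^i_j
e_star = [[sum(g_inv[i][j] * e[j][c] for j in range(2)) for c in range(2)]
          for i in range(2)]

v = [Fraction(3), Fraction(4)]
v_up = [dot(v, e_star[i]) for i in range(2)]    # v^i = <v, e^{*i}>
v_down = [dot(v, e[i]) for i in range(2)]       # v_i = <v, e_i>

# both expansions reproduce v
from_up = [sum(v_up[i] * e[i][c] for i in range(2)) for c in range(2)]
from_down = [sum(v_down[i] * e_star[i][c] for i in range(2)) for c in range(2)]
print(from_up == v, from_down == v)  # prints: True True
```

Here `v_up` comes out as $(1,2)$ and `v_down` as $(3,11)$: the same vector has different coefficients with respect to the two bases, and everything lives in $V$, with no covectors involved.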
Edit. Regarding your question in the edit. The Riemannian metric defines "musical" isomorphisms between tangent and cotangent spaces $$ \flat: T_pM\to T^*_pM, \sharp: T^*_pM\to T_pM. $$ (One should not call these isomorphisms $g$.) What he (Nielsen) is using is that $\sharp(e^i)=e^{*i}$ and not that $\sharp(e^{*i})=e_{i}$ (the latter would be nonsense indeed). Here, I am using the notation $\{e^i: i=1,...,n\}$ for the basis of $V^*$ which is dual to the basis $\{e_i: i=1,...,n\}$ of $V$. Admittedly, he never introduced the notation $e^i$, but this is what he is implicitly using.