People have pointed out that my hand-waving does not necessarily correspond to a proof. It was an absolute delight to revisit this proof. I have a new proof below, and it does not rely on the hand-wavy "minimality" hypothesis at all.
Let $x_i\in M_i$, and suppose $x_i\in D$. As you have noted, $x_i$ is a finite $A$-linear combination of elements of $C$ of the form $x_a-\mu_{ab}(x_a)$. Since each $\mu_{ab}$ is $A$-linear, absorbing the coefficients from $A$ into the terms $x_a-\mu_{ab}(x_a)$ gives terms of the same form. So, let $i_1,i_2,\dots,i_k,j_1,j_2,\dots,j_k\in I$ and $x_{(1)}\in M_{i_1},\dots,x_{(k)}\in M_{i_k}$, and suppose $i_r\le j_r$ for $r=1,2,\dots,k$ as well as
$$x_i = (x_{(1)}-\mu_{i_1j_1}(x_{(1)}))+\dots+(x_{(k)}-\mu_{i_kj_k}(x_{(k)})).\qquad(1)$$
Since $\{i,j_1,j_2,\dots,j_k\}$ is a finite subset of $I$, which is directed, there exists $j_{\ast}$ such that $i\le j_{\ast}$ and $j_r\le j_{\ast}$ for $r=1,2,\dots,k$. Also, for $r=1,2,\dots,k$, we have $i_r\le j_r$, so $i_r\le j_{\ast}$.
For $a\in I$, let $\pi_{a}:C\to M_{a}$ be the homomorphism given by restricting the canonical projection $\prod_{b\in I}M_b\to M_{a}$ to $C$. On the one hand, we have
$$\pi_a(x_i) = \begin{cases} x_i&\text{if $a=i$} \\
0 &\text{if $a\ne i$}\end{cases}$$
Based on (1), we also find that
$$\pi_a(x_i) = \sum_{i_b=a}x_{(b)} - \sum_{j_c=a}\mu_{i_cj_c}(x_{(c)}).$$
Then,
$$\mu_{aj_{\ast}}(\pi_a(x_i)) = \sum_{i_b=a}\mu_{i_bj_{\ast}}(x_{(b)})-\sum_{j_c=a}\mu_{j_cj_{\ast}}(\mu_{i_cj_c}(x_{(c)}))$$
Summing over $a$ (only finitely many terms are nonzero), we have
$$\sum_{a\in I}\mu_{aj_{\ast}}(\pi_a(x_i)) = \mu_{i_1j_{\ast}}(x_{(1)})-\mu_{j_1j_{\ast}}(\mu_{i_1j_1}(x_{(1)}))+\dots+\mu_{i_kj_{\ast}}(x_{(k)})-\mu_{j_kj_{\ast}}(\mu_{i_kj_k}(x_{(k)})).$$
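Since $\pi_a(x_i)=0$ for $a\ne i$, the left side is $\mu_{ij_{\ast}}(\pi_i(x_i))=\mu_{ij_{\ast}}(x_i)$. The right side is $0$, because $\mu_{j_rj_{\ast}}\circ\mu_{i_rj_r}=\mu_{i_rj_{\ast}}$ for each $r$, so the terms cancel in pairs. Therefore,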
$$\mu_{ij_{\ast}}(x_i) = 0.$$
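As a sanity check, the case $k=1$ of (1) can be worked out by hand. This is only an illustrative special case in the same notation, with each $M_a$ identified with its image in $C$. Suppose $x_i = x_{(1)}-\mu_{i_1j_1}(x_{(1)})$ with $i_1\le j_1$. If $i_1=j_1$, then $\mu_{i_1j_1}$ is the identity and $x_i=0$. Otherwise, compare components on both sides:

$$\begin{aligned} i=i_1:&\quad x_i=x_{(1)},\quad \mu_{ij_1}(x_i)=0,\quad\text{so}\quad \mu_{ij_{\ast}}(x_i)=\mu_{j_1j_{\ast}}(\mu_{ij_1}(x_i))=0;\\ i\ne i_1:&\quad x_{(1)}=\pi_{i_1}(x_i)=0,\quad\text{so}\quad x_i=0. \end{aligned}$$

In every case $\mu_{ij_{\ast}}(x_i)=0$, in agreement with the general argument.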
Note that the tensor product of algebras is in fact the coproduct in the category of (commutative) $A$-algebras. What AM defines there should therefore satisfy the universal property of an infinite coproduct. I am not sure about the universal property with respect to multilinear maps of the underlying modules, though. I believe one can derive something like this, but one has to ask whether it would be worth the effort.
I recall doing this construction in some exercise quite a while ago and would prefer not to reconsider what the elements look like. So I am afraid I won't be of much use regarding your second question. Maybe someone else can help out there. Personally I strive to do as much as possible using universal properties only...
Best Answer
One way is to use the adjunction property of the tensor product. For any $R$-module $P$, we have the following sequence of canonical $R$-module isomorphisms:
$$\begin{aligned}\hom(\varinjlim(M_i \otimes N), P) &\cong \varprojlim \hom(M_i \otimes N, P) \\ &\cong \varprojlim \hom(N, \hom(M_i, P)) \\ &\cong \hom(N, \varprojlim \hom(M_i, P)) \\ &\cong \hom(N,\hom(\varinjlim M_i, P)) \\ &\cong \hom((\varinjlim M_i) \otimes N, P)\end{aligned}$$
and thus $$\hom(\varinjlim(M_i \otimes N), -) \cong \hom((\varinjlim M_i) \otimes N, -).$$
By the Yoneda lemma, $$\varinjlim(M_i \otimes N) \cong (\varinjlim M_i) \otimes N.$$
Of course, I made no use of the properties of the tensor product, other than its left-adjointness. The same argument shows that a functor which is left adjoint commutes with direct limits. Dually, a functor which is right adjoint commutes with inverse limits.
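To make the general statement concrete, here is a sketch of the same chain for an arbitrary adjoint pair. The names $F$, $G$, $\mathcal{C}$, $\mathcal{D}$, $X_i$ and $Y$ are assumptions introduced for illustration: $F:\mathcal{C}\to\mathcal{D}$ is left adjoint to $G:\mathcal{D}\to\mathcal{C}$, $(X_i)$ is a direct system in $\mathcal{C}$, and both direct limits below are assumed to exist. For any object $Y$ of $\mathcal{D}$,

$$\begin{aligned}\hom_{\mathcal{D}}(F(\varinjlim X_i), Y) &\cong \hom_{\mathcal{C}}(\varinjlim X_i, G(Y)) \\ &\cong \varprojlim \hom_{\mathcal{C}}(X_i, G(Y)) \\ &\cong \varprojlim \hom_{\mathcal{D}}(F(X_i), Y) \\ &\cong \hom_{\mathcal{D}}(\varinjlim F(X_i), Y).\end{aligned}$$

All isomorphisms are natural in $Y$, so the Yoneda lemma gives $F(\varinjlim X_i)\cong\varinjlim F(X_i)$. The case above is $F = -\otimes N$ and $G=\hom(N,-)$.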
The proof which A&M suggest is basically just an expanded version of the same argument - see if you can figure it out!
Addendum: I'm not going to write the detailed proof from first principles (it's long) but I'll expand a bit on the hint in A&M. Here is the hint as it is in my edition, where they write $P=\varinjlim(M_i \otimes N)$ and $M=\varinjlim M_i$:
Here $\phi: P \to M\otimes N$ has been defined earlier as the limit of all the canonical homomorphisms $\mu_i\otimes 1 : M_i\otimes N \to M \otimes N$ where $\mu_i$ is the canonical homomorphism $M_i \to M$.
It's a pretty good hint! The only difficulty is the part in bold. We do not know how to take the limit of the $M_i \times N$ because we are not considering those as $R$-modules in the construction of the tensor product. We are considering $M_i \times N$ as a suitable domain for bilinear maps. Obtaining the map $\psi$ will be done in a few stages:
Then you must verify that $\psi$ and $\phi$ are inverses to each other.
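For orientation, here is one way the construction of $\psi$ might be organized. This is only a sketch and not necessarily the stages intended above. The notation $\nu_i : M_i\otimes N\to P$ for the canonical homomorphisms into the direct limit $P$ is an assumption introduced here, and the sketch relies on the standard facts that every element of $M$ has the form $\mu_i(x)$ with $x\in M_i$, and that $\mu_i(x)=0$ implies $\mu_{ik}(x)=0$ for some $k\ge i$. Define

$$g : M\times N\to P,\qquad g(\mu_i(x),n) = \nu_i(x\otimes n).$$

This is well defined: if $\mu_i(x)=\mu_j(y)$, choose $k\ge i,j$ with $\mu_{ik}(x)=\mu_{jk}(y)$; then

$$\nu_i(x\otimes n)=\nu_k(\mu_{ik}(x)\otimes n)=\nu_k(\mu_{jk}(y)\otimes n)=\nu_j(y\otimes n).$$

Bilinearity of $g$ can be checked by moving any two elements of $M$ to a common index. The induced homomorphism $\psi$ then satisfies $\psi(\mu_i(x)\otimes n)=\nu_i(x\otimes n)$, while $\phi(\nu_i(x\otimes n))=\mu_i(x)\otimes n$, so both composites are the identity on generators.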