[TeX/LaTeX] Weird “\bfseries invalid in math mode” problem


I am compiling an appendix file as part of my report. The file is complete, and the odd thing is that the main file normally compiles fine. But if I make a mistake that somehow leads to "\bfseries invalid in math mode", I can never get rid of the error. I have not used \bfseries explicitly anywhere in my code; I have only used \mathbf once (in math mode) and \textbf twice (in text mode). Once the error occurs, I have to delete the part of the appendix I was writing when it first appeared and recompile. Once that works, I paste the deleted section back and everything compiles again.

What could be wrong?

I have posted the section below; it is a bit lengthy. I am not sure how to produce a minimal working example, because it is hard to see where things actually go wrong.

The thing is, this section does compile. But when the error I mentioned appears, I have to delete this particular section, compile, add the section back, and compile again, after which it works.

\textbf{Calculation of} $\mathrm{H} [p(m_{ij} | \mathcal{D})]$ \\
Since $m_{ij}$ is a binary outcome, $\mathrm{H}[...]$ is just a binary entropy function. The difficult part lies in computing $p(m_{ij} | \mathcal{D})$, the predictive distribution for $m_{ij}$. According to \citep[eq.(4.153)]{ref2}, we can approximate $p(m_{ij} | \mathcal{D})$ using the following result:
\begin{align} %first line of equations
p(m_{ij}=1|\mathcal{D}) &= \iint p(m_{ij}=1|U,V,\mathcal{D}) p(U,V|\mathcal{D}) \mathrm{d}U \mathrm{d}V \nonumber\\ %equation 1
&= \iint \sigma(u_i v_j^T) p(U,V|\mathcal{D}) \mathrm{d}U \mathrm{d}V \ldots \text{from } \eqref{eqn:pmuvdef} \nonumber\\
&\approx \iint \sigma(u_i v_j^T) Q(U,V) \mathrm{d}U \mathrm{d}V \ldots \text{definition of } Q(U,V) \nonumber\\
&\approx \int \sigma(t) \mathcal{N} (t | \mu_0, \sigma_0^2) \mathrm{d}t \ldots \text{let } t=u_i v_j^T \nonumber\\
&\approx \sigma(\kappa(\sigma_0^2) \mu_0) \ldots \text{from } \citep[\text{eq.}(4.153)]{ref2} \label{eqn:pm1} \\
\text{where }& %equation 2
\begin{cases}
\kappa(\sigma_0^2) = (1 + \frac{\pi \sigma_0^2}{8})^{-\frac{1}{2}} \ldots \text{from } \citep[\text{eq.}(4.153)]{ref2} \\
\mu_0 = {\bar{u}}_i {\bar{v}}_j^T \ldots \text{from } \eqref{eqn:uvdistribution} \\
\sigma_0^2 = \mathrm{Trace}[({\bar{u}}_i^T {\bar{u}}_i + \Psi_i) ({\bar{v}}_j^T {\bar{v}}_j + \Phi_j)] \ldots \text{from } \eqref{eqn:uvdistribution}
\end{cases} \label{eqn:pm1c}
\end{align}
Now it is easy to calculate the binary entropy function:
\begin{align} %first line of equations
\mathrm{H}[p(m_{ij} | \mathcal{D})] &= - p(m_{ij}=1|\mathcal{D}) \log{p(m_{ij}=1|\mathcal{D})} - (1-p(m_{ij}=1|\mathcal{D})) \log{(1-p(m_{ij}=1|\mathcal{D}))} \label{eqn:entropy1} \\
\text{where }& p(m_{ij} = 1 | \mathcal{D}) \text{ can be computed using } \eqref{eqn:pm1} \text{ and } \eqref{eqn:pm1c} \nonumber \text{.}
\end{align}
\textbf{Calculation of} $\mathbb{E}_{p(U,V|\mathcal{D})} \left[ \mathrm{H}[p(m_{ij}|U,V,\mathcal{D})] \right] $ \\
Although $\mathrm{H}[p(m_{ij}|U,V,\mathcal{D})]$ can be obtained exactly, the fact that it involves logistic functions means it is impossible to integrate it w.r.t. $p(U,V|\mathcal{D})$. A practical approach, mentioned in \citep[p. 2, Supplementary Material]{ref3}, can be used to approximate the likelihood probability into an exponential form.

Similar to \eqref{eqn:entropy1}, we have:
\begin{equation} \label{eqn:entropylike}
\mathrm{H}[p(m_{ij} | U,V,\mathcal{D})] = - \sigma(u_i v_j^T) \log{\sigma(u_i v_j^T)} - (1-\sigma(u_i v_j^T)) \log{(1-\sigma(u_i v_j^T))}
\end{equation}
Hence $\mathrm{H}[p(m_{ij}|U,V,\mathcal{D})]$ is a function of $\sigma(u_i v_j^T)$ only:
\begin{align} %first line of equations
\text{Define }& %equation 1
\begin{cases}
h(\sigma(x_{ij})) &= \mathrm{H}[p(m_{ij}|U,V,\mathcal{D})] \ldots x_{ij}=u_i v_j^T \\ 
f(x_{ij}) &= \log{h(\sigma(x_{ij}))}
\end{cases} \\
\Rightarrow f(x_{ij}) &= f(0) + \frac{f'(0) x_{ij}}{1!} + \frac{f''(0) x_{ij}^2}{2!} + \ldots \leftarrow \text{Taylor expansion} \label{eqn:taylor} %equation 2
\end{align}
And from \eqref{eqn:entropylike}:
\begin{align} %first line of equations
f'(x_{ij}) &= -\frac{\sigma'(x_{ij})}{h(\sigma(x_{ij}))} (\log{\sigma(x_{ij})} - \log{[1-\sigma(x_{ij})]}) \nonumber\\ %equation 1
\begin{split}
f''(x_{ij}) &= -\frac{\sigma'(x_{ij})}{h(\sigma(x_{ij}))} (\frac{\sigma'(x_{ij})}{\sigma(x_{ij})} + \frac{\sigma'{(x_{ij})}}{[1-\sigma(x_{ij})]}) \\
& \quad + \left( \log{\sigma(x_{ij})} - \log{[1-\sigma(x_{ij})]} \right) \frac{-h(\sigma (x_{ij})) \sigma''(x_{ij}) + \sigma'(x_{ij}) h'(\sigma (x_{ij}))}{h^2 (\sigma(x_{ij}))}
\end{split} \nonumber\\ %equation 2
\Rightarrow&
\begin{cases} %equations 3
f(0) = \log{\log{2}} \\
f'(0) = 0 \\
f''(0) = -\frac{1}{4\log{2}}
\end{cases} \label{eqn:f0set}
\end{align}
Using \eqref{eqn:taylor} and \eqref{eqn:f0set}, $f(x_{ij})$ can be approximated up to second order:
\begin{align} %first line of equations
f(x_{ij}) &\approx f(0) + \frac{f'(0) x_{ij}}{1!} + \frac{f''(0) x_{ij}^2}{2!} \nonumber\\ %equation 1
&\approx \log{\log{2}} - \frac{1}{8\log{2}} x_{ij}^2 \nonumber\\
\Rightarrow \log{h(\sigma(x_{ij}))} &\approx \log{\log{2}} - \frac{1}{8\log{2}} x_{ij}^2 \nonumber\\ %equation 2
\Rightarrow h(\sigma(x_{ij})) &\approx \log{2} \exp{[- \frac{x_{ij}^2}{8\log{2}}]} \label{eqn:htay} %equation 3
\end{align}
Equation \eqref{eqn:htay} gives the required exponential form for $\mathrm{H}[p(m_{ij}|U,V,\mathcal{D})]$, so now it is possible to integrate w.r.t. posterior $p(U,V|\mathcal{D})$.
\begin{align} %first line of equations
\text{Make things simple looking: } h(\sigma(u_i v_j^T)) &\approx k \exp{[c (u_i v_j^T)^2]} \\ %equation 1
\text{where }& k = \log{2}, c = -\frac{1}{8\log{2}} \nonumber\\ %equation 2
\Rightarrow \mathbb{E}_{p(U,V|\mathcal{D})} \left[ \mathrm{H}[p(m_{ij}|U,V,\mathcal{D})] \right] &= \iint h(\sigma(u_i v_j^T)) p(U,V|\mathcal{D}) \mathrm{d}U \mathrm{d}V \nonumber\\ %equations 3
&= \int h(\sigma(t)) \mathcal{N}(t| \mu_0, \sigma_0^2) \mathrm{d}t \ldots t=u_i v_j^T \nonumber\\
&= k \int e^{ct^2} \frac{1}{\sqrt{2\pi \sigma_0^2}} e^{-\frac{(t-\mu_0)^2}{2\sigma_0^2}} \mathrm{d}t  \nonumber\\
&= k \int \frac{1}{\sqrt{2\pi \sigma_0^2}} e^{-\frac{(1-2\sigma_0^2 c) t^2 - 2\mu_0 t + \mu_0^2}{2\sigma_0^2}} \mathrm{d}t \label{eqn:entropy2t1}
\end{align}
Completing the square in $t$ in \eqref{eqn:entropy2t1} gives:
\begin{align} %first line of equations
\mathbb{E}_{p(U,V|\mathcal{D})} \left[ \mathrm{H}[p(m_{ij}|U,V,\mathcal{D})] \right] &= \frac{\sigma_N k}{\sigma_0} \int \left( \frac{1}{\sqrt{2\pi \sigma_N^2}} e^{-\frac{(t-\mu_N)^2}{2\sigma_N^2}} \right) e^{\frac{\mu_N^2}{2\sigma_N^2} - \frac{\mu_0^2}{2\sigma_0^2}} \mathrm{d}t \nonumber\\ %equation 1
&= \frac{\sigma_N k}{\sigma_0} \times 1 \times e^{\frac{\mu_N^2}{2\sigma_N^2} - \frac{\mu_0^2}{2\sigma_0^2}} \\ 
\text{where }&
\begin{cases} 
\mu_N = \frac{\mu_0}{1-2\sigma_0^2 c} \\
\sigma_N^2 = \frac{\sigma_0^2}{1-2\sigma_0^2 c}
\end{cases} \nonumber\\ %equation 2
\Rightarrow \mathbb{E}_{p(U,V|\mathcal{D})} \left[ \mathrm{H}[p(m_{ij}|U,V,\mathcal{D})] \right] &= \frac{k}{\sqrt{1-2\sigma_0^2 c}} \exp{\left[\frac{-\mu_0^2 c}{1-2\sigma_0^2 c}\right]} \nonumber\\ %equation 3
&= \frac{\log{2}}{\sqrt{1-2\sigma_0^2 c}} \exp{\left[\frac{-\mu_0^2 c}{1-2\sigma_0^2 c}\right]} \label{eqn:entropy2} \\
\text{and }& c = -\frac{1}{8\log{2}} \nonumber
\end{align}
Equation \eqref{eqn:entropy2} is our approximate solution to $\mathbb{E}_{p(U,V|\mathcal{D})} \left[ \mathrm{H}[p(m_{ij}|U,V,\mathcal{D})] \right] $. 

Now it is time to combine the results we have:
\begin{align}
& \quad \mathrm{H} [p(m_{ij} | \mathcal{D})] - \mathbb{E}_{p(U,V|\mathcal{D})} \left[ \mathrm{H}[p(m_{ij}|U,V,\mathcal{D})] \right] \nonumber\\
&= \underbrace{\mathrm{H}[\sigma(\kappa(\sigma_0^2)\mu_0)]}_{H_2} - \underbrace{\frac{\log{2}}{\sqrt{1-2\sigma_0^2 c}} \exp{\left[\frac{-\mu_0^2 c}{1-2\sigma_0^2 c}\right]}}_{H_1} \label{eqn:entropy3}
\end{align}
The last equation \eqref{eqn:entropy3} is the final equation used for a BALD procedure: choose the new entry $(i,j)$ such that \eqref{eqn:entropy3} is maximised. $\mathrm{H}[...]$, in this case, is the binary entropy function. Formulae for $\kappa(\cdot)$, $\mu_0$, $\sigma_0^2$ and $c$ can be found in \eqref{eqn:pm1c} and \eqref{eqn:entropy2}, respectively.

If only $H_2$ is maximised in equation \eqref{eqn:entropy3}, then the algorithm becomes an entropy maximisation (EM) algorithm.

Best Answer

The error comes from the places where you use \eqref and \citep directly in math mode inside the equations, outside the argument of \text:

\text{from } \eqref{eqn:pmuvdef}

should be

\text{from \eqref{eqn:pmuvdef}}

Similarly

\text{from }\citep[\text{eq.}(4.153)]{ref2}

should be

\text{from \citep[eq. (4.153)]{ref2}}
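The fix above can be tried in isolation with a small test file. What follows is only a sketch under stated assumptions: the label names, the equations, and the placeholder bibliography entry are invented for illustration, and natbib is assumed because the question uses \citep. A plausible reason for the on-and-off behaviour, which the answer itself does not state, is that natbib prints a bold "?" for a citation it cannot resolve (for example when the .aux file is incomplete after a failed run), and that bold switch is what is invalid in math mode. Inside \text the citation is typeset in text mode, so it compiles whether or not the citation is resolved yet.

```latex
\documentclass{article}
\usepackage{amsmath}
\usepackage{natbib} % assumed, since the question uses \citep

\begin{document}

\begin{align}
  a &= b + c \label{eqn:first} \\
  % reference and citation kept inside \text, so both are set in text mode
  d &= e \quad \text{from \eqref{eqn:first}} \nonumber \\
  f &\approx g \quad \text{from \citep[eq.~(4.153)]{ref2}} \label{eqn:second}
\end{align}

% Placeholder entry so the example is self-contained (hypothetical
% content; replace with the real bibliography of the report).
\begin{thebibliography}{1}
\bibitem[Author(Year)]{ref2} Placeholder reference.
\end{thebibliography}

\end{document}
```

If the error persists after changing the source, deleting the .aux file before recompiling removes any stale entries left over from the failed run.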