Sticking with Rudin, try his "Functional Analysis", section 5.7, for a very slick proof (due to Glicksberg).
We can look at the proof to see when equality occurs.
The convexity of $\phi$ gives us
$$\lambda := \sup_{a < s < c} \frac{\phi(c)-\phi(s)}{c-s} \leqslant \rho := \inf_{c < t < b} \frac{\phi(t)-\phi(c)}{t-c}$$
for every $c\in (a,b)$. Letting $c := \int_X f\,d\mu$, it follows that for every $\kappa \in [\lambda,\rho]$ and $t\in (a,b)$ we have
$$\phi(t) \geqslant \phi(c) + \kappa\cdot (t-c)\tag{1}$$
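As a concrete instance of this supporting-line inequality (my own example, not part of the proof), take $\phi(t) = t^2$:

```latex
% Supporting line for \phi(t) = t^2: the difference quotients are
% (c^2 - s^2)/(c - s) = c + s  and  (t^2 - c^2)/(t - c) = c + t,
% so \lambda = \rho = 2c and the only admissible slope is \kappa = 2c.
% Inequality (1) then reads
\phi(t) - \phi(c) - \kappa\,(t - c) = t^2 - c^2 - 2c\,(t - c) = (t - c)^2 \geqslant 0 .
```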
and hence
$$\phi(f(x)) \geqslant \phi(c) + \kappa\cdot \bigl(f(x) - c\bigr)\tag{2}$$
for every $x\in X$. Integrating $(2)$ over $X$ gives Jensen's inequality, since the term $\kappa\cdot\bigl(f(x)-c\bigr)$ integrates to $0$ (recall $\mu(X) = 1$ and $c = \int_X f\,d\mu$); it follows that we have the equality
$$\int_X \phi\circ f\,d\mu = \phi\left(\int_X f\,d\mu\right)$$
if and only if we have equality a.e. in $(2)$.
That we have equality a.e. in $(2)$ means that $\phi$ coincides with an affine function on [the convex hull of] the essential range of $f$, but $\phi$ need not be affine globally on $(a,b)$.
If $f$ is essentially constant, this imposes no restriction on $\phi$: equality holds in Jensen's inequality for every convex $\phi$. If $\phi$ is affine, we have equality for every $f$. If $\phi$ is strictly convex, Jensen's inequality is strict for all $f$ except the essentially constant ones; but if $\phi$ is convex without being strictly convex, equality also holds for some (essentially) non-constant $f$.
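A small numerical illustration of the last point (my own construction, using a discrete probability measure as a stand-in for $(X,\mu)$): $\phi(x) = |x|$ is convex but not strictly convex, and coincides with the identity on $(0,\infty)$; so for a non-constant $f$ with essential range in $(0,\infty)$ equality holds, while the strictly convex $\phi(x) = x^2$ gives a strict inequality for the same $f$.

```python
# Discrete probability space: three atoms with weights summing to 1,
# a stand-in for (X, mu) with mu(X) = 1.
weights = [0.2, 0.5, 0.3]
f_vals = [1.0, 2.0, 4.0]          # non-constant f, essential range in (0, oo)

def integral(values):
    """Integral against mu, i.e. the weighted sum."""
    return sum(w * v for w, v in zip(weights, values))

c = integral(f_vals)               # c = int_X f dmu

# phi(x) = |x| is convex but affine (= identity) on (0, oo):
# equality in Jensen's inequality despite f being non-constant.
lhs_abs = integral([abs(v) for v in f_vals])
print(lhs_abs, abs(c))             # the two numbers agree

# phi(x) = x^2 is strictly convex: strict inequality for non-constant f.
lhs_sq = integral([v * v for v in f_vals])
print(lhs_sq, c * c)               # lhs_sq is strictly larger
```

The weights and values here are arbitrary; any non-constant positive $f$ would exhibit the same behaviour.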
Best Answer
(The OP likely understands it by now, but here are some more details expanding on copper.hat's comment for future readers.)
By Theorem 1.39, if $f \geqslant 0$ and $\int_{\Omega} f \, d\mu = 0$, then $f = 0$ a.e.
If $\int_{\Omega} f \, d\mu = a$, then $$ \int_{\Omega} (f - a) \, d\mu = \int_{\Omega} f \, d\mu - \int_{\Omega} a \, d\mu = \int_{\Omega} f \, d\mu - a \mu(\Omega) = \int_{\Omega} f \, d\mu - a = 0. $$ Since the range of $f$ lies in $(a,b)$, we have $f - a > 0$, so the theorem applies and gives $f - a = 0$ a.e., i.e. $f = a$ a.e., which contradicts the fact that $f > a$ everywhere.
The proof for $b$ is similar.
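A discrete sanity check of the conclusion (my own toy example, not from the text): when $\mu(\Omega) = 1$ and $f$ takes values strictly inside $(a,b)$, the integral of $f - a$ is strictly positive, so $\int_{\Omega} f\,d\mu$ cannot equal the endpoint $a$ (and symmetrically cannot equal $b$).

```python
a, b = 0.0, 1.0
weights = [0.25, 0.25, 0.5]        # mu(Omega) = 1
f_vals = [0.1, 0.4, 0.9]           # range of f strictly inside (a, b)

integral_f = sum(w * v for w, v in zip(weights, f_vals))

# int (f - a) dmu = int f dmu - a * mu(Omega) = integral_f - a,
# and it is strictly positive because f - a > 0 pointwise.
integral_f_minus_a = sum(w * (v - a) for w, v in zip(weights, f_vals))
assert integral_f_minus_a > 0

print(integral_f)                  # lies strictly between a and b
assert a < integral_f < b
```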