Solved – When do Markov random fields $\neq$ exponential families

graphical-modelmathematical-statistics

In their textbook, Graphical Models, Exponential Families and Variational Inference, M. Jordan and M. Wainwright discuss the connection between Exponential families and Markov Random Fields (undirected graphical models).

I am trying to understand better the relationship between them with the following questions:

  • Are all MRFs members of the Exponential families?
  • Can all members from the Exponential families be represented as an MRF?
  • If MRFs $\neq$ Exponential families, what are some good examples of distributions of one type not ncluded in the other ?

From what I understand in their textbook (Chapter 3), Jordan and Wainwright present the next argument:


  1. Say we have a a scalar random variable X that follows some distribution $p$, and draw $n$ i.i.d. observations $X^1, \ldots X^n$, and we want to identify $p$.

  2. We compute the empirical expectations of certain functions $\phi_\alpha%$

    $\hat{\mu}_\alpha= \frac{1}{n}\sum^n_{i=1}\phi_\alpha(X^i), $ for all $\alpha \in
    \mathcal{I}$

    where each $\alpha$ in some set $\mathcal{I}$ indexes a function $\phi_\alpha: \mathcal{X} \rightarrow R$

  3. Then if we force the following two sets of quantities to be consistent, i.e. to match (to identify $p$):

    • The expectations $E_p[(\phi_\alpha(X)]=\int_\mathcal{X}\phi_\alpha(x)p(x)\nu(dx)$ of the sufficient statistics $\phi$ of the distribution $p$

    • The expectations under the empirical distribution

we get an underdetermined problem, in the sense that there are many distributions $p$ that are consistent with the observations. So we need a principle for choosing among them (to identify $p$).

If we use the principle of maximum entropy to remove this undeterminancy, we can get a single $p$:

$\DeclareMathOperator*{\argmax}{arg\,max}
p^* = \argmax_{p\in{\mathcal{P}}} \,H(p)$ subject to $E_p[(\phi_\alpha(X)] = \hat{\mu}_\alpha$ for all $\alpha \in \mathcal{I}$

where this $p^*$ takes the form $p_\theta(x) \propto $ exp${\sum_{\alpha \in \mathcal{I}}\theta_\alpha \phi_\alpha(x)},$ where $\theta \in R^d$ represents a parameterization of the distribution in exponential family form.

In other words, if we

  1. Make the expectations of the distributions be consistent with the expectations under the empirical distribution
  2. Use the principle of maximum entropy to get rid of undetermination

$\rightarrow$ We end up with a a distribution of the exponential family.


However, this looks more like an argument to introduce exponential families, and (as far as I can understand) it does not describe the relationship between MRFs and exp. families. Am I missing anything?

Best Answer

You are entirely correct -- the argument you presented relates the exponential family to the principle of maximum entropy, but doesn't have anything to do with MRFs.

To address your three initial questions:

Can all members from the Exponential families be represented as an MRF?

Yes. In fact, any density or mass function can be represented as an MRF! According to Wikipedia [1], an MRF is defined as a set of random variables that are Markov with respect to an undirected graph. Equivalently, the joint distribution of the variables can be written with the following factorization: $$P(X=x) = \prod_{C \in cl(G)} \phi_C(X_C = x_C)$$ where $cl(G)$ is the set of maximal cliques in $G$. From this definition you can see that a fully connected graph, while completely uninformative, is consistent with any distribution.

Are all MRFs members of the Exponential families?

No. Since all distributions can be represented as MRFs (and not all distributions belong to the exponential family) there must be some "MRF members" that are not exponential family members. Still, this is a perfectly natural question -- it seems like the vast majority of MRFs people use in practice $are$ exponential family distributions. All finite-domain discrete MRFs and Gaussian MRFs are members of the exponential family. In fact, since products of exponential family distributions are also in the exponential family, the joint distribution of any MRF in which every potential function has the form of an (unnormalized) exponential family member will itself be in the exponential family.

If MRFs $\neq$ Exponential families, what are some good examples of distributions of one type not included in the other?

Mixture distributions are common examples of non-exponential family distributions. Consider the linear Gaussian state space model (like a hidden Markov model, but with continuous hidden states and Gaussian transition and emission distributions). If you replace the transition kernel with a mixture of Gaussians, the resulting distribution is no longer in the exponential family (but it still retains the rich conditional independence structure characteristic of practical graphical models).

[1] http://en.wikipedia.org/wiki/Markov_random_field

Related Question