Edit (20/11/2013) : Yesterday James Maynard posted the paper Small gaps between primes on the arxiv in which he shows that for any $m$ there exists a constant $C_m$ such that $$ p_{n+m}-p_n\leq C_m$$ infinitely often. More about this result can be found on Terence Tao's blog, or in this expository article by Andrew Granville.
In Goldston, Pintz, and Yildirim paper Primes in tuples I, they show that under the assumption of the Elliott Halberstam Conjecture,
$$\liminf_{n\rightarrow\infty}p_{n+1}-p_n \leq 16$$
and they leave the following question on page 3:
Question 3. Assuming the Elliott-Halberstam conjecture, can it be proved that
there are three or more primes in admissible k-tuples with large enough k? Even
under the strongest assumptions, our method fails to prove anything about more
than two primes in a given tuple.
From what I understand, the issue is increasing a coefficient from $1$ to $2$.
Let $\mathcal{H}=\left\{ 1,\dots,h_{k}\right\}$ be our admissible set, and suppose that $\max_{i}h_{i}\leq x.$ The approach is to look at the sum
$$\sum_{x<n\leq 2x}\left(\sum_{i=1}^{k}\vartheta\left(n+h_{i}\right)-\log(3x)\right)W(n),$$
where $\vartheta(n)=1_{\mathcal{P}}(n)\log n$, $1_{\mathcal{P}}(n)$ is the indicator function for the primes, and $W(n)$ is a positive weight function. If this sum is positive, then one of the terms must be positive, so for some $x<n\leq2x$ we have
$$\sum_{i=1}^{k}\vartheta\left(n+h_{i}\right)>\log(3x),$$
and since $\log(n+h_{i})\leq\log(3x)$ for all $n$ in our range, it follows that there are at least two indices $i\neq j$ such that
$$\vartheta(n+h_{i}),\ \vartheta(n+h_{j})\neq0.$$
Selberg advocated that in general for ease of calculation one should take a positive weight function to be a square, $W(n)=\lambda(n)^{2},$ so the goal is to prove the inequality
$$\sum_{x<n\leq2x}\sum_{i=1}^{k}\vartheta\left(n+h_{i}\right)\lambda(n)^{2}>\log(3x)\sum_{x<n\leq2x}\lambda(n)^{2}$$
for some choice of $\lambda(n).$ In Goldston, Pintz, and Yildirim's paper, they choose
$$\lambda(n)=\frac{1}{\left(k+l\right)!}\sum_{\begin{array}{c}
d|P(n)\\
d\leq R
\end{array}}\mu(d)\log\left(\frac{R}{d}\right)^{k+l}$$
where $P(n)=\prod_{j=1}^{k}\left(n+h_{j}\right)$, and $R$ depends on $x$. To use the same approach for $3$ terms, we would need to examine the sum
$$\sum_{x<n\leq2x}\left(\sum_{i=1}^{k}\vartheta\left(n+h_{i}\right)-2\log(3x)\right)\lambda(n)^{2},$$
and show that
$$\sum_{x<n\leq2x}\sum_{i=1}^{k}\vartheta\left(n+h_{i}\right)\lambda(n)^{2}>2\log(3x)\sum_{x<n\leq2x}\lambda(n)^{2},$$
for a suitable choice of $\lambda(n)$. Increasing the coefficient to a $2$ seems to be a fundamental issue, and hopefully an expert can explain why this is the case.
In the main theorem of this recent paper of Pintz, Zhang's method is used to show that (for $k$ large enough), there are $\gg_{\mathcal H} \frac{x}{\log^k x}$ values of $n \le x$ such that two of the $n+h_i$ are prime and the remaining $n+h_j$ are almost prime (have $O_k(1)$ prime factors, all of which are $\gg x^{c_k}$ for some $c_k>0$ independent of $x$). This is the correct asymptotic (up to multiplicative constants) if one enforces the almost primality conditions, but has the wrong power of log in the denominator if those conditions are removed.
In a slightly different direction, in this paper of Goldston, Pintz, and Yildirim (a followup to their most famous paper), it is shown that for any fixed $\varepsilon>0$, there are $\gg_\varepsilon \frac{x}{\log x}$ primes $p_n$ less than $x$ such that $p_{n+1}-p_n \leq \varepsilon \log p_n$. Again, this is the correct asymptotic up to multiplicative constants.
I have a graduate student who has recently started looking at improvements to these results, with some encouraging preliminary findings, but the work is not yet completed. It is indeed unlikely that one will be able to obtain the optimal asymptotic here unconditionally, but some improvement upon the partial results given above look plausible.
Best Answer
The major difference is the choice of the Selberg Sieve weights. I strongly recommend reading Maynard's paper. It is well written, and the core ideas are nicely explained.
In what follows, I'll give a brief explanation of what Selberg Sieve weights are, why they appear, and what was chosen differently in Maynard's paper compared to that of Goldston Pintz and Yildirim.
Let $\chi_{\mathcal{P}}(n)$ denote the indicator function for the primes. Given an admissible $k$-tuple $\mathcal{H}=\{h_1<h_2<\cdots<h_k\}$ the goal is to choose a non-negative function $a(n)$ so that
$$\sum_{x<n\leq 2x} \left(\sum_{i=1}^k \chi_{\mathcal{P}}(n)(n+h_i)\right)a(n)\geq \rho \sum_{x<n\leq 2x} a(n)\ \ \ \ \ \ \ \ \ \ (1)$$
for some constant $\rho>1$. If this is the case, then by the non-negativity of $a(n)$, we have that for some $n$ between $x$ and $2x$ $$\left(\sum_{i=1}^k\chi_{\mathcal{P}}(n)(n+h_i)\right)\geq \lceil\rho \rceil,$$ and so there are at least $\lceil\rho \rceil$ primes in an interval of length at most $h_k-h_1$. This would imply Zhangs theorem on bounded gaps between primes, and if we can take $\rho$ arbitrarily large, it would yield the Maynard-Tao theorem. However, choosing a function $a(n)$ for which equation $(1)$ this holds is very difficult. Of course we would like to take $$a(n)=\begin{cases} 1 & \text{when each of }n+h_{i}\text{ are prime}\\ 0 & \text{otherwise} \end{cases} ,$$ as this maximizes the ratio of the two sides, but then we cannot evaluate $\sum_{x<n\leq 2x} a(n)$ as that is equivalent to understanding the original problem. Goldston Pintz and Yildirim use what is known as Selberg sieve weights. They chose $a(n)$ so that it is $1$ when each of $n+h_i$ are prime, and non-negative elsewhere, in the following way:
Let $\lambda_1=1$ and $\lambda_d=0$ when $d$ is large, say $d>R$ where $R<x$ will be chosen to depend on $x$. Then set $$a(n)=\left(\sum_{d|(n+h_1)\cdots(n+h_k)} \lambda_d\right)^2.$$ By choosing $a(n)$ to be the square of the sum, it is automatically non-negative. Next, note that if each of $n+h_i$ is prime, then since they are all greater than $x$, which is greater than $R$, the sum over $\lambda_d$ will consist only of $\lambda_1$. This means that $a(n)=1$ when each of the $n+h_i$ are prime, and hence $a(n)$ satisfies both of the desired properties. Of course, we do not want $a(n)$ to be large when none, or even very few of the $n+h_i$ are prime, so we are left with the difficult optimization problem of choosing $\lambda_d$. The optimization in the classical application of the Selberg sieve to bound from above the number of prime-tuples involves diagonalizing a quadratic form, leading to
$$\lambda_d = \mu(d) \left(\frac{\log(R/d)}{\log R}\right)^k,$$
and so $$a(n)=\left(\sum_{\begin{array}{c} d|(n+h_{1})\cdots(n+h_{k})\\ d<R \end{array}}\mu(d)\left(\frac{\log(R/d)}{\log R}\right)^{k}\right)^{2}.$$
This choice is not so surprising since the related function $\sum_{d|n}\mu(d) \left(\log(n/d)\right)^k$ is supported on the set of integers with $k$ prime factors or less. Goldston Pintz and Yildirim significantly modified this, and used a higher dimensional sieve to obtain their results.
The critical difference in the approach of Maynard and Tao is choosing weights of the form
$$a(n)=\left(\sum_{d_{i}|(n+h_{i})\ \forall i}\lambda_{d_{1}\cdots d_{k}}\right)^{2}.$$ This gives extra flexibility since each $\lambda_{d_1,\dots,d_k}$ is allowed depend on the divisors individually. Using this additional flexibility Maynard and Tao independently established equation $(1)$ for any $\rho>1$, with $k$ depending on $\rho$.
Here are four great resources where you can read more: