One crude answer is that passing to derived functors fixes one obstruction to being an equivalence. Any equivalence of abelian categories certainly is exact (i.e. it preserves short exact sequences), though lots of exact functors are not equivalences (for example, think about representations of a group and forgetting the G-action).
What passing to the derived functor does is fix this problem in a canonical way; you have to replace short exact sequences with exact triangles, but you get a functor which is your original "up to zeroth order," exact, and uniquely distinguished by these properties.
So, what BMR do is take a functor which is not even exact (and thus obviously not an equivalence), and show that the lack of exactness is "the only problem" for this being an equivalence.
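To spell out the shape of this in the simplest case (a sketch only, assuming $F\colon \mathcal{A}\to\mathcal{B}$ is a left exact functor and $\mathcal{A}$ has enough injectives; the symbols $F$, $X$, $Y$, $Z$ are introduced here for illustration and are not BMR's notation): a short exact sequence $0\to X\to Y\to Z\to 0$ is sent to an exact triangle, and taking cohomology recovers $F$ in degree zero together with the usual long exact sequence.

$$RF(X)\to RF(Y)\to RF(Z)\to RF(X)[1], \qquad H^0(RF(X)) = F(X),$$
$$0\to F(X)\to F(Y)\to F(Z)\to R^1F(X)\to R^1F(Y)\to R^1F(Z)\to R^2F(X)\to\cdots$$

The higher terms $R^iF$ are exactly what measures the failure of $F$ to be exact.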
EDIT: Let me just add, from a more philosophical perspective, that derived equivalences are just a lot more common. There are just more of them out in the world. Given an algebra A, Morita equivalences to A are classified essentially by projective generating A-modules, whereas derived Morita equivalences of dg-algebras are in bijection with all objects in the derived category of $A\text{-mod}$ which generate (in the sense that nothing has trivial Ext with them): you look at the dg-Ext algebra of the object with itself. If you have an interesting algebra (say, a finite dimensional one of wild representation type), there are a lot more of the latter than the former in a very precise sense. Of course, the vast majority of these are completely uncomputable and tell you nothing, but there are enough of them in the mix to make things interesting.
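In symbols, the comparison reads roughly as follows. This is a sketch: the names $P$, $T$, $B$ are introduced here for illustration, conventions about opposite algebras are suppressed, and $T$ is assumed compact, which the paragraph above leaves implicit.

$$P \text{ a finitely generated projective generator},\quad B=\operatorname{End}_A(P):\qquad \operatorname{Hom}_A(P,-)\colon A\text{-mod}\xrightarrow{\ \sim\ } B\text{-mod},$$
$$T \text{ a compact generator of } D(A),\quad B=\operatorname{RHom}_A(T,T):\qquad \operatorname{RHom}_A(T,-)\colon D(A)\xrightarrow{\ \sim\ } D(B).$$

In the second line $B$ is a dg-algebra in general, even when $A$ is an ordinary algebra, which is why the natural setting is dg-algebras.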
[Edited to reflect Reimundo's comment]
The question addresses categorified versions of the Borel-Weil-Bott theorem (and more generally Beilinson-Bernstein localization), which asserts an equivalence between G-equivariant vector bundles on the flag variety (aka vector bundles on pt/B), modulo an action of the Weyl group (aka double cosets B\G/B) by intertwiners, and algebraic representations of G. There are two pieces of content here: first, that all representations can be realized on G/B, i.e. representations have highest weights, and second, that irreducibles correspond to line bundles, i.e. their highest weight spaces are one-dimensional. The first has an analog for any representation of the Lie algebra: Beilinson-Bernstein's localization can be rephrased as simply asserting that descent holds from twisted D-modules on the flag variety to representations of the Lie algebra.
I don't know anything about the analog of the second assertion for categorified representations - ie to what extent "indecomposable" representations of some kind are induced from "one-dimensional ones" (ie from gerbes on homogeneous spaces) - except to point out a very nice paper by Ostrik (section 3.4 here) in which analogous results are proved for the case of a finite group.
As for the "descent" (first) part of BWB, it becomes completely trivial once categorified, if we consider so-called algebraic (or quasicoherent) actions of G on categories (equivalently module categories for quasicoherent sheaves on G). In fact the same assertion holds for ANY algebraic subgroup of G, not just a Borel, in sharp distinction to the classical setting: algebraic G-actions on categories are generated by their H-invariants for any H in G! More precisely we have the following theorem:
Passing to H-invariants provides an equivalence of $(\infty,2)$-categories
between (dg) categories with a G action and categories with an action
of the "Hecke category" QC(H\G/H) of double cosets.
This is a theorem of mine with John Francis and David Nadler in a preprint that's about to appear (copies available). It's a version of a well-known result of Mueger and Ostrik in the finite group case, and is an easy application of Lurie's Barr-Beck theorem. In fact if we use a result of Lurie in DAG XI, that there is no distinction between quasicoherent sheaves of categories on stacks X with affine diagonal and simply module categories over QC(X), we can rephrase the result as follows:
G-equivariant quasicoherent sheaves of (dg-)categories on the flag variety equipped with a "categorified Weyl group action" (module structure for QC(B\G/B) ) are equivalent (as an $(\infty,2)$-category) to (dg)-categories with algebraic G-action.
On the other hand things get much more interesting if we consider "smooth" or "infinitesimally trivialized" G-actions (module categories over D-modules on G) (also discussed in the references you provide). The fundamental example of such a category is indeed Ug-mod, the category of all representations of the Lie algebra, or equivalently (up to some W symmetry) the category D_H(G/N) of all twisted D-modules on the flag variety.
In this case my paper with Nadler "Character theory of a complex group" (and other work in progress) precisely studies the full sub(2)category of smooth G actions which ARE generated by their highest weight spaces, i.e. which do come via a Borel-Weil-Bott type construction (or equivalently, the full subcategory generated by the main example Ug-mod). Not all smooth G-categories are of this form, though one might hope that this is the case in some weaker sense. It does seem that all G-categories of interest in representation theory fall under this heading. But in any case I don't know a BWB type statement in this setting.
Well, I don't call its derived version "Koszul duality." Koszul duality is a relationship between pairs of graded algebras (you could upgrade this to graded categories, but why bother?). Given a positively graded algebra $A$, consider the zero-degree part $A_0$ as a module. Then $B=\operatorname{Ext}^*_A(A_0,A_0)$ is a new bigraded algebra: it has a homological grade, and one induced by taking a graded free resolution of $A_0$.
If these coincide, we say that $A$ is Koszul and $B$ is its Koszul dual. The functor $\operatorname{Ext}^*_A(A_0,-)$ induces an equivalence between the derived categories of graded modules (with strange behavior on the grading!) over $A$ and $B$ (there's a version for ungraded modules of $A$, but those are sent to dg-modules of $B$ and vice versa).
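The standard example, as a sketch (here $V$ is a finite-dimensional vector space over a field $k$; this notation is introduced for illustration): take $A=\operatorname{Sym}(V)$ with $V$ in degree 1, so that $A_0=k$. The Koszul complex is a graded free resolution of $k$ whose $i$-th term is generated in internal degree $i$, so the two gradings coincide:

$$\cdots\to A\otimes\Lambda^2 V\to A\otimes V\to A\to k\to 0,\qquad \operatorname{Ext}^i_A(k,k)\cong \Lambda^i(V^*).$$

So $\operatorname{Sym}(V)$ is Koszul, with Koszul dual the exterior algebra $\Lambda(V^*)$.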
One of the strange facts of geometric representation theory is that categories which a priori seem to have no good reason to be Koszul dual actually are. For example, the category of (g,N)-admissible modules with fixed generic central character for one Lie algebra is Koszul dual to the corresponding category for its Langlands dual. Soergel's conjecture is a version of this statement for (g,K)-admissible modules for various symmetric subgroups K (which is much harder than the version for N).
What this has to do with Beilinson-Bernstein is that all these theorems are most naturally studied from a geometric perspective. For example, Koszulity tends to be related to purity of intersection cohomology.