The rule of thumb is this: Your DM (or Artin) stack will be a sheaf in the fppf/fpqc topology if the condition imposed on its diagonal is fppf/fpqc local on the target ("satisfies descent").
In other words, in condition 2 you asked that the diagonal be a relative scheme/relative algebraic space, perhaps with some extra properties. If there is fppf descent for morphisms of this type (e.g., "relative algebraic space", "relative monomorphism of schemes"), you'll have something satisfying fppf descent. If there is fpqc descent for morphisms of this type (e.g., "relative quasi-affine scheme"), then you'll have something satisfying fpqc descent.
See for instance LMB (= Laumon, Moret-Bailly, Champs algébriques), Corollary 10.7. Alternatively: earlier this year I wrote up some notes (PDF link) that included an Appendix collecting in one place the equivalences of some standard definitions of stacks, including statements of the type above.
I will try to answer this question in a way relevant to more than one field. To be honest, I'm rather unconventional, in the sense that my experience in this area stems from topological and differentiable stacks rather than algebraic ones. From a formal viewpoint, however, everything is the same.
So let us work in a "background Grothendieck site", which can be topological spaces, differentiable manifolds, or schemes over a fixed base (the first two with the "open cover topology"). Let's call an object in this category a "space".
If G is a group object, and X is a space with an action of G, we can take the coarse quotient. However, this is generally not a "nice space", in the sense that the quotient loses a lot of information about the action. In the context of topological spaces, a "nice quotient" would be one that makes the map $X \to X/G$ into a principal G-bundle. However, you need some really nice conditions on the action for this to work in general. E.g., the action needs to be free.
Note that if we take the pullback of the projection $X \to X/G$ along itself (that is, with the projection "coming from the left and the top" of the square), the action is free if and only if the pullback is $G \times X$.
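To make the freeness criterion concrete, here is a minimal sketch for a finite group acting on a finite set, so that "the pullback is $G \times X$" just means that the map $(g,x) \mapsto (x, gx)$ is a bijection onto the set of pairs of points lying in the same orbit. The function names and the two toy actions of Z/2 are my own illustrative choices, and the sketch ignores all topology (in spaces one would ask for a homeomorphism, not merely a bijection).

```python
from itertools import product


def is_free(G, X, act, identity):
    """Free means: g.x == x only when g is the identity."""
    return all(g == identity or act(g, x) != x for g, x in product(G, X))


def shear_is_bijection(G, X, act):
    """Compare G x X with the pullback of X -> X/G along itself."""

    def orbit(x):
        return frozenset(act(g, x) for g in G)

    # The pullback X x_{X/G} X: pairs of points lying in the same orbit.
    pullback = {(x, y) for x, y in product(X, X) if orbit(x) == orbit(y)}
    # The comparison map (g, x) -> (x, g.x).
    image = [(x, act(g, x)) for g, x in product(G, X)]
    injective = len(set(image)) == len(image)
    surjective = set(image) == pullback
    return injective and surjective


# Hypothetical toy data: Z/2 = {0, 1} with identity 0.
G = [0, 1]


def act_free(g, x):
    """Z/2 acts on {0, 1, 2, 3} by swapping 0<->1 and 2<->3."""
    return x ^ g


def act_fixed(g, x):
    """Z/2 acts on {0, 1, 2} by swapping 0<->1 and fixing 2."""
    return x if g == 0 or x == 2 else 1 - x


X_free = [0, 1, 2, 3]
X_fixed = [0, 1, 2]
```

In the second action the point 2 is fixed, so two different elements of $G \times X$ hit the pair (2, 2) and the comparison map fails to be injective.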
Now, from G acting on X, we can construct the so-called "action groupoid", which has objects X, and arrows $G \times X$, where $(g,x):x \to gx$. This is a groupoid object in spaces; denote it by Act_G(X). Given a space T, we can pretend it's a groupoid object, with only identity arrows. We can consider Hom(T,Act_G(X)), where the Hom is taken in the 2-category of groupoid objects; hence, this Hom gives a groupoid, not just a set (the 2-cells are internal natural transformations). The assignment $T \mapsto Hom(T,Act_G(X))$ defines a presheaf of groupoids on spaces. Moreover, there is a canonical morphism $X \to Hom(Blank,Act_G(X))$ of presheaves of groupoids (where X is identified with its representable presheaf). If we form a weak 2-pullback by having this morphism "coming from the left and the top", the pullback becomes $G \times X$; one projection becomes the "source map" and the other the "target map". If we say that $Hom(Blank,Act_G(X))$ is our new quotient, then "the action becomes weakly free".
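As a rough illustration of the action groupoid, here is a sketch in the simplest possible setting: a finite group acting on a finite set, with no topology at all. The class `ActionGroupoid` and its method names are hypothetical, chosen only for this example. It shows that the isomorphism classes of objects recover the coarse quotient (the orbits), while the automorphisms of an object recover the stabilizer, which is exactly the information the coarse quotient forgets.

```python
from itertools import product


class ActionGroupoid:
    """Objects are points of X; an arrow (g, x) goes from x to g.x."""

    def __init__(self, G, X, mul, act):
        self.G, self.X = list(G), list(X)
        self.mul, self.act = mul, act
        self.arrows = list(product(self.G, self.X))

    def source(self, arrow):
        g, x = arrow
        return x

    def target(self, arrow):
        g, x = arrow
        return self.act(g, x)

    def compose(self, second, first):
        """The composite of (g, x) followed by (h, g.x) is (hg, x)."""
        (h, y), (g, x) = second, first
        if y != self.act(g, x):
            raise ValueError("arrows are not composable")
        return (self.mul(h, g), x)

    def isotropy(self, x):
        """Group elements giving arrows x -> x, i.e. the stabilizer of x."""
        return [g for g in self.G if self.act(g, x) == x]

    def components(self):
        """Isomorphism classes of objects: the orbits (coarse quotient)."""
        return {frozenset(self.act(g, x) for g in self.G) for x in self.X}


# Hypothetical toy data: Z/2 acting on {0, 1, 2}, swapping 0<->1, fixing 2.
def mul(a, b):
    return (a + b) % 2


def act(g, x):
    return x if g == 0 or x == 2 else 1 - x


groupoid = ActionGroupoid([0, 1], [0, 1, 2], mul, act)
```

Here `groupoid.components()` has two elements, the orbit of 0 and 1 and the orbit of 2, but `groupoid.isotropy(2)` is all of Z/2 while `groupoid.isotropy(0)` is trivial, so the groupoid distinguishes the fixed point from a generic point even though the coarse quotient does not.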
So far, everything I did was using groupoids. So where do stacks enter the game? Well, $Hom(Blank,Act_G(X))$ is not a very good quotient because if Y is another space, maps from Y to it don't see $Hom(Blank,Act_G(X))$ as "being like a space". E.g., if we are in topological spaces, we can't define maps from Y into it by defining them on the opens of Y in a way that agrees. (For more explanation see my answer to Stacks in the Zariski topology?). What we have to do is "stackify" the presheaf of groupoids $Hom(Blank,Act_G(X))$ (call its stackification X//G). This makes X//G behave like a space in the sense that, e.g. in topological spaces, we can define maps into it by mapping out of opens in a way that agrees. Since stackification preserves finite weak 2-limits, if we form the same pullback diagram but instead with respect to $X \to X//G$, we still recover the action groupoid and the action is still "weakly free". Moreover, the projection $X \to X//G$ becomes a G-torsor (principal G-bundle).
So just using the groupoid allowed us to keep track of the isotropy data, but not in a way that gives us something like a space. For that we need stacks.
If instead of using the groupoid $Act_G(X)$, we used any groupoid object, we can still stackify its associated presheaf of groupoids. The stacks we get in this way are "geometrical", and give rise to topological, differentiable, and Artin stacks respectively.
A final remark. In the comments, it was said that in some sense groupoids are "atlases for stacks". To see this, let's go to manifolds. Given a manifold M described in terms of an atlas, we can construct a Lie groupoid whose objects are the disjoint union of the elements of the atlas, and where we have an arrow from (x,U_a) to (x,U_b) whenever x is in the intersection of these two. This Lie groupoid's associated stack is the same as the manifold M. More generally, given an orbifold described in terms of charts, we can also construct a Lie groupoid with respect to these charts, and its associated stack "represents the orbifold". In general, you can think of Lie groupoids as "generalized atlases" which describe the geometric object which is their associated stack. Of course, just as a manifold can be described by more than one atlas, a differentiable stack can be described by more than one Lie groupoid.
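The atlas construction above can be sketched in a toy setting where finite sets stand in for manifolds and an "atlas" is just a cover by named subsets. Everything here (the function names, the cover `atlas`) is hypothetical and ignores smooth structure entirely. The point is only the combinatorics: the groupoid has one isomorphism class of objects per point of M and only identity automorphisms, so it carries the same information as M.

```python
from itertools import product


def atlas_groupoid(atlas):
    """atlas: dict mapping a chart name to the set of points it covers.

    Objects are pairs (x, a) with x in chart a (the disjoint union of charts).
    There is exactly one arrow (x, a) -> (x, b) whenever x lies in both charts.
    """
    objects = [(x, a) for a, chart in atlas.items() for x in chart]
    arrows = [
        (src, tgt) for src, tgt in product(objects, objects) if src[0] == tgt[0]
    ]
    return objects, arrows


def components(arrows):
    """Isomorphism classes of objects, read off from the arrows."""
    reachable = {}
    for src, tgt in arrows:
        reachable.setdefault(src, set()).add(tgt)
    return {frozenset(cls) for cls in reachable.values()}


def automorphisms(obj, arrows):
    """Arrows from an object to itself."""
    return [(src, tgt) for src, tgt in arrows if src == obj and tgt == obj]


# Hypothetical cover of M = {0, 1, 2, 3, 4} by two overlapping "charts".
atlas = {"U": {0, 1, 2}, "V": {2, 3, 4}}
objects, arrows = atlas_groupoid(atlas)
points = {next(iter(cls))[0] for cls in components(arrows)}
```

The only point with more than one object over it is 2, which appears once in each chart, and the arrows between (2, "U") and (2, "V") are what glue the two copies back together.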
Best Answer
In order to do geometry, you need to have some kind of global structure which has good local models (the "neighborhoods") and good gluing conditions. In algebraic geometry, the good local models are rings. If you want to do geometry with a fibered category, you need gluing conditions (that is, you need your fibered category to be a stack), and you need local models, that is, you need your category to be locally, in some pre-topology, an affine scheme (this is not quite right, but I hope it gives a rough idea). The pre-topology must be such that if $X \to Y$ is a covering, the fact that $Y$ has certain "interesting" local properties implies that $X$ also has them. Étale coverings work very well, of course; smooth coverings also work, not quite as well.
So, you can't do geometry with the stack of coherent sheaves because this does not have good neighborhoods. See also my answer to Qcoh(-) algebraic stack? to see what can go wrong.
As to why algebraic stacks are always assumed to be stacks in groupoids, there are several things I could say, but the honest answer is that I don't know the deep reason for this. I know that in practice it suffices, so there is no reason to give up the inversion map, which is quite useful. Just think of how much more you can say about group actions than about actions of monoids.
Of course, this does not mean that in the future people will not feel the need to extend the theory of algebraic (or topological, or differentiable) stacks to the more general case.
[Edit]: So, why is a geometric stack a stack in groupoids? Well, the first reason is that the inversion map is very useful in proving results. Of course, if we needed to do without it, we would.
The second, more serious, reason is that, in concrete examples, stacks with non-cartesian maps tend not to admit non-trivial maps to spaces. For example, consider the stack $\mathcal M_{1,1}$ of elliptic curves. If we admitted all squares as morphisms, instead of only the cartesian ones, any map from $\mathcal M_{1,1}$ to a space would have to collapse each isogeny class of curves to a point, and then one can see that it would map everything to a point. So, no moduli space.
As another example, take the stack of vector bundles on a projective variety $X$. There is a map between any two vector bundles, so no open substack could possibly admit a non-trivial map to a space.
Of course, if $F$ is a stack over a site $C$, there is a substack $F^*$ with the same objects, whose arrows are the cartesian arrows in $F$; and if $X$ is an object of $C$, or a sheaf on $C$, any cartesian functor $X \to F$ would factor through $F^*$; so you could argue that a chart for $F$ would in fact come from a chart for $F^*$. In all the examples I know, $F^*$ is the right object to consider.
But, once again, none of these reasons is really compelling; for example, if monoid actions became important in geometry, I would bet that soon people would start working with geometric stacks that are not stacks in groupoids.