I have a couple things to say.
First, I believe your definition of gerbe is slightly incorrect. When you say that your stack is locally isomorphic to $U \times B\mathbb{G}_m$, this isomorphism needs to preserve some additional structure. It might be okay for $\mathbb{G}_m$-gerbes, by accident, but for general non-abelian gerbes you will run into trouble. (It might still be okay for $\mu$-gerbes, where $\mu$ is a sheaf of abelian groups over $X$.)
There are several ways to add this extra structure, but I think the most common are not necessarily the most enlightening. The fact of the matter is that $B\mathbb{G}_m$ is a group object in stacks and it "acts" on the gerbe over $X$. The local isomorphism to $U \times B\mathbb{G}_m$ needs to respect this action. Morally, you should think of a gerbe as a principal bundle with structure "group" $B\mathbb{G}_m$.
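To make the action concrete, here is a rough sketch. The names $m$, $a$ and $\mathscr{G}$ are notation introduced only for this illustration. For a test scheme $T$, the groupoid $B\mathbb{G}_m(T)$ consists of line bundles on $T$ and their isomorphisms, and the group structure is tensor product:

$$m : B\mathbb{G}_m \times B\mathbb{G}_m \to B\mathbb{G}_m, \qquad (L, L') \mapsto L \otimes L'.$$

An action on a gerbe $\mathscr{G} \to X$ is then a map over $X$

$$a : B\mathbb{G}_m \times \mathscr{G} \to \mathscr{G},$$

together with associativity and unit 2-isomorphisms subject to coherence conditions. A local trivialization should be an equivalence $\mathscr{G}|_U \simeq U \times B\mathbb{G}_m$ that intertwines $a$ with $m$ acting on the second factor, up to a specified coherent 2-isomorphism.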
The reason that this isn't the most common way to explain what a gerbe is, is that making this precise requires a certain comfort with 2-categories and coherence equations that most people don't seem to have. Times are changing though. Just as for ordinary principal bundles, you can (in nice settings, say noetherian separated) classify them in terms of Cech data. When you do this you see that the only important part is the coherence data, which gives a 2-cocycle. For non-abelian gerbes you get non-trivial stuff which mixes together parts which look like data for a 1-cocycle and a 2-cocycle. I agree with Kevin that, at this point, if you really want to understand this stuff you should fill in the rest of the details on your own. It is a good exercise!
Alternatively, if higher categories make you uncomfortable, you can be clever. You can still make precise a definition along the lines of the one you outline, without venturing into the world of higher categories and "coherent group objects in stacks". I recommend Anton's course notes on Stacks as taught by Martin Olsson. Section 31 has a definition of $\mu$-gerbes which is equivalent to the one I sketched above but avoids the higher categorical aspects. There is also a proof that such gerbes are classified by $H^2(X; \mu)$. Enjoy!
Just to reiterate: in a gerbe you are not patching together classifying spaces, you are patching together classifying stacks. Despite the common notation, there is a difference. A stack is fundamentally an object in a 2-category. This means that you need to deal with 2-morphisms and that they can be just as important as the 1-morphisms. For $B\mathbb{G}_m$, the 1-morphisms (which preserve the multiplication action of the stack $B\mathbb{G}_m$!) are all equivalent, so there is no Cech 1-cocycle data at all. All you get are the coherence data, which form a 2-cocycle.
This is one reason that I prefer the notation $[pt/\mathbb{G}_m]$ to denote the stack $B\mathbb{G}_m$. This is particularly important in the topological setting where these are truly different objects.
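For readers who want to see the shape of the Cech data described above, here is a minimal sketch. It assumes a cover $\{U_i\}$ of $X$ fine enough that all line bundles appearing below are trivial; the symbols $\varphi_i$, $L_{ij}$, $g_{ijk}$ and $h_{ij}$ are notation introduced for this illustration. Choose equivariant trivializations $\varphi_i : \mathscr{G}|_{U_i} \simeq U_i \times B\mathbb{G}_m$. On a double overlap $U_{ij}$, the transition $\varphi_i \circ \varphi_j^{-1}$ is an equivariant autoequivalence of $U_{ij} \times B\mathbb{G}_m$, hence is given by tensoring with a line bundle $L_{ij}$ on $U_{ij}$. After trivializing each $L_{ij}$, the coherence isomorphism $L_{ij} \otimes L_{jk} \cong L_{ik}$ on $U_{ijk}$ becomes an invertible function

$$g_{ijk} : U_{ijk} \to \mathbb{G}_m,$$

and the coherence condition on quadruple overlaps is the 2-cocycle identity

$$g_{jkl}\, g_{ijl} = g_{ikl}\, g_{ijk} \quad \text{on } U_{ijkl}.$$

Changing the chosen trivializations of the $L_{ij}$ by functions $h_{ij} : U_{ij} \to \mathbb{G}_m$ replaces $g_{ijk}$ by

$$g_{ijk} \cdot h_{ij}\, h_{jk}\, h_{ik}^{-1},$$

so only the class of $(g_{ijk})$ in Cech $H^2$ is well defined. No separate 1-cocycle survives, because the $L_{ij}$ carry no information once trivialized.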
Best Answer
There is a canonical equivalence of $2$-categories
$$St\left(Man/M\right) \simeq St\left(Man\right)/M$$ between stacks on the large site of $M$ (the site of manifolds over $M$) and stacks on the site of manifolds that are equipped with a map to $M$ (regarding $M$ as a representable sheaf). A map $\pi:\mathscr{Y} \to M$, for $\mathscr{Y}$ some stack on manifolds, corresponds to the stack $\Gamma(\mathscr{Y})$ on $Man/M$ which assigns to a map $f:N \to M$ the groupoid of sections $N \to \mathscr{Y}$ of $\pi$ over $f.$

Suppose that there is a cover $(U_i)$ of $M$ such that each $U_i \times _M \mathscr{Y}\simeq U_i \times BU(1)$ (or if you prefer $U_i \times BGL(1)$). Then $\Gamma(\mathscr{Y})$ is easily seen to be a gerbe on the large site for $M$. By Dan Peterson's answer, we see that from the data of a bundle gerbe, one gets a stack $\pi:\mathscr{Y} \to M$ with this property. In fact, it is not hard to show that these are equivalent data, that is, giving $\pi:\mathscr{Y} \to M$ together with a cover $(U_i)$ such that $U_i \times _M \mathscr{Y}\simeq U_i \times BU(1)$ is the same as giving a bundle gerbe on $M$.

By taking each bundle gerbe $\pi:\mathscr{Y} \to M$ and sending it to $\Gamma(\mathscr{Y})$, one gets a fully faithful embedding of the $2$-category of bundle gerbes over $M$ into the $2$-category of gerbes over the large site of $M$ (which furthermore embeds fully faithfully into stacks on the large site of $M$). The essential image is precisely those gerbes on the site $Man/M$ which are banded by $U(1)$, as pointed out by Reimundo Heluani. It doesn't embed into the $2$-category of stacks on the small site of $M$, however.
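The stack $\Gamma(\mathscr{Y})$ described above can be spelled out a little more explicitly. The following is a sketch; the symbols $s$, $s'$ and $\theta$ are notation introduced here. For $f : N \to M$,

$$\Gamma(\mathscr{Y})(f) = \left\{\, s : N \to \mathscr{Y} \;\middle|\; \pi \circ s = f \,\right\},$$

with morphisms $s \to s'$ the 2-isomorphisms $\theta : s \Rightarrow s'$. Since $M$ is a sheaf of sets, the groupoid of maps $N \to M$ is discrete, so $\pi \circ s = f$ is an honest equality and $\pi \circ \theta$ is automatically an identity. In the locally trivial case this gives, for $f : N \to U_i \subseteq M$,

$$\Gamma(U_i \times BU(1))(f) \simeq BU(1)(N),$$

the groupoid of principal $U(1)$-bundles on $N$. Objects exist locally (take the trivial bundle) and any two are locally isomorphic, which is the gerbe condition.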