It's not clear what you mean by "various refinements and generalizations". Cerf has a huge paper published by IHES "Topologie de certains espaces de plongements" which goes into many related details. In a way it's more of a ground-up collection of basic information on the topology of function spaces.
Regarding your 2nd question, if instead of demanding a fibre bundle you ask for a Serre fibration, the proof is relatively simple. It's just the isotopy extension theorem with parameters, and the proof is pretty much verbatim Hirsch's proof of isotopy extension in his "Differential Topology" text plus the observation that solutions depend smoothly on the initial conditions.
Regarding your 2nd question, yes of course. Palais's paper is quite nice. If you haven't had a look at it, you might as well try -- it's only 7 pages long. If you want to discover the proof on your own I'd start with the case $S$ a finite set. Then move up to $S$ a positive-dimensional submanifold. You'll want to be comfortable with things like the proof of the tubular neighbourhood theorem, the concept of injectivity radius, etc.
Note that it is not hard to construct examples where $Br(G)$ is not equal to $M(G)$. To see this, let $H_1=\mathbb{Z}/p\times\mathbb{Z}/p$, and let $\alpha_1\in H^2(H_1,\mathbb{C}^*)=\mathbb{Z}/p$ be a generator. The class $\alpha_1$ is represented by a projective representation $p_1:H_1\rightarrow PGL_p(\mathbb{C})$. Now, let $H_i=(H_1)^i$, and let $\alpha_i$ be the Brauer class of the projective representation $$p_i:H_i=(H_1)^i\xrightarrow{p_1\times\cdots\times p_1}(PGL_p(\mathbb{C}))^i\rightarrow PGL_{p^i}(\mathbb{C}),$$ where the final arrow is induced by the tensor (Kronecker) product of matrices; this is the map that sends scalars to scalars and lands in dimension $p^i$, whereas block matrices down the diagonal would give dimension $pi$ and would not descend to the projective groups. Then, it is not hard to show that the index of $\alpha_i$ is $p^i$. Recall that the index of a class of $Br(G)$ is the greatest common divisor of the degrees of all projective representations having that class.
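For concreteness, here is a minimal numerical sketch of one standard choice of $p_1$, assuming it is given by the "clock" and "shift" matrices $Z$ and $X$ (this choice, and the helper names `clock_shift`, `matmul`, `kron`, `scalar_ratio`, are illustrative assumptions, not taken from the answer). The two matrices commute only up to a primitive $p$-th root of unity, so they define a homomorphism $H_1\rightarrow PGL_p(\mathbb{C})$ that does not lift to $GL_p(\mathbb{C})$. The Kronecker product is used for the passage to dimension $p^i$ because it sends scalars to scalars.

```python
import cmath

def clock_shift(p):
    """Return the p x p clock matrix Z, the shift matrix X, and w = exp(2*pi*i/p)."""
    w = cmath.exp(2j * cmath.pi / p)
    Z = [[w**r if r == c else 0 for c in range(p)] for r in range(p)]
    X = [[1 if r == (c + 1) % p else 0 for c in range(p)] for r in range(p)]
    return Z, X, w

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def kron(A, B):
    """Kronecker (tensor) product of two square matrices."""
    m, n = len(A), len(B)
    return [[A[r // n][c // n] * B[r % n][c % n] for c in range(m * n)]
            for r in range(m * n)]

def scalar_ratio(A, B, tol=1e-9):
    """Return s with A == s * B entrywise, or None if A is not a scalar multiple of B."""
    s = None
    for row_a, row_b in zip(A, B):
        for a, b in zip(row_a, row_b):
            if abs(b) > tol:
                if s is None:
                    s = a / b
                elif abs(a - s * b) > tol:
                    return None
            elif abs(a) > tol:
                return None
    return s

p = 3
Z, X, w = clock_shift(p)
I = [[1 if r == c else 0 for c in range(p)] for r in range(p)]

# In dimension p: Z X = w X Z, so the images commute in PGL_p but not in GL_p.
s1 = scalar_ratio(matmul(Z, X), matmul(X, Z))

# In dimension p^2: generators of the first and second copies of H_1.
A, B, C = kron(Z, I), kron(X, I), kron(I, X)
s2 = scalar_ratio(matmul(A, B), matmul(B, A))  # same factor: w
s3 = scalar_ratio(matmul(A, C), matmul(C, A))  # different factors commute: 1
```

This only checks the commutation relations numerically for $p=3$ and $i=2$; it does not by itself prove that the index of $\alpha_i$ is $p^i$.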
Now, define $G=*_i H_i$, the free product of the $H_i$ for all positive integers $i$. The corresponding classifying space $BG$ is the wedge sum of the $BH_i$. Therefore, the collection of continuous pointed maps
$$\alpha_i:BH_i\rightarrow K(\mathbb{C}^*,2)$$
induces a continuous pointed map
$$\alpha:BG\rightarrow K(\mathbb{C}^*,2)$$
such that the composition $$BH_i\rightarrow BG\rightarrow K(\mathbb{C}^*,2)$$ is $\alpha_i$.
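Restricting a projective representation of $G$ with class $\alpha$ to the free factor $H_i$ gives a projective representation of $H_i$ of the same degree with class $\alpha_i$, so $ind(\alpha_i)$ divides $ind(\alpha)$ for all $i$. But, since $ind(\alpha_i)=p^i$, the degree of any such representation would have to be divisible by every power of $p$. It follows that $ind(\alpha)$ is infinite, and hence that $\alpha\notin Br(G)$, as desired.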
Obviously, $G$ is not finitely presented. In relation to André's comment above, it seems likely that it is not linear either.
Historically, the first version is the nerve of a covering, which was used in the works of P. S. Aleksandrov in the late 1920s. In that version, the nerve of a covering was treated as a simplicial complex whose vertices are the elements of the covering (which are open sets), and in which an $n$-simplex corresponds to an $(n+1)$-tuple of elements of the covering with nonempty common intersection; in particular, one gets a finite complex for a finite covering. This version was used soon afterwards in Čech theory. I emphasize this because nowadays the Vietoris complex, a bigger complex whose vertices are pairs $(U,x)$ with $U$ an open set and $x\in U$, is often called the Čech complex as well, while the original, finite version is now used more rarely. Simplicial sets replaced old-fashioned simplicial complexes a couple of decades later.
Grothendieck generalized the nerve to the case of categories. From the beginning, simplicial complexes in their combinatorial and topological incarnations were used interchangeably. For simplicial sets, however, the nice categorical treatment is due to Milnor, who formally introduced the notion of geometric realization in the modern context; the concept was essentially known at the time, but its properties were not. Classifying spaces in the group case were of course studied first in the context of group cohomology, so Mac Lane is probably among the first to have used them. Segal, in the late 1960s, not only studied the concept in depth but also introduced a more complicated version for simplicial categories.
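To make the original definition concrete, here is a small sketch that computes the nerve of a finite covering. As a simplifying assumption, finite sets of points stand in for the open sets, and the function name `nerve` is only illustrative.

```python
from itertools import combinations

def nerve(cover):
    """Return the simplices of the nerve of a finite cover.

    `cover` maps a name to a set of points. A simplex is a tuple of names
    whose sets have a nonempty common intersection.
    """
    names = sorted(cover)
    simplices = []
    for k in range(1, len(names) + 1):
        for combo in combinations(names, k):
            # Keep the tuple only if all its sets share at least one point.
            if set.intersection(*(cover[n] for n in combo)):
                simplices.append(combo)
    return simplices

# Three overlapping "arcs" covering a circle of six points.
circle_cover = {"U": {0, 1, 2}, "V": {2, 3, 4}, "W": {4, 5, 0}}
circle_nerve = nerve(circle_cover)
```

Here each pair of arcs meets but all three do not, so the nerve has three vertices and three edges and no 2-simplex: it is the boundary of a triangle, which has the homotopy type of the circle being covered.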