Geant is a framework, which means that you use it to build applications that simulate the detector and physics you are interested in. The simulation can include all of the physics and the complete detector, including electronics and trigger (i.e. you can write your simulation so that it outputs a data file that looks just like the one you are going to get from the experiment[1]).[2]
The various parts of Geant are validated by their ability to correctly predict the outcomes of experiments. Particular models are tuned on well-known physics early in the analysis of the data, which lets you match simulated optical properties, detector gains and so on to the actual instrument.
Geant is also heavily documented. Read the introduction and the first two chapters of the User's Guide for Application Developers, which will give you the basics. After that you can delve into the hairy details in the Physics and Software references. There is much, much too much to cover in a Stack Exchange answer. (I mean literally: if I tried, I'd end up overrunning the 32k-character-per-post limit.)
It helps to know that Geant4 derives from Geant3 and earlier efforts. This thing has a history that goes back for decades and has been tested in thousands of experiments large and small.
The use in the Higgs search goes something like this:
- We have a theory, the Standard Model, which tells us what couplings to expect for the particle we hope to detect.
- We write (and test) a Geant physics module, maybe more than one, implementing that physics. We may need to write a new event generator or tweak an existing one in parallel with this effort.
- You construct a Geant simulation of your detector, including a simulation of the electronics, trigger and so on[3] (a minimal sketch of this step follows the list).
- You simulate a lot of data from the desired channel and from possible interfering channels (including detector noise and backgrounds). You're going to use a cluster or a grid for this, because it is a big problem.
- You combine this simulated data.
- You run your analysis on the simulated data.[4]
- You extract from these results an "expected" signal.
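The detector-construction step above is, concretely, a C++ class that you register with the Geant4 run manager. Here is a minimal sketch of what such a class looks like; the class name, materials and dimensions are made up for illustration, and a real experiment's geometry is vastly more elaborate:

```cpp
// Bare-bones Geant4 detector construction: an air-filled world volume containing a
// single plastic-scintillator slab. Illustrative only; not any experiment's geometry.
#include "G4VUserDetectorConstruction.hh"
#include "G4NistManager.hh"
#include "G4Box.hh"
#include "G4LogicalVolume.hh"
#include "G4PVPlacement.hh"
#include "G4ThreeVector.hh"
#include "G4SystemOfUnits.hh"

class SimpleDetector : public G4VUserDetectorConstruction {
public:
  G4VPhysicalVolume* Construct() override {
    auto* nist = G4NistManager::Instance();

    // World volume: a 2 m cube of air that everything else lives inside
    auto* air       = nist->FindOrBuildMaterial("G4_AIR");
    auto* worldBox  = new G4Box("World", 1.0*m, 1.0*m, 1.0*m);   // half-lengths
    auto* worldLog  = new G4LogicalVolume(worldBox, air, "World");
    auto* worldPhys = new G4PVPlacement(nullptr, G4ThreeVector(), worldLog,
                                        "World", nullptr, false, 0);

    // One scintillator slab standing in for the whole detector
    auto* scint   = nist->FindOrBuildMaterial("G4_PLASTIC_SC_VINYLTOLUENE");
    auto* slabBox = new G4Box("Slab", 20*cm, 20*cm, 5*cm);
    auto* slabLog = new G4LogicalVolume(slabBox, scint, "Slab");
    new G4PVPlacement(nullptr, G4ThreeVector(0, 0, 50*cm), slabLog, "Slab",
                      worldLog, false, 0);

    return worldPhys;
  }
};
```

You provide a physics list and a primary-generator action in the same spirit, hand all three to the run manager, and Geant tracks the generated particles through that geometry, recording whatever energy deposits you ask for.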
Actually, you did all of the above at lower precision several times during the design and funding phase, and used those results to determine how much data you would have to collect, what kinds of instrumentation densities you needed, what data rate you had to be able to support, and so on ad nauseam.
Once you have got the data, you start by showing that:
- You can detect lots of well-known physics in your detector (to validate the detector and find unexpected problems).[5]
- Your model correctly represents the detector response to that well-known physics (to let you debug and tune your model).
Then you may need to re-run some of the "expected" processing.
Only then can you try to compare data to expectation.[6]
[1] Indeed, the data format is often thrashed out and debugged from the MC before the experiment is even built.
[2] For big, complicated experiments like those at the LHC, Geant is usually paired with one or more external event generators. In the neutrino experiments I'm currently working on, that means Genie and Cry. Not sure what the collider guys are using right now.
[3] For speed reasons we often simulate the electronics and trigger outside of Geant proper, but this decision is made on a case-by-case basis.
[4] Indeed, the analyzer is often programmed and debugged from the MC output before there is real data.
[5] This is also where most of the actual repetition of results in the particle physics world comes from. You won't get funding to repeat BigExper's measurement of the WingDing Sum Rule, but if your proposed NextGen spectrometer can do that as well as your spiffy New Physics (tm), it helps your case with the funding agencies.
[6] Many of these steps will be done by more than one person/group in the collaboration to provide copious cross-checks and protection against embarrassing mistakes. (See also OPERA's little issue last year...)
I know that in Full Configuration Interaction Quantum Monte Carlo (FCIQMC), where one starts from the Schrödinger equation and samples the full configuration space with integer walkers, there is a spontaneous symmetry breaking between $\Psi$ and $-\Psi$ once a sufficient number of walkers has been spawned into the configuration space. It treats strongly correlated systems quite well. For a brief introduction, you can have a look at the first link and also this paper.
Unlike in Diffusion Quantum Monte Carlo, where you need the fixed-node approximation to fix the sign of the wavefunction, FCIQMC offers a "phase transition" way to tackle the sign problem. However, FCIQMC scales exponentially with system size, and that is the major problem with the method.
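For orientation, the dynamics the walkers realize is just the imaginary-time Schrödinger equation written for the signed walker populations $N_i$ on the determinants; schematically (with any constant reference energy absorbed into the population-control shift $S$),

$$\frac{\mathrm{d}N_i}{\mathrm{d}\tau} = -\,(H_{ii}-S)\,N_i \;-\; \sum_{j\neq i} H_{ij}\,N_j ,$$

and the spawning, death/cloning and annihilation steps are a stochastic, integer-walker integration of exactly this equation, so that at long imaginary time the signed populations become proportional to the ground-state CI coefficients.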
There is also coupled-cluster theory, which I am studying now, but for the moment I cannot say anything useful about it; I will probably update this in the future. The only thing I know is that it is comparably good at capturing the correlations but scales less severely with system size than FCIQMC.
One has to realize that a Monte Carlo simulation is an integration tool. Suppose you have a curve in an xy plot, y = f(x). If you throw random (x, y) pairs into a square containing the curve and count how many land below f(x), the fraction that fall under the curve, multiplied by the area of the square, gives you an estimate of the area under f(x), i.e. the integral of the function.
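As a toy version of that picture (a minimal sketch; the choice f(x) = x² on the unit square is arbitrary and has nothing to do with any particular physics):

```cpp
// Hit-or-miss Monte Carlo estimate of the integral of f(x) = x^2 on [0, 1].
// The enclosing "square" is the unit square, so the hit fraction is the integral.
#include <cstdio>
#include <random>

int main() {
    std::mt19937 rng(12345);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);

    auto f = [](double x) { return x * x; };   // exact integral on [0,1] is 1/3

    const long nThrows = 1000000;
    long nBelow = 0;
    for (long i = 0; i < nThrows; ++i) {
        double x = uniform(rng);               // random point in the unit square
        double y = uniform(rng);
        if (y < f(x)) ++nBelow;                // a "hit": the point lies under the curve
    }

    // hit fraction times the square's area (1 here) estimates the integral
    std::printf("estimate = %.4f (exact = 0.3333)\n", double(nBelow) / nThrows);
    return 0;
}
```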
In elementary particle physics, the phase space (equivalent to the square in the simple example) is known. Theoretical functions are used as weights for a random number generator to generate "events" according to their parameters; these simulated events are then checked against the real event data. If the fit is bad, the parameters are changed to improve it.
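A stripped-down illustration of that weighting in a single kinematic variable might look like the sketch below; the 1/(1 + x²) "theory" shape and the generation range are invented for the example, and a real generator does this in the full multi-dimensional phase space:

```cpp
// Accept-reject sketch: throw candidate "events" uniformly in a kinematic variable x
// and keep each one with probability proportional to a theoretical weight w(x).
// The accepted events are then distributed according to the theory curve.
#include <cstdio>
#include <random>
#include <vector>

int main() {
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);

    auto w = [](double x) { return 1.0 / (1.0 + x * x); };  // illustrative theory shape
    const double wMax = 1.0;   // maximum of w on the generation range [0, 5]

    std::vector<double> events;
    while (events.size() < 10000) {
        double x = 5.0 * uniform(rng);        // candidate event in [0, 5]
        if (uniform(rng) * wMax < w(x))       // keep with probability w(x) / wMax
            events.push_back(x);              // accepted events follow w(x)
    }
    std::printf("generated %zu accepted events\n", events.size());
    return 0;
}
```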
The advantages are:
1) Detector limitations can be programmed into the phase space, so events are generated with the limits of the detector included.
2) The method is much more efficient in computer time than the numerical integrations that would otherwise be needed over the innumerable functions entering the problem, both detector and theory.
3) Once a Monte Carlo event sample is generated, it can be used "as if it were data" over and over again to get plots not thought up beforehand.
In the recent LHC experiments, the Monte Carlo events were generated well before the real data, according to the detector limitations and to the theoretical expectations from the Standard Model. The existence of these Monte Carlo data sets allowed fast checks on whether new physics was appearing: new physics will show up as statistically significant deviations from the Monte Carlo curves.
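As a very crude illustration of what "statistically significant deviation" means here (the counts are invented, and the excess/√background figure is only the naive Gaussian approximation, nothing like the full statistical treatment an experiment actually uses):

```cpp
// Compare an observed bin count with the Monte Carlo expectation and quote the
// naive significance of the excess. Numbers are made up for the example.
#include <cmath>
#include <cstdio>

int main() {
    const double expectedFromMC = 120.0;  // events predicted by the MC in some bin
    const double observed       = 158.0;  // events actually seen in the data

    const double excess = observed - expectedFromMC;
    const double naiveSignificance = excess / std::sqrt(expectedFromMC);

    std::printf("excess = %.1f events, roughly %.1f sigma\n", excess, naiveSignificance);
    return 0;
}
```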