String theory is a perturbation theory of quantum gravity starting with perfectly linear Regge trajectories self-interacting in a consistent bootstrap. Bootstrap means that the interaction of the trajectories is only by exchange of other trajectories, so that the system is self-consistent, or, in 1960s terminology, that it pulls itself up by its own bootstraps.
The best way to learn what string theory is, is to get a copy of Gribov's "The Theory of Complex Angular Momentum", and learn the basic principles of Regge theory. You don't have to learn the Reggeon calculus covered later (although it is interesting), just the basic principles. The point of this theory is to understand spectral properties, meaning S-matrix states, not detailed microscopic field theory, which breaks down at the Planck scale. The S-matrix is valid at any scale; it is the fundamental observable object in relativistic quantum mechanics when you don't have point probes.
In QCD, you can make little black holes and use them as point probes. You can even use electrons as point probes, without going to the trouble of making a black hole. This shows that the scale of QCD is not the appropriate scale for string theory, but nevertheless, string theory was discovered by trying to find a consistent bootstrap at this scale. This is very fortunate historically, and required vision and persistence.
Linear Regge trajectories can be understood as string-like excitations. They are the quantized states of an extended relativistic string described by a Nambu-Goto action. But the string action by itself doesn't tell you anything about interactions. These objects interact only by exchanging states found in their own spectrum, with the condition that S-channel exchange is dual to T-channel exchange. They have no other interactions. This defines a dual string theory, the kind people study.
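As a concrete illustration of what "S-channel exchange is dual to T-channel exchange" means, here is a minimal numerical sketch using the Veneziano amplitude, the standard four-point amplitude built on linear trajectories. The trajectory values and the helper names (`veneziano`, `s_channel_sum`) are illustrative choices of mine, not anything taken from Gribov or Mandelstam. The sum over S-channel poles alone reproduces the full amplitude, and the full amplitude is manifestly symmetric under exchanging s and t, so the T-channel poles are not added as a separate contribution.

```python
import math

def veneziano(alpha_s, alpha_t):
    """Veneziano amplitude Gamma(-a_s) Gamma(-a_t) / Gamma(-a_s - a_t),
    written in terms of the trajectory values a_s = alpha(s), a_t = alpha(t)."""
    return (math.gamma(-alpha_s) * math.gamma(-alpha_t)
            / math.gamma(-alpha_s - alpha_t))

def s_channel_sum(alpha_s, alpha_t, n_max=2000):
    """Sum over S-channel poles at alpha(s) = n, for n = 0, 1, 2, ...

    The residue at level n is a degree-n polynomial in alpha(t):
    (alpha_t + 1)(alpha_t + 2)...(alpha_t + n) / n!.
    The series converges for alpha_t < 0.
    """
    total = 0.0
    residue = 1.0  # empty product for n = 0
    for n in range(n_max + 1):
        if n > 0:
            residue *= (alpha_t + n) / n
        total += residue / (n - alpha_s)
    return total

# Illustrative trajectory values (assumed; chosen so the series converges quickly).
a_s, a_t = -0.3, -1.7
full = veneziano(a_s, a_t)
poles_only = s_channel_sum(a_s, a_t)
print(full, poles_only)
```

The two printed numbers should agree up to the truncation error of the pole sum. For a linear trajectory one would set `a_s = intercept + slope * s`, with the intercept and slope as free parameters of the sketch.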
States and observables
String theory is a quantum mechanical S-matrix theory, so the state variables (in an asymptotically flat space) are the following:
- A classical configuration of fields at infinity, which defines the background.
- A finite number of incoming particles in a quantum superposition of plane waves, which define the in-state at minus infinity.
The dynamical law is to produce the out-state, given these ingredients. There is only one observable,
- The S-matrix between in and out states.
That's it. Every other observable has to be extracted from this one, by some trick. This is extraordinarily difficult, but in practice, there are some simplifications.
- The classical field configuration is a background which must be consistent with the string theory itself. This gives classical equations which the background field must satisfy. These come from the condition that the string is still conformally invariant in the background, so that the $\beta$ function is zero. These equations define the classical allowed backgrounds, and from this you can extract the classical dynamics at infinity, which is what you use almost all the time, or the approximate quantum field theory description, which is what you use almost all the rest of the time.
- You can take a field theory limit, and start telling quantum field stories of particle propagation. Almost all the work in the 1980s was based on low-energy supergravity approximations, sometimes with higher order effective action corrections. Such a description can be thought of as an approximation scheme to string theory, by adding higher order string corrections to a quantum field theory, to get a string-corrected effective action. This way of dealing with strings is philosophically least challenging, but it doesn't seem to me to be a very convergent process. The string theory is not a quantum field theory, after all.
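For orientation, the vanishing-$\beta$-function condition in the first bullet takes a familiar form at leading order in $\alpha'$. In one common convention (metric $G_{\mu\nu}$, antisymmetric-tensor field strength $H_{\mu\nu\lambda}$, dilaton $\Phi$; signs and normalizations differ between textbooks, so treat this as a sketch), the metric equation reads

$$R_{\mu\nu} + 2\nabla_\mu\nabla_\nu\Phi - \frac{1}{4} H_{\mu\lambda\omega} H_\nu{}^{\lambda\omega} = O(\alpha')$$

With $H = 0$ and constant $\Phi$ this reduces to the vacuum Einstein equation $R_{\mu\nu} = 0$, which is the sense in which the allowed backgrounds obey classical field equations. Analogous equations hold for $H$ and $\Phi$.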
More dynamical formulations
There are more dynamical formulations than the S-matrix theory, and more honest formulations than the "effective-action" string-corrected quantum field theory. These honest formulations are due to Mandelstam, Kaku-Kikkawa, Banks-Fischler-Susskind-Shenker, and Maldacena-Witten-Gubser-Klebanov-Polyakov.
- When you absolutely need to use string theory itself, instead of quantum field theory with effective action corrections, you can move to a dynamical picture where the string tells a story which is local in a version of spacetime. In such a picture, the string theory can be thought of as a normal quantum theory, not an S-matrix theory. The states are defined by superpositions of configurations, just like any other quantum theory. The Mandelstam description of strings is one such picture, and because it exists, one could go to a second quantized string field theory, by defining creation and annihilation operators for the string states. So string field theory de-S-matrixes the S-matrix theory. It is defined on a light cone, and it is technically complicated, but in such a picture the basic states of string theory are quantum superpositions of light-cone configurations of strings.
- In 11-dimensional matrix theory, you have a point-particle description in which you can define the state space and evolution again like any other quantum field theory--- as superpositions of noncommutative matrix model configurations, with a normal quantum dynamics in 0+1 dimensions. This is the easiest state space to formulate.
- In asymptotically AdS backgrounds, instead of incoming/outgoing particles and an S-matrix, you have a full honest to goodness quantum field theory's worth of information at the boundary of space-time. This quantum field theory maps in a not-completely-understood way to the interior description, but the state-space and dynamics are obvious--- they are just like any other quantum field theory.
Notice that the description of the state space is entirely different in the different formulations! This is important, both because their mutual consistency is an insanely stringent constraint, which has zero chance of being satisfied unless there is a consistent gravitational theory behind it, and because it gives completely different types of observables in different asymptotic backgrounds.
There is no unified way of defining the state space on all backgrounds at once. Each type of background has its own formulation. It requires physical intuition to move between the pictures, and there is no way to communicate the results to a mathematician without communicating the physical picture, because the theory isn't 100% complete.
Literature and Misconceptions
The best review for me was Mandelstam's review from 1974. It is very important to learn these old-fashioned ideas, because otherwise you will have all sorts of nonsense in your head about what you can do to strings.
- Dual strings do not describe statistical polymer properties. It doesn't work, they aren't those types of strings. Polymers interact by self-intersection, strings don't.
- Dual strings don't work to describe vortex lines in a quantum field theory, although this idea was one of the ways in which they were discovered, by Nielsen. The vortex line picture was simultaneous with the flux-line picture, but the flux line picture is now known to be correct, while the vortex line picture, as far as I know, has no precise version beyond the statement that the string is extended in 1d, like a vortex line. If you make an effective theory of vortex lines in a scalar/gauge field theory, they will interact in crazy non-string ways. In gauge theories with a gravitational dual, like N=4 gauge theory, you probably can make the string be a vortex line, so Nielsen's idea is not altogether wrong (I think), but then some duality will have to link up the vortex and flux line.
- Dual strings have no deformations: you can't make dual strings interact at collision points, you can't make them attach or detach other non-string objects. They are either a theory of everything or a theory of nothing.
- Dual strings only allow S-matrix probing. You can't calculate off-shell behavior, meaning you can't describe their detailed dynamics in space-time. You can formulate their world-sheet behavior in space and time. There is string field theory, which attempts to take strings off shell, but now we know the right way to do this is AdS/CFT, although string field theory is still very important.
- The only way to touch strings to classical objects is to fiddle with asymptotic values of fields. You can't probe them using local quantum fields, because they generate their own local fields. You can make them move in a classical background, but their dynamics determines the quantum properties of all the backgrounds.
- You can also make them interact with branes, but this is a surprise, and it only works because the branes are weak dual black holes, where strings can partially fall through. It doesn't work for arbitrary surfaces, and there are strong constraints on which branes are allowed.
If you learn the old-fashioned string theory of the 1960s and 1970s, you can understand the rest of the stuff. If you don't, you can't.
Best Answer
Matrix string theory may be viewed as just a variation of BFSS Matrix Theory, although arguably an important one, and the original papers double as full introductions.
Some of the few hundred followups deal with some more technical issues.
The last paper in the list above, which is the newest one, should be the most optimized one. To say the least, it contains the most detailed treatment of the interactions. One may enumerate a couple of reviews of BFSS Matrix Theory. Some of them dedicate some time to matrix string theory, some of them don't. For example, see
A derivation of BFSS Matrix Theory was given by Seiberg:
M-theory in 11 dimensions may be compactified on a nearly light-like (slightly space-like) circle - which is still consistent. $X^-$ becomes a periodic variable, $X^-\approx X^-+2\pi R$. (This light-cone treatment was automatically used in my paper above but it was Lenny Susskind who took credit for it months later - much ado about nothing. The original BFSS paper was using the "infinite momentum frame".) In the lightlike limit, a Lorentz boost may map the compactification to a compactification of M-theory on a very short spatial circle in 11D Planck units (because the proper length of the nearly light-like circle was tiny) - which is type IIA string theory. Units of momenta along the compact light-like direction become D0-branes.
The kinematic regime guarantees that these D0-branes are non-relativistic. They're well-described by the non-relativistic supersymmetric quantum mechanics - the matrix model - which is the dimensional reduction of the 10D supersymmetric Yang-Mills theory to 0+1 dimensions. The gauge group is $U(N)$. It has 16 non-trivial real supercharges.
So one can show that all of the physics of M-theory, if studied in the light-cone gauge, is equivalent to an ordinary non-gravitational matrix model - a quantum mechanical model with matrix degrees of freedom. The eigenvalues of the $X^i$ matrices may be viewed as the positions of the gravitons (or their superpartners) in 11 dimensions; a threshold (zero-binding-energy) bound state of several such eigenvalues (which can be proved to exist, a remarkable property of $SU(N)$ supersymmetric quantum mechanics) is a graviton that carries a higher number of units of the quantized light-like (longitudinal) momentum.
All interactions are encoded in the off-diagonal elements of the matrices which are classically zero but whose virtual quantum effects make the eigenvalues interact so that the resulting picture is indistinguishable from 11D supergravity at low energies; much like AdS/CFT, it is an equivalence of a gravitational theory and a non-gravitational one (in some sense, the compact light-like direction $X^-$ of the matrix model is the holographic direction). The model contains black holes and all other expected objects, too: extended branes may be added. The identical nature of gravitons and gravitinos - with the right Bose-Einstein and Fermi-Dirac statistics - appears because the permutation group is embedded into the $U(N)$ gauge group of the quantum mechanical model, and all physical states must therefore be invariant under this $U(N)$, and hence also under $S_N$. The compact M2-branes (membranes) appear most directly because the whole BFSS matrix model may be viewed as a discretization of the M2-brane world volume theory in M-theory - assuming that the world volume coordinates generate a non-commutative geometry. This equivalence may be derived in a straightforward way, especially for the toroidal and spherical topology of the M2-branes. M5-branes are harder to see but they must be there, too.
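To make the role of the off-diagonal elements concrete, here is a small sketch of the bosonic potential of the matrix model, $V = -\frac{1}{4}\sum_{i,j}\mathrm{Tr}\,[X^i,X^j]^2$. The overall normalization, the restriction to real symmetric $2\times 2$ matrices, and the helper names are illustrative assumptions; the fermions and kinetic terms are omitted. For commuting (diagonal) matrices the potential vanishes, so the eigenvalues behave as free positions. Switching on an off-diagonal entry $\epsilon$ between two eigenvalues separated by $r$ costs energy $\epsilon^2 r^2$, so these modes become heavy at large separation.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def potential(xs):
    """V = -(1/4) * sum over ordered pairs (i, j) of Tr [X^i, X^j]^2.

    Normalization is an assumption of this sketch; conventions vary.
    """
    total = 0.0
    for xi in xs:
        for xj in xs:
            c = commutator(xi, xj)
            total += trace(matmul(c, c))
    return -0.25 * total

r, eps = 3.0, 0.1                      # separation and off-diagonal amplitude
x1 = [[0.0, 0.0], [0.0, r]]            # two eigenvalues a distance r apart
x2_diag = [[1.0, 0.0], [0.0, -2.0]]    # commutes with x1: flat direction
x2_off = [[0.0, eps], [eps, 0.0]]      # off-diagonal excitation

v_flat = potential([x1, x2_diag])
v_lifted = potential([x1, x2_off])
print(v_flat, v_lifted)
```

The quadratic growth of `v_lifted` with `r` is the classical reason the off-diagonal modes can be integrated out when the eigenvalues are far apart, leaving an effective interaction between them.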
The BFSS Matrix Theory above gave the first complete definition of M-theory in 11 dimensions (the whole superselection sector of the Hilbert space) that was valid at all energies. It's a light-cone-gauge description where sectors with different values of $p^+ = N/R$ are separated and separately described by the $U(N)$ quantum mechanical models. I forgot to say - to really decompactify the $X^-$ coordinate, one needs to send its radius $R$ to infinity. Because $p^+=N/R$ is fixed (physical momentum), $N$ has to be sent to infinity, too. The infinite-space physics is always obtained as the large $N$ limit of calculations in $U(N)$ matrix models.
Matrix string theory
One may apply the same derivation to find the matrix model of other superselection sectors besides the 11D vacuum of M-theory, too. It includes some (simple) compactifications; the right matrix model isn't known for all compactifications. In particular, matrix models for type IIA string theory and heterotic $E_8\times E_8$ string theory have a very simple form. Instead of a quantum mechanical model i.e. 0+1-dimensional field theory arising from the D0-branes, one ends up with a 1+1-dimensional supersymmetric gauge theory originating from D1-branes of type IIB (an extra T-duality is added to the derivation), compactified on a cylinder, the so-called matrix string theory (although the historically more correct name is "screwing string theory").
In matrix string theory, again, the eigenvalues of the $U(N)$ matrices $X^i$ are interpreted as positions of points on strings in the transverse 8-dimensional space (the two light-like directions are treated separately in light-cone gauge: one of them, $X^+$, is the light-like time and the other, $X^-$, is compactified). Those eigenvalues $X^i_{nn}(\sigma)$ still depend on $\sigma$, the spatial coordinate of the cylinder on which the gauge theory is defined.
However, one may obtain strings of an arbitrary length by applying permutations on the eigenvalues: the length determines the light-like longitudinal momentum $p^+=N/R$ which is quantized because $X^-$ is compactified. All these permutations are allowed because $U(N)$ is gauged as a symmetry in the matrix model. Consequently, perturbative type IIA and HE string theory with arbitrary numbers of strings are defined by an orbifold conformal field theory - a single string propagating on the orbifold $R^{8N}/S_N$, if you wish (with the extra fermionic degrees of freedom, too). The permutations now guarantee not only the indistinguishability of strings in the same vibration states but also the existence of strings with higher values of $p^+$ - it looks like your configuration II on the world volume if you wish (but the path in the spacetime is generic) - as well as the validity of the $L_0=\tilde L_0$ condition in the continuum limit, among other things. Interactions work as expected, too.
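The way permutations produce long strings can be sketched in a few lines. In the orbifold picture, going once around the cylinder may permute the $N$ eigenvalues; each cycle of length $n$ in that permutation glues $n$ eigenvalue strands into one string carrying $p^+ = n/R$. The function names and the sample permutation below are illustrative assumptions, not taken from the papers.

```python
from fractions import Fraction

def cycle_lengths(perm):
    """Cycle lengths of a permutation given as a list, where perm[i] is the
    strand that strand i turns into after going once around the cylinder."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        n, j = 0, start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            n += 1
        lengths.append(n)
    return sorted(lengths, reverse=True)

def string_momenta(perm, radius):
    """Longitudinal momentum p^+ = n / R for each cycle (string) of length n."""
    return [Fraction(n) / radius for n in cycle_lengths(perm)]

# Hypothetical N = 6 configuration: a 3-cycle, a 2-cycle and a fixed point.
perm = [1, 2, 0, 4, 3, 5]
lengths = cycle_lengths(perm)
momenta = string_momenta(perm, 2)
print(lengths, momenta)
```

The cycle lengths always add up to $N$, so the total $p^+ = N/R$ of the sector is shared among the strings; the identity permutation gives $N$ minimal strings and a single $N$-cycle gives one long string.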
The perturbative string theories always emerge in the light-cone gauge Green-Schwarz description. In the heterotic case, the $E_8$ groups arise from the fermionic representation of the $E_8$ current algebra: those extra fermions are fermions transforming in the fundamental representation of $U(N)$; sixteen of them per single Hořava-Witten boundary, i.e. per single $E_8$. The gauge group has to be changed to $O(N)$, and some degrees of freedom (originally Hermitian matrices) become symmetric real tensors of $O(N)$ while others are antisymmetric; see the paper below and its followups:
The main advantage of matrix string theory is that while it may be explicitly shown to agree with type IIA or HE string theory at the weak coupling, it provides one with the exact non-perturbative description at any value of the string coupling. In particular, one may see that when the coupling is sent to infinity, matrix string theory reduces to the original BFSS matrix model for M-theory in large 11 dimensions (with an $E_8$ domain wall, in the heterotic case).
Similar matrix models exist for type IIB in ten dimensions, too: one needs the maximally supersymmetric $2+1$-dimensional superconformal field theory which became relevant for the BLG construction (which later transmuted to the ABJM membrane minirevolution). The methods of matrix models become more complicated for backgrounds with additional compact dimensions - by compactifying spacetime dimensions (dimensional reduction), one needs to add dimensions to the matrix model ("dimensional oxidation") - and no matrix models are known if more than 5 transverse spacetime coordinates are compactified (which is why we can't define matrix models for phenomenologically interesting compactifications, at least as of 2011).
By the way, a long list of introductory literature about all kinds of string-theoretical topics, most recently updated in 2004, is here: