The divide is actually not between covalent and ionic, but rather a spectrum between localised and delocalised electrons. The history of all this is actually quite fascinating, and Phil Anderson in his book "More and Different" has a nice chapter on this. Essentially, around the time that people started doing quantum mechanics on molecules seriously, there were two schools of thinking which dominated.
On one side were Mott and, more popularly, Hund and Pauli, who thought of electrons as primarily attached to atoms; electromagnetic interactions then deform their motions/orbitals, and one gets molecules. This is the version usually taught in chemistry classes, since with a few rules of thumb it is possible to qualitatively account for a vast range of behaviours.
On the other side was Slater with a dream of a machine which could simply compute the electronic structure by giving it the atoms and electrons. In this picture, the electrons are primarily thought of as delocalised over all the atoms, and through a rigorous procedure of perturbation theory one adds the effect of interactions between electrons and may achieve arbitrarily good precision.
The latter has the problem that the results are not intuitive: there are no rules of thumb available and one is reduced to simply computing. The problem with the former is that to achieve high accuracy, the "rules of thumb" become exceedingly complex and are not easy to use or to compute with; it lacks the simple regularity of the Slater dream machine. It is telling that essentially the latter has won, and nowadays it is routine to compute the electronic structure of quite large molecules (~1000 atoms) through brute force (the technique is known as density functional theory, and commercial software is available to do it).
In finite molecules one can actually show that in principle both approaches will work; technically we speak of there being an adiabatic connection between the localised and delocalised states. The only practical difference is how hard it is to carry out the calculations. However, in infinite molecules (e.g. solid crystals) this is not true, and there can be a proper phase transition between the two starting points. In that case, the localised approach corresponds to what is fancily called these days "strongly correlated systems", such as Mott insulators and magnetically ordered materials, and the delocalised approach corresponds essentially to metals (technical language: the system renormalises to a Fermi liquid).
Nowadays there is a desire (from theoretical condensed matter physicists) to develop the localised approach again, as it may be possible to find some useful rules of thumb regarding magnetic materials, a prominent example being the high temperature superconductors.
In physical chemistry, this problem is usually treated in MO-LCAO theory.
What you do is assume that you can build the molecular orbitals of the molecule as linear combinations of the atomic orbitals of the atoms in the molecule (MO-LCAO stands for Molecular Orbitals - Linear Combination of Atomic Orbitals). Your atomic orbitals are therefore a mathematical basis set onto which you project (using some coefficients) your molecular orbitals.
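Written out, the expansion described above takes the following form. The symbols are chosen here for illustration and are not from the original answer: $\psi_i$ is the $i$-th molecular orbital, $\phi_\mu$ are the atomic orbitals of the basis set, and $c_{\mu i}$ are the coefficients to be determined.

$$ \psi_i = \sum_{\mu} c_{\mu i}\, \phi_\mu $$

With $N$ atomic orbitals in the basis there are $N$ independent sets of coefficients, and hence $N$ molecular orbitals.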
The problem is further simplified if you consider that the atomic orbitals that combine should behave in the same way under the symmetry operations possible for that molecule (that is, the combining orbitals should transform according to the same irreducible representation of the molecule's point group, so that their linear combinations do too). You can therefore build the SALCs (Symmetry Adapted Linear Combinations), linear combinations of atomic orbitals that each transform as a single irreducible representation of the point group, and use them as a more powerful mathematical basis set for the molecular orbitals.
Having said this, you can calculate the coefficients of the linear combination and the energy of each molecular orbital. What you get is a certain number of levels (the same number as the atomic orbitals considered in your basis set) ordered by their energy. You can now distinguish between three types of molecular orbitals:
bonding, the atomic orbitals constructively interfere in the region between the two atoms;
antibonding, the atomic orbitals destructively interfere in the region between the two atoms;
non bonding, the molecular orbital is almost identical to one atomic orbital (the coefficient of a certain atomic orbital is way greater than the others).
You can distinguish (at a very basic level) between them by representing the atomic orbitals involved and their sign in the region between the atoms: if they have the same sign, they are bonding, else they are antibonding. (Please note that by doing this I forget about the magnitude of the coefficient, that should be relevant in most cases.)
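As a minimal sketch of the classification above, consider the simplest possible case: two identical atomic orbitals, one on each atom. The energies `alpha` and `beta` below are hypothetical illustrative inputs, overlap between the atomic orbitals is neglected, and the 2x2 problem is solved in closed form rather than with a general eigenvalue routine. None of these choices come from the original answer.

```python
import math

def two_orbital_lcao(alpha, beta):
    """Solve H = [[alpha, beta], [beta, alpha]] in closed form.

    alpha: energy of each (identical) atomic orbital (hypothetical input)
    beta:  coupling between the two orbitals (hypothetical input)
    Overlap between the atomic orbitals is neglected (an assumption).
    Returns a list of (energy, (c1, c2)), lowest energy first.
    """
    norm = 1.0 / math.sqrt(2.0)
    in_phase = (alpha + beta, (norm, norm))       # same sign on both atoms
    out_of_phase = (alpha - beta, (norm, -norm))  # opposite sign
    return sorted([in_phase, out_of_phase], key=lambda level: level[0])

def classify(coeffs):
    """Crude sign test from the text: same sign is bonding, else antibonding."""
    c1, c2 = coeffs
    return "bonding" if c1 * c2 > 0 else "antibonding"

# Illustrative numbers only; a negative beta makes the in-phase level the lower one.
levels = two_orbital_lcao(alpha=-13.6, beta=-3.0)
for energy, coeffs in levels:
    print(f"E = {energy:7.2f}  coefficients = {coeffs}  {classify(coeffs)}")
```

Two atomic orbitals give two molecular orbitals, and with a negative coupling the same-sign combination is the lower rung of the ladder. Like the rule of thumb in the text, the sign test ignores the magnitudes of the coefficients.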
Now you have a sort of "ladder" of molecular orbitals and you know whether each step is bonding or not. You can now place the electrons (the same number as the sum of the electrons that were in the atomic orbitals you used in your basis set) as you did for isolated atoms: from bottom to top, two electrons in each level, with antiparallel spins, and so on (the same rules apply if you have several levels at the same energy).
You can now go back to a classical chemistry framework using the so called bond order:
$$ \mathrm{BO} = \frac{1}{2}\left( n - n^* \right) $$
where $n$ is the number of electrons in bonding orbitals and $n^*$ is the number of electrons in antibonding orbitals (electrons in non bonding orbitals do not count). The bond order tells us (if it is an integer) how many bonds we would draw in a classical picture, thus going back to the concept of the octet rule.
In fact, consider the valence shell of oxygen. It is made up of the atomic orbitals $2s$, $2p_x$, $2p_y$, $2p_z$ and it contains six electrons. By combining these orbitals from the two atoms (and ignoring the interaction between $2s$ and $2p_z$, which is possible but only modifies the energies of these molecular orbitals) you get $4\times 2$ molecular orbitals (the superscript * means that they are antibonding).
The electrons for oxygen are black (red ones are added when considering the F$_2$ molecule).
There are four bonding molecular orbitals from a shell of this type, so they hold a total of eight bonding electrons. This is where the octet rule comes from, but this kind of reasoning tries to fit an empirical and wrong way of reasoning into a more powerful quantum framework.
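The filling procedure and the bond order formula can be sketched together for this example. The level ordering below is the usual textbook one for O$_2$ and F$_2$ with $2s$-$2p_z$ mixing ignored. It is an assumption made here, since the original figure is not available, and the labels are illustrative.

```python
# Valence MO ladder for a homonuclear diatomic built from 2s and 2p orbitals.
# Each entry: (label, kind, number of degenerate orbitals), lowest energy first.
# The ordering is the common textbook one for O2 and F2 (an assumption here).
LADDER = [
    ("sigma_2s",  "bonding",     1),
    ("sigma*_2s", "antibonding", 1),
    ("sigma_2p",  "bonding",     1),
    ("pi_2p",     "bonding",     2),
    ("pi*_2p",    "antibonding", 2),
    ("sigma*_2p", "antibonding", 1),
]

def fill(ladder, n_electrons):
    """Fill the levels from the bottom, two electrons per orbital."""
    occupation = {}
    remaining = n_electrons
    for label, _kind, degeneracy in ladder:
        placed = min(remaining, 2 * degeneracy)
        occupation[label] = placed
        remaining -= placed
    if remaining > 0:
        raise ValueError("more electrons than the ladder can hold")
    return occupation

def bond_order(ladder, occupation):
    """BO = (n - n*) / 2, counting bonding and antibonding electrons."""
    n = sum(occupation[label] for label, kind, _ in ladder if kind == "bonding")
    n_star = sum(occupation[label] for label, kind, _ in ladder if kind == "antibonding")
    return (n - n_star) / 2

bo_o2 = bond_order(LADDER, fill(LADDER, 12))  # O2: 2 x 6 valence electrons
bo_f2 = bond_order(LADDER, fill(LADDER, 14))  # F2: 2 x 7 valence electrons
print("O2 bond order:", bo_o2)
print("F2 bond order:", bo_f2)
```

Both molecules have eight electrons in the four bonding orbitals; they differ only in how many antibonding electrons are subtracted.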
Please note that my answer is from a really introductory and basic point of view; things, starting from this, can become a lot more complicated.
Best Answer
Covalent bonds form when electrons attached to nearby nuclei can exist in a superposition state in which they are partly on another nucleus. This happens when the electron state they are mixing with is unfilled.
For example, in H2, when the two hydrogen nuclei are close, there is no electrostatic energy in the first approximation, because the electron and proton are both spherical electromagnetic sources. But when the atoms come close and the spins of the two electrons are opposite (this is required for binding), each electron spreads out to cover the other atom, overlapping with the other electron (this is allowed because they have opposite spin, and so do not feel Pauli exclusion), and this reduces the kinetic energy of the electrons.
The reason is simply that when you allow an electron to wander over a larger space, the kinetic energy always goes down. If you double the size of the space in one direction, the kinetic energy in that direction goes down by a factor of 4. If you consider the two H-atoms as two boxes, doubling the x-size of the box while keeping the y and z sizes the same reduces the kinetic energy from X+X+X to X/4+X+X, that is, to 3/4 of its original value. So the binding energy of two boxes placed end to end with non-interacting electrons is 1/4 of the kinetic energy.
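The box arithmetic can be checked with a few lines, as a sketch under the same assumptions as the text (non-interacting electrons, hard-walled rectangular boxes). Energies are in units of $\hbar^2\pi^2/(2mL^2)$, so the quantity called X in the text equals 1; the function name is chosen here for illustration.

```python
def box_ground_state_energy(lx, ly, lz):
    """Ground-state kinetic energy of a particle in a rectangular box.

    Side lengths are in units of the single-atom box size L, and the
    energy is in units of hbar^2 pi^2 / (2 m L^2), so a unit cube gives 3.
    """
    return 1.0 / lx**2 + 1.0 / ly**2 + 1.0 / lz**2

single = box_ground_state_energy(1.0, 1.0, 1.0)  # X + X + X
merged = box_ground_state_energy(2.0, 1.0, 1.0)  # X/4 + X + X

ratio = merged / single     # fraction of the kinetic energy that remains
gain_fraction = 1.0 - ratio  # fraction released as binding energy

print("single box:", single)
print("merged box:", merged)
print("remaining fraction:", ratio)
print("binding fraction:", gain_fraction)
```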
The kinetic energy of an electron in an H-atom is equal to the binding energy (this is the virial theorem: in a 1/r potential the kinetic energy cancels half the potential energy, and what is left is the binding energy), so you get 1/4 of about 13.6 eV, or roughly 3 eV, of binding energy from this. This is a terrible approximation, because the electrons repel each other and the H-atom is not a box, but it shows that allowing the electron volume to spread gains you a lot of energy on the atomic scale. It is then plausible that even with repulsion the electrons will bind, and they do.
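The virial-theorem step and the resulting estimate can be written out as follows. The hydrogen ground-state energy of about $-13.6$ eV is an input assumed here, and the factor 1/4 is taken from the box argument above.

$$ \langle T \rangle = -\tfrac{1}{2}\langle V \rangle, \qquad E = \langle T \rangle + \langle V \rangle = -\langle T \rangle $$

$$ E \approx -13.6\ \mathrm{eV} \;\Rightarrow\; \langle T \rangle \approx 13.6\ \mathrm{eV}, \qquad \tfrac{1}{4}\langle T \rangle \approx 3.4\ \mathrm{eV} $$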