The answer of David Aasen is correct, but let me add some comments that connect to your question about the relation between the $\mathbb Z_2$ invariant $\nu$ and the first Chern number $C_1$.
Such a relation does not exist unless you require some extra symmetry beyond the generic symmetries usually required in the classification of topological insulators (such as time-reversal invariance in this case). Say the Hamiltonian is invariant under spin rotations about the $z$-axis (so a $U(1)$ subgroup of $SU(2)$ is left invariant); then the Hamiltonian can be block-diagonalized as
$H = \begin{pmatrix}
H_\uparrow & \\ & H_\downarrow
\end{pmatrix},
$
where the indices refer to spin-up and spin-down degrees of freedom. Due to time-reversal symmetry we have $H_\downarrow(k) = H^*_\uparrow(-k)$. The system now consists of two copies of a quantum Hall system with counter-propagating edge states of opposite spin. As David Aasen says, the Chern number is zero, $C_1 = C_1^\uparrow + C_1^\downarrow = 0$. The difference, however, the "spin Chern number" $C_1^\uparrow - C_1^\downarrow = 2C_{spin}$, can be non-zero and can be calculated from the Chern numbers of the spin-up and spin-down sectors. As long as $S_z$ is preserved, the spin Chern number can be any integer, $C_{spin}\in\mathbb Z$.
But if we add off-diagonal elements, and thus break the rotation symmetry about $z$, the invariant breaks down to $\nu = C_{spin}\,\text{mod}\,2\in\mathbb Z_2$ (as was shown by Kane and Mele). So topologically trivial and non-trivial phases are characterized by even and odd spin Chern numbers $C_{spin}$, not by the original Chern number $C_1$. This identification, however, only makes sense when you have this extra symmetry.
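The statements above can be checked numerically in a toy model. What follows is only a minimal sketch under my own assumptions, not something taken from the papers mentioned: I pick the two-band lattice model $H_\uparrow(k) = \sin k_x\,\sigma_x + \sin k_y\,\sigma_y + (m + \cos k_x + \cos k_y)\,\sigma_z$ as an illustrative spin-up block, build $H_\downarrow(k) = H^*_\uparrow(-k)$ from it, and compute the lattice Chern number of the lower band from the phases of products of overlaps around each plaquette of a discretized BZ. The function names and the grid size are hypothetical choices, and the overall sign of each Chern number depends on conventions, so only sign-independent statements are meaningful here.

```python
import cmath
import math


def d_up(kx, ky, m):
    """d-vector of the illustrative (assumed) spin-up block H_up(k) = d(k) . sigma."""
    return (math.sin(kx), math.sin(ky), m + math.cos(kx) + math.cos(ky))


def d_down(kx, ky, m):
    """d-vector of H_down(k) = H_up(-k)^*; complex conjugation flips sigma_y."""
    dx, dy, dz = d_up(-kx, -ky, m)
    return (dx, -dy, dz)


def lower_state(d):
    """Unnormalized eigenvector of d . sigma with eigenvalue -|d|.

    Two equivalent forms are used so that the returned vector never vanishes.
    """
    dx, dy, dz = d
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    if dz <= 0:
        return (complex(dz - norm), complex(dx, dy))
    return (complex(-dx, dy), complex(dz + norm))


def overlap(a, b):
    """Inner product <a|b> of two 2-component states."""
    return a[0].conjugate() * b[0] + a[1].conjugate() * b[1]


def chern_number(d_func, m, n=20):
    """Lattice Chern number of the lower band on an n x n grid of the BZ torus."""
    step = 2 * math.pi / n
    u = [[lower_state(d_func(i * step, j * step, m)) for j in range(n)]
         for i in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            # Periodic indices implement the torus: the grid has no boundary.
            a, b = u[i][j], u[(i + 1) % n][j]
            c, d = u[(i + 1) % n][(j + 1) % n], u[i][(j + 1) % n]
            # Each state enters once as a bra and once as a ket, so the phase
            # of this product does not depend on the gauge of any single state.
            loop = overlap(a, b) * overlap(b, c) * overlap(c, d) * overlap(d, a)
            total += cmath.phase(loop)
    return round(total / (2 * math.pi))


c_up = chern_number(d_up, m=1.0)
c_down = chern_number(d_down, m=1.0)
c_total = c_up + c_down          # ordinary Chern number C_1
c_spin = (c_up - c_down) // 2    # spin Chern number
nu = c_spin % 2                  # Z_2 invariant, valid while S_z is conserved
print(c_up, c_down, c_total, c_spin, nu)
```

With $m = 1$ the two blocks carry opposite Chern numbers of magnitude one, so $C_1 = 0$ while $C_{spin}$ is odd. Changing to $m = 3$ makes both blocks trivial.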
Sorry, this answer got too long. I have organized it into three points.
(1)
I think the reason Kohmoto stresses the importance of the Brillouin zone being a torus, $BZ = T^2$, is that he wants to say that the BZ is compact and has no boundary. This is important because of a subtlety that makes everything work. The Hall conductance is given by $\sigma_{xy} = -\frac{e^2}h C_1$ (eq. 4.9), where the first Chern number is (eq. 4.8)
$C_1 = \frac i{2\pi}\int_{BZ} F = \frac i{2\pi}\int_{BZ} dA$.
However, naively using Stokes' theorem, $\int_M dA = \int_{\partial M} A$, where $\partial M$ is the boundary of $M$, leads to a puzzle. Since $BZ = T^2$ and the torus has no boundary, $\partial T^2 = \emptyset$, this seems to imply that $\int_{BZ} dA = \int_{\partial BZ} A = 0$ and thus $\sigma_{xy}=0$. There is, however, an important subtlety here: our use of Stokes' theorem is only correct if $A$ can be constructed globally on all of $BZ$, and this cannot be done in general. One has to split the $BZ$ torus into smaller patches and construct $A$ locally on each patch; the patches now do have boundaries (see figure 1). The mismatch between the values of the $A$'s on the boundaries of the patches is what makes $\sigma_{xy}$ non-zero (see eq. 3.13).
In terms of de Rham cohomology one can say that $F$ belongs to a non-trivial second cohomology class of the torus; in other words, the equation $F = dA$ is only true locally, not globally. That is why our use of Stokes' theorem was wrong.
In this case, you can actually replace the torus with a sphere with no problem (why that is requires some arguments from algebraic topology, but I will shortly give a more physical picture of this). In higher dimensions and in other types of topological insulators there can be a difference between taking $BZ$ to be a torus or a sphere. With the sphere you only get what people call strong topological insulators, while with the torus you also get the so-called weak topological insulators. The weak topological insulators correspond to stacks of lower-dimensional systems and exist only if there is translational symmetry; in other words, they are NOT robust against impurities and disorder. People therefore usually pretend $BZ$ is a sphere, since the strong topological insulators are the most interesting anyway. For example, the table for the K-theoretic classification of topological insulators that people usually show (see table I here) corresponds to using the sphere instead of the torus; otherwise the table would be full of less interesting states.
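To make the patch argument concrete, here is a sketch of the standard two-patch computation. The notation is my own assumption, chosen to be consistent with $C_1 = \frac i{2\pi}\int_{BZ} F$: the connection is $A = \langle u|d|u\rangle$, the torus is covered by two patches $H_I$ and $H_{II}$ that share a common boundary, and on that boundary the Bloch states of the two patches differ by a phase, $|u_{II}\rangle = e^{i\chi}|u_I\rangle$. This gives $A_{II} = A_I + i\,d\chi$ on the boundary, and applying Stokes' theorem on each patch separately yields
$C_1 = \frac i{2\pi}\left(\int_{H_I} dA_I + \int_{H_{II}} dA_{II}\right) = \frac i{2\pi}\oint_{\partial H_I} (A_I - A_{II}) = \frac 1{2\pi}\oint_{\partial H_I} d\chi = n\in\mathbb Z$.
The relative minus sign appears because the common boundary is traversed with opposite orientations when seen from $H_I$ and from $H_{II}$. The result is the winding number of the phase $e^{i\chi}$ around the boundary, which is an integer because $e^{i\chi}$ must be single-valued. It vanishes exactly when a single smooth choice of phase works on the whole torus.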
Let me briefly give you some physical intuition about what $\sigma_{xy}$ measures by making an analogy to electromagnetism. In a less differential geometric notation, one can write (eq. 3.9)
$C_1 = \frac i{2\pi}\oint_M \mathbf B\cdot d\mathbf S$,
where $\mathbf B = \nabla_k\times \mathbf A$ can be thought of as a magnetic field in k-space. This is nothing but a magnetic version of the Gauss law, and it measures the total magnetic flux through the closed surface $M$. In other words, it measures the total magnetic charge enclosed by the surface $M$ (see also here). Take $M=S^2$, the sphere. If $C_1 = n$ is non-zero, that means that there are magnetic monopoles inside the sphere with total charge $n$. In conventional electromagnetism $C_1$ is always zero, since we assume there are no magnetic monopoles! This is the content of the Gauss law for magnetism, which in differential form is $\nabla\cdot\mathbf B = 0$. The analogous equation for our k-space "magnetic field" would be $\nabla\cdot\mathbf B = \rho_m$, where $\rho_m$ is the magnetic charge density (see here). If $M=BZ=T^2$ the intuition is the same: $C_1$ is the total magnetic charge inside the torus.
Another way to say the above is that the equation $\mathbf B = \nabla\times\mathbf A$, which we always use and love, is only correct globally if there are no magnetic monopoles around!
(2)
Now let me address the next point, about the Gauss-Bonnet theorem. Actually, the Gauss-Bonnet theorem does not play any role here; it is just an analogy. For a two-dimensional manifold $M$ with no boundary, the theorem says that $\int_M K dA = 2\pi (2-2g)$. Here $K$ is the Gauss curvature and $g$ is the genus. For example, for the torus $g=1$ and the integral is zero, as you also mention. This is not the same as $C_1$, however. The Gauss-Bonnet theorem is about the topology of the manifold (for example the $BZ$ torus), but $\sigma_{xy}$ is related to the topology of the fiber bundle over the torus, not of the torus itself. In other words, it is related to how the Bloch wavefunctions behave globally. What plays a role for us is Chern-Weil theory, which is in a sense a generalization of the Gauss-Bonnet theorem. The magnetic field $\mathbf B$, or equivalently the field strength $F$, is geometrically the curvature of a so-called $U(1)$ bundle over $BZ$. Chern-Weil theory says that the integral over the curvature
$C_1 = \frac i{2\pi}\int_{BZ} F$
is a topological invariant of the $U(1)$ bundle. This is analogous to Gauss-Bonnet, which says that the integral over the curvature is a topological invariant of the manifold. Thus this connection is mainly an analogy people use to give a little intuition about $C_1$, since it is easier to visualize the curvature $K$ than the more abstract curvature $F$.
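If you want to see the Gauss-Bonnet side of the analogy at work, here is a small numerical sketch. The radii, the grid size and the function names are my own illustrative choices. It integrates the Gauss curvature over a torus of revolution and over a round sphere with a midpoint rule, using the standard formulas $K = \cos\theta / \big(r(R + r\cos\theta)\big)$, $dA = r(R + r\cos\theta)\,d\theta\,d\phi$ for the torus and $K = 1/R^2$, $dA = R^2\sin\theta\,d\theta\,d\phi$ for the sphere.

```python
import math


def total_curvature_torus(big_r=2.0, small_r=1.0, n=200):
    """Midpoint-rule integral of K dA over a torus of revolution (big_r > small_r)."""
    d_theta = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        gauss_k = math.cos(theta) / (small_r * (big_r + small_r * math.cos(theta)))
        area_density = small_r * (big_r + small_r * math.cos(theta))
        # The integrand does not depend on phi, so that integral gives 2*pi.
        total += gauss_k * area_density * d_theta * 2 * math.pi
    return total


def total_curvature_sphere(radius=1.5, n=200):
    """Midpoint-rule integral of K dA over a round sphere."""
    d_theta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        gauss_k = 1.0 / radius ** 2
        area_density = radius ** 2 * math.sin(theta)
        total += gauss_k * area_density * d_theta * 2 * math.pi
    return total


torus_integral = total_curvature_torus()
sphere_integral = total_curvature_sphere()
# Gauss-Bonnet: the integral equals 2*pi*(2 - 2g), with g = 1 (torus), g = 0 (sphere).
print(torus_integral / (2 * math.pi), sphere_integral / (2 * math.pi))
```

Changing the radii does not change the results beyond discretization error, because they depend only on the genus. This is exactly the sense in which Gauss-Bonnet knows about the manifold alone: the torus always gives zero, whereas $C_1$ of a $U(1)$ bundle over that same torus can be any integer.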
(3)
The comment of Xiao-Gang Wen is correct, and explaining it requires going into certain deep issues about what topological order is, what a topological insulator is, and what the relation between them is. The distinction between these two notions is very important, and there is a lot of misuse of terminology in the literature where the two are mixed together. The short answer is that both notions are related to topology, but topological order is a much deeper and richer class of states of matter, and topology (and quantum entanglement) plays a much bigger role there than in topological insulators. In other words, topological order is topological in a very strong sense, while a topological insulator is topological in a very weak sense.
If you are very interested, I can post another answer with more details on the comment of Xiao-Gang Wen since this one is already too big.
Best Answer
The 'topological' in topological order means 'robust against ANY local perturbations'.
According to such a definition, a topological insulator is not 'topological', since its properties are not robust against ANY local perturbations, such as perturbations that break the U(1) and time-reversal symmetry. So a more proper name for a topological insulator is 'U(1) and time-reversal symmetry protected insulator', which is one example of SPT order.
Some examples of topologically ordered states (in the sense of 'robust against ANY local perturbations'):
1) $\nu=\frac{1}{3}$ FQH state
2) $Z_2$ spin liquid state
3) $\nu=1$ IQH state
4) $E_8$ bosonic QH state
Examples 3) and 4) have no non-trivial topological quasi-particles (i.e. no non-trivial statistics and no non-trivial topological degeneracy), but have gapless edge states that are 'robust against ANY local perturbations'.
-- Edit -- (I lifted some discussions below to here):
There are two kinds of topology in math. The "topology" in "topological order" is directly related to the first kind of topology in mathematics, as in algebraic topology, homology, cohomology, tensor category. The "topology" in "topological order" is different from the "topology" in "topological insulator". The "topology" in "topological insulator" is related to the second kind of topology in mathematics, as in mapping class, homotopy, K-theory, etc. The first kind of topology is algebraic, while the second kind of topology is related to the continuous manifold of finite dimensions. We may also say that the first kind of topology is "quantum", while the second kind of topology is "classical".
The correct way to describe any gapped phase (such as topological orders and topological insulators) is to use the first kind of topology -- "quantum" topology, because gapped phases are usually interacting. The second kind of topology -- the "classical" topology -- can be used to describe one-body physics (including free fermion systems). The "classical" topology cannot be used to describe interacting many-body systems, which need "quantum" topology.
One needs to go beyond the "filling energy levels" picture to understand topological order (the first kind of topology). Our education in traditional condensed matter physics (or traditional many-body physics) is almost all about "filling energy levels" (such as Landau Fermi liquid theory, band theory, etc.), which is a trap that limits our imagination. The second kind of topology (the "topology" in "topological insulator") can be understood within the framework of the "filling energy levels" picture.
To answer the question "What are the geometric properties of states with topological order from which we could deduce the topological order with some kind of Chern number (but without starting from a Chern-Simons field theory and putting in the right one by hand ;) )? Is there anything like this?": I like to say that topological order is algebraic, not geometric. So the topological invariants of topological order are very different from Chern numbers. The robust ground state degeneracy and the robust non-Abelian geometric phases of the degenerate ground states are the topological invariants of topological order (which are the analogues of the Chern number).