Let me normalize the action as
$$S=\frac{k}{4\pi}\int\langle A\wedge dA + \frac{1}{3} A\wedge[A\wedge A]\rangle$$
for $\langle,\rangle$ being the Killing form. This coincides with your normalization for $SU(N)$.
Variation of the Chern-Simons action under a gauge transformation $g:M\rightarrow G$ is given by $$S\rightarrow S + \frac{k}{24\pi}\int_{g_*[M]} \langle\theta\wedge[\theta\wedge\theta]\rangle,$$ where $\theta\in\Omega^1(G;\mathfrak{g})$ is the Maurer-Cartan form (Proposition 2.3 in http://arxiv.org/abs/hep-th/9206021). The last term is also called the Wess-Zumino term. Therefore, $\exp(iS)$ is invariant if $$\frac{k}{24\pi}\int_{[C]} \langle\theta\wedge[\theta\wedge\theta]\rangle\in2\pi\mathbf{Z}$$ for $[C]$ the generator of $H_3(G;\mathbf{Z})$.
For $G=SO(N)$, the homology is generated by $SO(3)\subset SO(N)$, and that term can be computed as follows. As you say, $$\frac{1}{24\pi}\int_{SU(2)} \langle\theta\wedge[\theta\wedge\theta]\rangle=2\pi,$$ but $SU(2)\rightarrow SO(3)$ is a 2:1 local diffeomorphism, so
$$\frac{1}{24\pi}\int_{SO(3)} \langle\theta\wedge[\theta\wedge\theta]\rangle=\pi.$$
Therefore, the level $k$ in this case has to be even. See also appendix 15.A in the conformal field theory book by Di Francesco, Mathieu and Senechal.
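To spell out the last step, here is a short worked version. It assumes the gauge transformation pushes $[M]$ forward to $n$ times the generator, $g_*[M] = n\,[C]$ with $n\in\mathbf{Z}$ (the integer $n$ is notation introduced only for this check):
$$
\delta S = k\, n\cdot\frac{1}{24\pi}\int_{SO(3)} \langle\theta\wedge[\theta\wedge\theta]\rangle = \pi k n
\,,\qquad
e^{i\,\delta S} = (-1)^{k n}
\,.
$$
This equals 1 for every $n$ exactly when $k$ is even, whereas for $SU(2)$ the same computation gives $\delta S = 2\pi k n$ and places no parity condition on $k$.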
As you say yourself, indeed every connection on a bundle is locally given by a Lie algebra valued 1-form and in general only locally.
Let's say this in more detail: for $X$ any manifold, a $G$-principal connection on it is (in "Cech data"):
a choice of good open cover $\{U_i \to X\}$;
on each patch a 1-form $A_i \in \Omega^1(U_i)\otimes \mathfrak{g}$;
on each double intersection of patches a gauge transformation function $g_{i j} \in C^\infty(U_i \cap U_j, G)$
such that
on each double intersection $U_i \cap U_j$ we have the equation $A_j = g_{i j}^{-1} A_i g_{i j} + g_{i j}^{-1} \mathbf{d} g_{i j}$;
on each triple intersection $U_i \cap U_j \cap U_k$ we have the equation $g_{i j} g_{j k} = g_{i k}$.
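As a consistency check on these two conditions, one can compose the transition law across $U_i$, $U_j$ and $U_k$. This is only a sketch on a triple overlap, writing the first condition with $A_i$ on the right-hand side and using the Leibniz rule for $\mathbf{d}(g_{i j} g_{j k})$:
$$
A_k = g_{j k}^{-1} A_j\, g_{j k} + g_{j k}^{-1}\mathbf{d}g_{j k}
= (g_{i j} g_{j k})^{-1} A_i\, (g_{i j} g_{j k}) + (g_{i j} g_{j k})^{-1}\mathbf{d}(g_{i j} g_{j k})
\,.
$$
By the cocycle condition $g_{i j} g_{j k} = g_{i k}$ this is exactly the transition law from $U_i$ to $U_k$, so going through the intermediate patch $U_j$ gives nothing new.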
Okay, now you would like to form a Chern-Simons 3-form... something out of this. What you immediately get from the above data is a bunch of local differential 3-forms, one on each patch: $CS(A_i) \in \Omega^3(U_i)$.
To make these 3-forms glue together globally into what is called a 3-form connection, we need the evident data of higher gauge transformations:
on each patch we have the local 3-form $CS(A_i)$;
on each double intersection there should be a 2-form $B_{i j} \in \Omega^2(U_i \cap U_j)$ which gauge transforms the respective CS-3-forms into each other, by $CS(A_j) = CS(A_i) + \mathbf{d} B_{i j}$;
on each triple intersection there should be a 1-form $\alpha_{i j k} \in \Omega^1(U_i \cap U_j \cap U_k)$ which exhibits a second-order gauge transformation ("ghosts of ghosts"!) between the first-order gauge transformations, in that $B_{i j} + B_{j k} = B_{i k} + \mathbf{d} \alpha_{ i j k}$;
finally on each quadruple intersection there should be a smooth function $h_{i j k l} \in C^\infty(U_i \cap U_j \cap U_k \cap U_l, U(1))$ which gauge-of-gauge-of-gauge-transforms the gauge-of-gauge-transforms into each other, in that $\alpha_{i j k} + \alpha_{i k l} = \alpha_{j k l} + \alpha_{i j l} + h_{i j k l}^{-1}\mathbf{d}h_{i j k l}$.
That's the data that makes the local Chern-Simons 3-form into a globally well-defined 3-form field. (For instance the supergravity C-field is of this form, with some further twists and bells and whistles added, as we have discussed here).
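A quick check that these gluing conditions are compatible with one another, using only $\mathbf{d}\circ\mathbf{d} = 0$ and the conditions as stated above. Applying $\mathbf{d}$ to the triple-overlap condition and inserting the double-overlap condition gives
$$
\mathbf{d}B_{i j} + \mathbf{d}B_{j k} - \mathbf{d}B_{i k}
= \big(CS(A_j) - CS(A_i)\big) + \big(CS(A_k) - CS(A_j)\big) - \big(CS(A_k) - CS(A_i)\big)
= 0
= \mathbf{d}\mathbf{d}\alpha_{i j k}
\,.
$$
Similarly, applying $\mathbf{d}$ to the quadruple-overlap condition and inserting $\mathbf{d}\alpha_{i j k} = B_{i j} + B_{j k} - B_{i k}$, all the $B$ terms cancel in pairs, matching
$$
\mathbf{d}\big(h_{i j k l}^{-1}\mathbf{d}h_{i j k l}\big) = 0
\,,
$$
which holds because $h_{i j k l}$ takes values in the abelian group $U(1)$.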
In mathematical language one says that this kind of local gauge-of-gauge-of-gauge gluing data for global definition of higher form fields is a "degree-4 cocycle in Cech-Deligne cohomology". This is precisely the right data needed to have a well-defined 3-dimensional higher holonomy as is needed here for the definition, because the Chern-Simons action functional is nothing but the 3-dimensional higher holonomy of this 3-form connection.
If you can build it, that is. From the above it is not entirely obvious how to build the 3-form cocycle data $\{CS(A_i), B_{i j}, \alpha_{i j k}, h_{i j k l}\}$ from the given gauge field data $\{A_i, g_{i j}\}$.
But this can be done. This is what Cheeger-Simons differential characters were discovered for. An explicit construction that is very natural for the application to Chern-Simons theory we have given in
- Fiorenza, Schreiber, Stasheff, Cech cocycles for differential characteristic classes, Advances in Theoretical and Mathematical Physics, Volume 16 Issue 1 (2012) (arXiv:1011.4735, web)
Based on this we give a detailed introduction to and discussion of Chern-Simons action functionals for globally non-trivial situations like above in
- Fiorenza, Sati, Schreiber, A higher stacky perspective on Chern-Simons theory (arXiv:1301.2580, web)
That article gives the local formulas that apply generally, discusses the simplifications that occur when the 3-manifold can be assumed to be bounding, discusses what happens if not, and then explores various other properties of globally defined Chern-Simons theory, such as how to couple Wilson lines to the above story. If you just look at the first part, I think you should find what you need.
Edit: In the comments below, the question came up why a similar discussion is not also needed when writing down the Yang-Mills action functional, whose Lagrangian is the 4-form $\langle F_A \wedge \star F_A \rangle$ (where $\star$ is the Hodge star of a given metric (gravity) and $\langle -,-\rangle$ is an invariant polynomial, the "Killing form" or trace), or similarly the topological Yang-Mills action functional, whose Lagrangian is the 4-form $\langle F_A \wedge F_A \rangle$.
The reason is that these Lagrangians are built from curvatures evaluated in an invariant polynomial. The very invariance of these invariant polynomials under the adjoint action of the gauge group on their arguments ensures the following: if $\{U_i \to X\}$ is a good open cover of 4-dimensional space(-time) and the gauge field is given by the Cech-cocycle data $\{A_i, g_{i j}\}$ with respect to these local patches, then on double overlaps the two (topological or not) Yang-Mills Lagrangians coming from the two patches are already equal:
$$
\langle F_{A_i} \wedge F_{A_i}\rangle = \langle F_{A_j}\wedge F_{A_j}\rangle
\,.
$$
Hence if we write $\nabla = \{A_i, g_{i j}\}$ for the gauge field connection abstractly and denote the (topological) Yang-Mills Lagrangian globally by $\langle F_\nabla \wedge F_\nabla\rangle$, then this is already a globally defined 4-form. Mathematically, this statement is what is at the core of Chern-Weil theory.
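The equality on double overlaps used above can be checked in one line. As a sketch, assuming a matrix group and the gluing law $A_j = g_{i j}^{-1} A_i\, g_{i j} + g_{i j}^{-1}\mathbf{d}g_{i j}$, the curvature transforms by conjugation, and the invariant polynomial does not see that conjugation:
$$
F_{A_j} = g_{i j}^{-1} F_{A_i}\, g_{i j}
\quad\Rightarrow\quad
\langle F_{A_j} \wedge F_{A_j}\rangle
= \langle g_{i j}^{-1} F_{A_i}\, g_{i j} \wedge g_{i j}^{-1} F_{A_i}\, g_{i j}\rangle
= \langle F_{A_i} \wedge F_{A_i}\rangle
\,.
$$
The same argument goes through with the Hodge star inserted, since $\star$ acts only on the differential-form part and commutes with conjugation by $g_{i j}$.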
Notice that there is nevertheless an intricate relation to the story of the Chern-Simons functional. Namely the local Chern-Simons form $CS(A_i)$ has the special property (essentially by definition) that its differential is the topological Yang-Mills Lagrangian:
$$
\mathbf{d}CS(A_i) = \langle F_{A_i} \wedge F_{A_i}\rangle
\,.
$$
This means that, with the Chern-Simons Lagrangian regarded as a 3-form connection, the topological Yang-Mills Lagrangian is its curvature 4-form. Therefore the relation between the topological Yang-Mills Lagrangian 4-form and the Chern-Simons 3-form is precisely an analogue in higher gauge theory of the familiar relation two degrees down of how the electromagnetic potential 1-form -- which is not globally defined in general -- has a curvature 2-form that is globally well defined.
Mathematically this is why Chern-Simons functionals are called "secondary invariants".
Indeed, this is a bit more than just an analogy: the Chern-Simons 3-form is precisely a doubly higher analog of the electromagnetic field as we pass from the point, via the string, to the membrane.
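The relation $\mathbf{d}CS(A_i) = \langle F_{A_i} \wedge F_{A_i}\rangle$ stated above can be verified by hand in the simplest case. A sketch for $G = U(1)$, assuming the normalization $CS(A) = A \wedge \mathbf{d}A$ and $F_A = \mathbf{d}A$ (in the non-abelian case the Chern-Simons form carries an additional cubic term, which vanishes here):
$$
\mathbf{d}CS(A) = \mathbf{d}(A \wedge \mathbf{d}A) = \mathbf{d}A \wedge \mathbf{d}A - A \wedge \mathbf{d}\mathbf{d}A = F_A \wedge F_A
\,.
$$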
I have some lecture notes with more along these lines at nLab:twisted smooth cohomology in string theory.
First, the full paper is here:
Second, the paper has 150 citations. See all this information at INSPIRE (updated SPIRES):
Third, the text between 3.4 and 3.5 looks totally comprehensible. At that point, they are able to define $n\cdot S$ modulo 1, which is equivalent to defining the action $S$ modulo $1/n$. The goal is to define the action $S$ itself modulo 1; I suppose that their normalization of the path integral has to have $\exp(2\pi i S)$ with the atypical $2\pi$ factor. Yes, confirmed, it's equation 1.2.
If you shift the action by 1 (or by $2\pi$ in the ordinary conventions), the integrand of the path integral doesn't change, so the physics doesn't change. So quite generally, if one is able to say that the action $S$ is equal to $S_0+n$ (or $S_0+2\pi n$ normally) for some integer $n$, one knows everything about the action that the physics needs; shifting it by an integer doesn't change anything. That's why, in fact, the action is often defined modulo 1 only (up to the addition of an integer).
So it's enough to know the "fractional part" of the action; the integer part is irrelevant. However, at the point of the equation 3.4, their uncertainty is larger than that: they only know the action modulo $1/n$. For example, if the action is $9.37$ modulo $1/2$, it means that the fractional part may be $0.37$ but it may also be $0.87$. These two values of $S$ would change the physics because the contribution of the configuration to the path integral changes the sign if one changes $S$ by $1/2$ (in normal conventions, by $\pi$).
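In formulas, with the weight $\exp(2\pi i S)$ of the convention used here:
$$
e^{2\pi i (S + 1)} = e^{2\pi i S}
\,,\qquad
e^{2\pi i (S + 1/2)} = e^{i\pi}\, e^{2\pi i S} = - e^{2\pi i S}
\,.
$$
So in the example above, $S = 9.37$ and $S = 0.37$ give the same weight, while $S = 9.37$ and $S = 9.87$ give weights of opposite sign.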
If one only knows $S$ modulo $1/n$, and one's candidate value is $S_0$ (in this case, the $F\wedge F$ expression), it means that the real action is $$ S = S_0 + K/n $$ and the integer $K$ has to be determined. Because changing the action $S$ by an integer doesn't change the physics, it doesn't matter if $K$ in the equation above is changed by a multiple of $n$. So the goal is to find the right $K$ to define the action, where $K$ is an unknown integer defined (or relevant) modulo $n$, i.e. up to the addition of an irrelevant and arbitrary multiple of $n$.
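The statement that only $K$ modulo $n$ matters is a one-line check, again with the weight $\exp(2\pi i S)$:
$$
K \to K + n
\,:\qquad
S = S_0 + \frac{K}{n} \;\to\; S_0 + \frac{K + n}{n} = S + 1
\,,\qquad
e^{2\pi i (S + 1)} = e^{2\pi i S}
\,.
$$
So there are exactly $n$ physically distinct candidates, labelled by $K = 0, 1, \dots, n-1$, and fixing the action means picking one of them.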
At some point, they find the right answer and it is $$ K = -\langle \gamma^*(\omega),B\rangle $$ which removes the ambiguity of $S$ - the missing knowledge whether $S$ should be the original $S$ or higher or smaller by a particular multiple of $1/n$. If you don't understand the text above, then apologies, I have no way to find out why, so I can't give you a better answer unless you improve your question.