First, I never liked working with principal bundles; vector bundles seem easier and more natural to me. Second, I never liked thinking about abstract principal $G$-bundles. I prefer fixing a representation of $G$ and viewing the principal $G$-bundle as a reduced frame bundle associated with a vector bundle.
So let $E$ be a rank $k$ vector bundle and $F$ the bundle of arbitrary frames in $E$ (this is a principal $GL(k)$-bundle). Then $GL(k)$ acts on the right on $F$. Given a subgroup $G$ in $GL(k)$, let $F_G$ be a subbundle of $F$ such that if $f \in F_G$, then so is $f\cdot g$ for each $g \in G$.
The primary example is $E = T_*M$, with $F_G$ the bundle of orthonormal bases of the tangent spaces with respect to a Riemannian metric (so $G = O(k)$).
What is the critical property we want a $G$-connection to satisfy? Well, any connection allows you to parallel translate an arbitrary frame $f \in F$ along a curve. We'd like the $G$-connection to be such that if $f \in F_G$, then the parallel translation remains in $F_G$. This leads to the right definition of a $G$-connection.
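Here is a minimal sketch of what that condition amounts to in the primary example. The symbols $\nabla$, $e_i$, and $\theta^i_j$ are notation introduced here for illustration, and I am assuming that $G$ is a closed Lie subgroup with Lie algebra $\mathfrak{g}\subset\mathfrak{gl}(k)$. Let $e = (e_1,\dots,e_k)$ be a local section of $F_G$ and write $\nabla e_j = \sum_i e_i\,\theta^i_j$ for a connection $\nabla$ on $E$. Parallel translation keeps frames in $F_G$ exactly when the matrix of $1$-forms $\theta = (\theta^i_j)$ takes values in $\mathfrak{g}$. For orthonormal frames, preserving $F_G$ means that $\nabla$ is compatible with the metric, so differentiating $\langle e_i, e_j\rangle = \delta_{ij}$ gives
$$0 = d\langle e_i, e_j\rangle = \langle \nabla e_i, e_j\rangle + \langle e_i, \nabla e_j\rangle = \theta^j_i + \theta^i_j,$$
i.e., $\theta$ is skew-symmetric, which is the statement that it takes values in $\mathfrak{o}(k)$.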
Your confusion is revealed in this sentence: "Or one defines something called the exterior covariant derivative D (see wiki) and then the curvature is simply the exterior covariant derivative of the connection one-form." This is just not true; one does not take the 'exterior covariant derivative of the connection $1$-form' to get the curvature.
Let's be precise: Let $P\to M$ be a principal right $G$-bundle, and let $\omega$ be a $\frak{g}$-valued $1$-form on $P$ that defines a connection on $P$ (I won't repeat the well-known requirements on $\omega$). The curvature $2$-form $\Omega = d\omega +\frac12[\omega,\omega]$ on $P$ is the $2$-form that vanishes if and only if it is possible to find local trivializations $\tau: P_U \to U\times G$ such that $\omega = (\pi_2\circ\tau)^*(\gamma)$ where $\gamma$ is the canonical left-invariant $1$-form on $G$.
Now, given a representation $\rho:G\to \text{GL}(V)$, where $V$ is a vector space, one can define an associated vector bundle $E = P\times_\rho V$. Using $\omega$, it is possible to define an 'exterior covariant derivative operator' $D_\omega:\Gamma(E\otimes A^p)\to \Gamma(E\otimes A^{p+1})$ (where $A^p\to M$ is the bundle of alternating (i.e., 'exterior') $p$-forms on $M$). The operator ${D_\omega}^2:\Gamma(E\otimes A^p)\to \Gamma(E\otimes A^{p+2})$ then turns out to be linear over the $C^\infty$ alternating forms, so it is determined by its value when $p=0$, i.e., by ${D_\omega}^2:\Gamma(E)\to \Gamma(E\otimes A^{2})$, which can be regarded as a section of $\text{End}(E)\otimes A^2$, i.e., a $2$-form with values in $\text{End}(E)$. The formula for this section, when pulled back to $P$, can now be expressed in terms of $\Omega$ in the usual way. In particular, ${D_\omega}^2$ vanishes identically if $\Omega$ does.
Note that one does not take the 'exterior covariant derivative' of the $1$-form $\omega$ anywhere. Instead, one takes the exterior covariant derivative of the exterior covariant derivative of a section of $E$.
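As a sketch of that computation in a local trivialization (the local section $\sigma:U\to P$ and the symbol $A$ are notation introduced here, not part of the setup above): write $A = \rho_*(\sigma^*\omega)$, an $\text{End}(V)$-valued $1$-form on $U\subset M$, and assume that $D_\omega$ acts on $V$-valued forms over $U$ by $D_\omega s = ds + A\wedge s$. Then, for a $V$-valued function $s$,
$$\begin{aligned} D_\omega(D_\omega s) &= d(ds + A\,s) + A\wedge(ds + A\,s)\\ &= dA\,s - A\wedge ds + A\wedge ds + (A\wedge A)\,s\\ &= (dA + A\wedge A)\,s. \end{aligned}$$
Since $\rho_*$ is a Lie algebra homomorphism and $\frac12[\omega,\omega](X,Y) = [\omega(X),\omega(Y)]$, one has $dA + A\wedge A = \rho_*(\sigma^*\Omega)$. The derivatives of $s$ cancel, which is why ${D_\omega}^2$ is given by a $2$-form with values in $\text{End}(E)$.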
I suspect that what you may be trying to do is interpret $\omega$ as a $1$-form on $P$ with values in the trivial bundle $P\times\frak{g}$ and then say that the curvature is the 'exterior covariant derivative' of $\omega$. However, to make this work, you have to specify a connection on the trivial bundle $P\times\frak{g}$, which amounts to choosing a $1$-form $\eta$ on $P$ that takes values in $\text{End}(\frak{g})$ and setting $D_\eta(s) = ds + \eta\ s$. By setting $\eta = \frac12\text{ad}(\omega)$, one gets $D_\eta\omega = \Omega$ (just by definition), so it is possible to do this, but I don't think that this is that useful an observation, since, after all, you could have taken the connection on the trivial bundle $P\times\frak{g}$ to be $\eta = \frac13\text{ad}(\omega)$ (for example) or even $\eta=0$. What justifies the $\frac12$, other than the desire to get the 'right' answer?
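To see how the coefficient enters, here is a short worked computation. The constant $c$ is introduced here for illustration, and I am assuming $D_\eta$ is extended to $\frak{g}$-valued $1$-forms by $D_\eta\omega = d\omega + \eta\wedge\omega$. Taking $\eta = c\,\text{ad}(\omega)$ and vector fields $X$, $Y$ on $P$,
$$\begin{aligned} (D_\eta\omega)(X,Y) &= d\omega(X,Y) + \eta(X)\,\omega(Y) - \eta(Y)\,\omega(X)\\ &= d\omega(X,Y) + c\,[\omega(X),\omega(Y)] - c\,[\omega(Y),\omega(X)]\\ &= d\omega(X,Y) + 2c\,[\omega(X),\omega(Y)]. \end{aligned}$$
With $[\omega,\omega](X,Y) = 2[\omega(X),\omega(Y)]$, this says $D_\eta\omega = d\omega + c\,[\omega,\omega]$. So $c=\frac12$ gives $\Omega$, $c=\frac13$ gives $d\omega+\frac13[\omega,\omega]$, and $c=0$ gives $d\omega$. Nothing in the construction itself singles out $c=\frac12$.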
There's something about the notation you should know before you get confused when trying to do non-abelian gauge theory. The second term in the field strength should involve a combination of the wedge product of forms and the Lie bracket: the field strength (in the case of an arbitrary gauge group $G$ with Lie algebra $\mathfrak{g}$) should actually be $$F = dA + \tfrac{1}{2}[A \wedge A],$$ where if $\omega$ is a $\mathfrak{g}$-valued $k$-form and $\eta$ is a $\mathfrak{g}$-valued $p$-form, then $$[\omega \wedge \eta](X_1, \dots, X_{k+p}) = \sum_{\sigma \in S_{k+p}} \text{sgn}(\sigma)\, [\omega(X_{\sigma(1)}, \dots, X_{\sigma(k)}), \eta(X_{\sigma(k+1)}, \dots, X_{\sigma(k+p)})]$$ for any $k + p$ vector fields $X_1, \dots, X_{k+p}$. In particular, if $A$ is a $\mathfrak{g}$-valued $1$-form, then $$[A \wedge A](X_1, X_2) = [A(X_1), A(X_2)] - [A(X_2), A(X_1)] = 2[A(X_1), A(X_2)].$$ So in components, the field strength is given by $$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu],$$ which is the form you'll see most frequently in the physics literature.
When the gauge group $G$ is abelian (e.g. in ${\rm U}(1)$ gauge theory), the Lie bracket on $\mathfrak{g}$ is trivial so that $[A \wedge A] \equiv 0$ and the field strength is just the exterior derivative of the gauge potential: $F = dA$.
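If you want to check the identity $[A \wedge A](X_1, X_2) = 2[A(X_1), A(X_2)]$ and the abelian case numerically, here is a minimal sketch in plain Python. Everything in it is an illustrative assumption rather than part of the answer above: the helper names (`mat_mul`, `mat_lin`, `bracket`, `sgn`, `one_form`, `wedge_bracket`), the choice of two $\mathfrak{su}(2)$ generators as the components of $A$ at a single point, and the sample vectors. It only evaluates the algebraic term at one point; it does not compute $dA$.

```python
from itertools import permutations

def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_lin(c1, a, c2, b):
    """Entrywise linear combination c1*a + c2*b."""
    return [[c1 * x + c2 * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def bracket(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    return mat_lin(1, mat_mul(a, b), -1, mat_mul(b, a))

def sgn(perm):
    """Sign (+1 or -1) of a permutation, found by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def one_form(components):
    """Return the map X -> sum_mu X^mu A_mu for matrix components A_mu."""
    def A(X):
        n = len(components[0])
        total = [[0] * n for _ in range(n)]
        for x, a_mu in zip(X, components):
            total = mat_lin(1, total, x, a_mu)
        return total
    return A

def wedge_bracket(A, X1, X2):
    """[A ^ A](X1, X2) as a signed sum over the permutations in S_2."""
    vectors = (X1, X2)
    n = len(A(X1))
    total = [[0] * n for _ in range(n)]
    for perm in permutations(range(2)):
        term = bracket(A(vectors[perm[0]]), A(vectors[perm[1]]))
        total = mat_lin(1, total, sgn(perm), term)
    return total

# Non-abelian case: two su(2) generators as components (illustrative choice).
T1 = [[0, 1j], [1j, 0]]
T2 = [[0, 1], [-1, 0]]
A = one_form([T1, T2])
X1, X2 = (1, 2), (3, -1)
lhs = wedge_bracket(A, X1, X2)
rhs = [[2 * x for x in row] for row in bracket(A(X1), A(X2))]
print(lhs == rhs)

# Abelian case: u(1), i.e. 1x1 anti-Hermitian matrices, so the bracket vanishes.
B = one_form([[[2j]], [[-5j]]])
print(wedge_bracket(B, X1, X2) == [[0]])
```

The entries are small integers and Gaussian integers, so the arithmetic is exact and the comparisons with `==` are safe here; with generic floating-point components you would compare up to a tolerance instead.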