The Poynting vector itself represents the energy flux density of an electromagnetic field. In other words, its magnitude is the energy per unit area per unit time carried by the field at a particular location. Therefore, it has units of W/m^2. If you integrate its divergence over a surface, you obtain the total power radiated through that surface, which of course has units of W.
As for the issue of adding the curl, note that it is a basic vector-calculus identity that the curl of any vector field has zero divergence. Therefore, it must contribute nothing to the integral of the divergence of the Poynting vector. However, we must be very careful in reading your Wikipedia article. The article says that
adding a field to [the Poynting vector] which has zero divergence will result in a field which satisfies this required property of a Poynting vector field according to Poynting's theorem.
It does not, however, say that the field that you get when adding the curl is still the energy flux density as computed with $\vec{E}\times\vec{H}$. If you add some weird random field, it will not represent the local energy flux density anymore, but it will still satisfy Poynting's theorem.
In summary, adding a curl of some random field destroys the local meaning of the Poynting vector, but preserves the meaning of the surface integral of its divergence.
Best Answer
You can derive the Poynting vector from the Lorentz force law, $\vec F = Q(\vec E + \vec v \times \vec B)$ like this:
(using the work-energy theorem)
$dW = dq(\vec E + \vec v \times \vec B)\cdot dl = dq(\vec E + \vec v \times \vec B)\cdot \vec v dt$
so
$\frac{dW}{dt} = \int_\tau(\vec E + \vec v \times \vec B)\cdot \vec v\rho d^3r = \int_\tau (\vec E \cdot \vec J)d^3r$.
Ampere's law says:
$\vec J = \frac{1}{\mu_0}(\vec \nabla \times \vec B) - \epsilon_0 \frac{\partial \vec E}{\partial t}$
so
$\vec E \cdot J = \vec E \cdot[\frac{1}{\mu_0}(\vec \nabla \times \vec B) -\epsilon_0\frac{\partial \vec E}{\partial t}] =$
(using BAC-CAB) identity)
$\frac{1}{\mu_0}[-\vec \nabla \cdot(\vec E \times \vec B) + \vec B \cdot(\vec \nabla \times \vec E)] - \vec E \cdot \epsilon_0\frac{\partial \vec E}{\partial t} =$
(using Faraday's law)
$\frac{1}{\mu_0}[-\vec \nabla \cdot(\vec E \times \vec B) - \vec B \cdot \frac{\partial \vec B}{\partial t}] - \vec E \cdot \epsilon_0\frac{\partial \vec E}{\partial t} =$
$-\frac{1}{2}\frac{\partial}{\partial t}(\epsilon_0 E^2 + \frac{B^2}{\mu_0}) - \nabla \cdot \frac{\vec E \times \vec B}{\mu_0}$.
Substituting this in the expression for $\frac{dW}{dt}$ and using the divergence theorem we get:
$\frac{dW}{dt} = -\frac{d}{dt}\int\frac{1}{2}(\epsilon_0 E^2 + \frac{B^2}{\mu_0})d^3r - \oint\frac{(\vec E \times \vec B)}{\mu_0}$.
But $\frac{\epsilon_0 E^2}{2}$ $\frac{B^2}{2\mu_0}$ are just the energies per unit volume of the electric and magnetic fields, respectively. Setting the surface integral on the right hand side to 0 for the time being, what the left hand side of the equation is saying is that the rate that the charges in the volume gain energy is the same as the rate that the electric and magnetic fields in the volume lose energy.
If the sum of the electric and magnetic fields is constant in time, however, then $-\frac{d}{dt}\int\frac{1}{2}(\epsilon_0 E^2 + \frac{B^2}{\mu_0})d^3r = 0$. If $\frac{dW}{dt}$ is not identically 0, however, this must mean that there is a flux of energy into or out of the volume to sustain the increase or decrease in energy. This flux is precisely the integrand in right hand side, or $\frac{(\vec E \times \vec B)}{\mu_0}$.