Your 'ideal' dipole is a uniform rigid rod with no mass (and hence its moment of inertia about any axis, through any point, is zero). As such, $p_{\theta}$ and its derivative $\dot{p_{\theta}}$ are trivially (always) zero since, assuming the dipole is fixed at its centre but is free to rotate in $\theta$ and $\phi$, $p_{\theta} = I_{CoM}\dot{\theta}$. Since $I_{CoM}$ is zero, so is $p_{\theta}$. The linear analogue of this would be trying to derive the equations of motion for a massless particle.
If you give your dipole a moment of inertia about an axis through its centre of mass and perpendicular to its length $I \equiv I_{Rod,CoM}$ (alternatively, you could model the dipole as a massless rod with two point masses on either end, giving a different $I$), and call the dipole moment $w$ (I won't use $p$ to avoid confusion with the momenta) one obtains the following Lagrangian (defining the $z$-axis (polar axis) to lie in the direction of $\vec{E_0}$):
$$\mathcal{L} = \frac{I}{2}[\dot{\theta}^2+sin(\theta)^2 \dot{\phi}^2] + wEcos(\theta)cos(\phi)$$
Note the presence of $cos{(\phi)}$ in the dipole interaction term; as per your original post, we're allowing freedom in both $\phi$ and $\theta$. This gives rise to the following conjugate momenta:
$$p_{\theta} = I \dot{\theta}\ ,\ p_{\phi} = Isin(\theta)^2 \dot{\phi}$$
This gives the Hamiltonian, expressed in $p_{\theta}$, $\theta$, $p_{\phi}$ and $\phi$ only:
$$H = \frac{p_{\theta}^2}{2I} + \frac{p_{\phi}^2}{2Isin(\theta)^2} - wEcos(\theta)cos(\phi)$$
From this, one can turn the handle and obtain the expressions for $\dot{\theta}$, etc, as desired, using Hamilton's equations as stated in your original post.
It is easy actually. To be precise $\Sigma$ is a smooth spacelike Cauchy surface of the spacetime and the considered space $V$ of solutions of the KG equations is made of solutions smooth and with compactly supported Cauchy data on $\Sigma$. Under these hypotheses, the relation between Cauchy data on $\Sigma$, $(f|_\Sigma, n_\Sigma \cdot\nabla f|_\Sigma)$ and corresponding solutions $f$ of the KG equation is one-to-one. In other words the Cauchy problem is well posed.
If $(f,h)=0$ for every $h$ then both $f$ and its derivative normal to $\Sigma$ are zero. This easily follows from the very form of the simplectic form and from the fact that we can choose the Cauchy data of $h$ arbitrarily and there is a corresponding $h$.
Hence, again in view of the well posedness of the Cauchy problem, $f$ must be the zero solution of the KG equation.
Best Answer
Part of your trouble may be related to the somewhat artificial splitting between $q$'s and $p$'s here. Let's take a $2$-dimensional configuration space with coordinates $(q^1,q^2)$. An arbitrary point in the cotangent bundle then has the four coordinates $x \equiv (q^1,q^2,p_1,p_2)$, so $x^1=q^1,x^2=q^2,x^3=p_1,x^4=p_2$.
In this language, the canonical $2$-form can be written $$\omega= \mathrm dx^3\wedge \mathrm dx^1 + \mathrm dx^4 \wedge \mathrm dx^2 $$ $$\equiv \frac{1}{2}\big(\mathrm dx^3\otimes \mathrm dx^1 - \mathrm dx^1 \otimes \mathrm dx^3 + \mathrm dx^4 \otimes \mathrm dx^2 - \mathrm dx^2 \otimes \mathrm dx^4\big)$$ If we feed this $2$-form two arbitrary vectors $A = \sum_{n=1}^4 A^n \frac{\partial}{\partial x^n}$ and $B = \sum_{n=1}^4 B^n \frac{\partial}{\partial x^n}$, we obtain $$\omega(A,B)= \frac{1}{2}\big(A^3B^1-A^1B^3 + A^4B^2 - A^2 B^4\big)$$ Recall that $\omega$ is non-degenerate $\iff \omega(A,B)=0$ for all $B$ implies that $A=0$. If the expression $\omega(A,B)$ written above is equal to zero for every possible vector $B$, what does that tell you about $A$?
I believe I understand your follow-up question. Let the phase space be given by $X$. Given a non-degenerate $(0,2)$-tensor $B:TX\times TX\rightarrow \mathbb R$, one can define a map $\flat: TX \rightarrow T^*X$ via $ A^\flat := B(A,\cdot)$, where $\cdot$ denotes an empty slot. In component form, $$(A^\flat)_\mu = B_{\mu\nu} A^\nu$$ If we define $\tilde B$ to be the $(2,0)$-tensor whose components are the matrix inverse of the components of $B$ (by which I mean, $\tilde B^{\mu\alpha} B_{\alpha\nu} = \delta^\mu_\nu$) then $\tilde B$ defines the inverse map $$\sharp : T^*X \rightarrow TX$$ $$(\alpha^\sharp)^\mu = \tilde B^{\mu\nu} \alpha_\nu$$
This association between the tangent vectors and covectors is called a musical isomorphism, and it is well-defined if and only if $B$ is non-degenerate (so that the matrix $B_{\mu\nu}$ is invertible).
With that out of the way, to each smooth function $f$ corresponds a so-called Hamiltonian vector field $X_f$ given by $X_f := (\mathrm df)^\sharp$, where the musical isomorphism is provided by $\omega$. In canonical coordinates, one has that
$$f \mapsto X_f = \sum_n\frac{\partial f}{\partial p_n} \frac{\partial}{\partial q^n} - \frac{\partial f}{\partial q^n} \frac{\partial}{\partial p_n}$$
The association $\mathrm df \leftrightarrow X_f$ is an isomorphism if and only if $\omega$ is non-degenerate. In explicit component form (in canonical coordinates), the association takes the form
$$\underbrace{\left(\frac{\partial f}{\partial q^1},\frac{\partial f}{\partial q^2},\frac{\partial f}{\partial p_1}, \frac{\partial f}{\partial p_2}\right)}_{\text{components of }\mathrm df}\leftrightarrow \underbrace{\left(\frac{\partial f}{\partial p_1},\frac{\partial f}{\partial p_2},-\frac{\partial f}{\partial q^1}, -\frac{\partial f}{\partial q^2}\right)}_{\text{components of }X_f}$$
It's obvious that this is an isomorphism - the components are just shuffled around, modulo a minus sign - and so the bilinear form $\omega$ from which it is derived must be non-degenerate.