Disclaimer: The answer below responded to the original version of the OP's question, which was quite different in nature and less specific than the current version.
$p(x∣w_1)$ and $p(x∣w_2)$ are equal.
OK, this is going to take a lot longer to answer.
In some statistical applications, a statistician (or a machine, since
you included machine learning as a tag) needs to decide which of two
hypotheses is true: $H_1 \colon w = w_1$ and $H_2 \colon w = w_2$.
It is known that $$P(w = w_1) = P(w = w_2) = \frac{1}{2}.$$
This is what the equal a priori probabilities that you keep referring
to mean.
Here is a simple method: always decide that $w = w_1$, that is, always
declare $H_1$ to be the true hypothesis. When in fact $H_1$ is true, your
decision is perfectly correct; when in fact $H_2$ is true, your decision is
perfectly wrong, and thus you have a $50\%$ chance of making an error.
More sophisticated methods use a coin toss or a call to a random number
generator to decide, but unfortunately they still have a $50\%$ chance of
making an error, the same as the simpler mulish insistence that $H_1$ is
always true.
To get better performance (i.e., smaller error probability), the
statistician might observe a random variable whose distribution depends
on the value of $w$. If $w = w_1$, the distribution is $p(x\mid w_1)$;
if $w = w_2$, the distribution is $p(x\mid w_2)$. For example, if
$w = w_1$, $x$ is a normal random variable with mean $100$ and variance $1$,
while if $w = w_2$, $x$ is a standard normal random variable (mean $0$ and
variance $1$). So if the statistician observes that $x$ has value $101.2$,
it is highly likely that $w = w_1$, and thus that $H_1$ is true, because a standard
normal random variable is quite unlikely to take so large a value. On the other
hand, if $x$ has a small value (say between $-4$ and $+4$), then it is quite
likely that $H_2$ is true and $w = w_2$. But notice that all this depends
critically on the distributions $p(x\mid w_1)$ and $p(x\mid w_2)$ being
different. If the distributions are the same, then observing $x$ is of
no help in deciding between $H_1$ and $H_2$. Thus when you claim that
$p(x∣w_1)$ and $p(x∣w_2)$ are equal
you are effectively insisting that observing $x$ is useless as
far as deciding between $H_1$ and $H_2$ is concerned.
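The point can be checked with a small simulation. This is only a sketch: the sample size, the random seed, and the helper name `error_rate` are my own choices, and the decision rule (pick the hypothesis whose mean is closer to $x$) is the likelihood-ratio rule for equal priors and equal variances.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# True hypothesis for each trial: True means H_1 (w = w_1); equal priors.
is_h1 = rng.random(n) < 0.5

def error_rate(mean_h1, mean_h2):
    """Draw x ~ N(mean, 1) under the true hypothesis and decide from x alone."""
    x = rng.normal(np.where(is_h1, mean_h1, mean_h2), 1.0)
    # Decide H_1 when x is strictly closer to mean_h1; ties go to H_2.
    decide_h1 = np.abs(x - mean_h1) < np.abs(x - mean_h2)
    return np.mean(decide_h1 != is_h1)

# Ignoring x and always declaring H_1: wrong whenever H_2 is true.
always_h1_error = np.mean(~is_h1)

# Means 100 and 0 as in the example: x is very informative.
informative_error = error_rate(100.0, 0.0)

# Identical distributions: x carries no information about w.
useless_error = error_rate(0.0, 0.0)

print(always_h1_error, informative_error, useless_error)
```

With well-separated means the error rate drops to essentially zero, while with identical distributions it stays near $50\%$, no better than ignoring $x$ altogether.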
So, how are these distributions known in the first place? The client might provide
them to the statistician based on the knowledge of how the client's
apparatus works. Your professor, like Professor Indiana Jones in the
movie Raiders of the Lost Ark, might
be making them up as he goes along (Remember that $99\frac{44}{100}\%$
of all statistics are made up!). In the context of machine
learning, there may be training samples provided: Here are
$200$ observations of $x$ when $H_1$ is true, and here are
$200$ more when $H_2$ is true. (In your particular problem,
$x$ is a bivariate normal random variable with independent
(standard normal) components when $H_1$ is true and correlated
normal components when
$H_2$ is true, and so each sample would be a pair of numbers).
The machine estimates
$p(x\mid w_1)$ from the first set of observations
and $p(x\mid w_2)$ from the second set, and uses these
estimates when making decisions when the real work comes
along.
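As a rough illustration of that training step, the sketch below fits a Gaussian to each set of $200$ samples and then compares the two fitted densities at a new observation. The covariance used to generate the $H_2$ data (`cov_h2_true`) is a number I made up, since your problem does not specify it, and `fit_gaussian` and `decide` are hypothetical helper names.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(1)

# Made-up "true" models, used only to generate training data.
cov_h2_true = np.array([[1.0, 0.8],
                        [0.8, 1.0]])  # hypothetical correlation under H_2
train_h1 = rng.multivariate_normal([0.0, 0.0], np.eye(2), size=200)
train_h2 = rng.multivariate_normal([0.0, 0.0], cov_h2_true, size=200)

def fit_gaussian(samples):
    """Estimate a bivariate normal from samples (one row per observation)."""
    mean = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)
    return multivariate_normal(mean=mean, cov=cov)

p_x_given_w1 = fit_gaussian(train_h1)
p_x_given_w2 = fit_gaussian(train_h2)

def decide(x):
    """Return 1 or 2: the hypothesis with the larger estimated density at x.

    With equal priors, comparing likelihoods is the same as comparing posteriors.
    """
    return 1 if p_x_given_w1.logpdf(x) >= p_x_given_w2.logpdf(x) else 2

print(decide([3.0, 3.0]))   # both components large and equal: looks correlated
print(decide([2.0, -2.0]))  # opposite signs: unlikely under positive correlation
```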
In summary, your claim that $p(x\mid w_1) = p(x\mid w_2)$
means that $x$ is totally useless in distinguishing the two
cases. For your particular distribution, equality holds
(if you nevertheless continue to insist on equality)
exactly when $a=b=1$ and $c=d=e=0$ (in which case $ab-c^2 = 1$
as desired). There is no way of solving for $a,b,c,d,e$,
or saying what values of $a,b,c,d,e$ make sense in your
problem based on the information that you have provided.
You need to be given these by your professor,
or you need to be given training data so that you can estimate
these parameters, or you should emulate Professor Jones
and make up some numbers (subject to the constraints that $ab - c^2 = 1$
and $a, b > 0$) and solve the problem using these.
Best Answer
Following the paper you linked to (Mika et al., 1999), we have to find the $\mathbf{w}$ which maximizes the so-called generalized Rayleigh quotient,
$$\frac{\mathbf{w}^\top \mathbf{S}_B \mathbf{w}}{\mathbf{w}^\top \mathbf{S}_W \mathbf{w}},$$
where for means $\mathbf{m}_1, \mathbf{m}_2$ and covariances $\mathbf{C}_1, \mathbf{C}_2$,
\begin{align} \mathbf{S}_B &= (\mathbf{m}_1 - \mathbf{m}_2)(\mathbf{m}_1 - \mathbf{m}_2)^\top, & \mathbf{S}_W &= \mathbf{C}_1 + \mathbf{C}_2. \end{align}
The solution can be found by solving the generalized eigenvalue problem
\begin{align} \mathbf{S}_B\mathbf{w} = \lambda \mathbf{S}_W\mathbf{w}, \end{align}
by first computing the eigenvalues $\lambda$ from
\begin{align} \det(\mathbf{S}_B - \lambda \mathbf{S}_W) = 0 \end{align}
and then solving for the eigenvector $\mathbf{w}$. In your case,
$$\mathbf{S}_B - \lambda \mathbf{S}_W = \begin{pmatrix}16 - 3\lambda & 16 \\ 16 & 16 - 2\lambda\end{pmatrix}.$$
The determinant of this $2 \times 2$ matrix can be computed by hand.
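For reference, a sketch of the hand computation, using only the matrix above and writing $\mathbf{w} = (w_1, w_2)^\top$:
$$\det(\mathbf{S}_B - \lambda \mathbf{S}_W) = (16 - 3\lambda)(16 - 2\lambda) - 16^2 = 6\lambda^2 - 80\lambda = 2\lambda(3\lambda - 40),$$
so $\lambda = 0$ or $\lambda = \frac{40}{3}$. Substituting $\lambda = \frac{40}{3}$ into the first row of $(\mathbf{S}_B - \lambda \mathbf{S}_W)\mathbf{w} = \mathbf{0}$ gives
$$-24\,w_1 + 16\,w_2 = 0 \quad\Longrightarrow\quad w_2 = \tfrac{3}{2}\,w_1,$$
that is, $\mathbf{w} \propto (2, 3)^\top$, which after normalizing to unit length is $(2, 3)^\top/\sqrt{13} \approx (0.5547, 0.8321)^\top$.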
The eigenvector with the largest eigenvalue maximizes the Rayleigh quotient. Instead of doing the calculations by hand, I solved the generalized eigenvalue problem in Python using `scipy.linalg.eig` and got $$w_1 \approx 0.5547, \quad w_2 \approx 0.8321,$$ which is different from the solution you found in your book. Below I plotted the optimal hyperplane of the weight vector I found (black) and the hyperplane of the weight vector found in your book (red).
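A minimal sketch of that computation is shown here. The entries of `S_B` and `S_W` are read off from the matrix $\mathbf{S}_B - \lambda \mathbf{S}_W$ given earlier rather than recomputed from your data, and the sign and unit-length normalization of the eigenvector are my own conventions (any nonzero multiple of $\mathbf{w}$ gives the same Rayleigh quotient).

```python
import numpy as np
from scipy.linalg import eig

# Read off from S_B - lambda * S_W = [[16 - 3*lambda, 16], [16, 16 - 2*lambda]].
S_B = np.array([[16.0, 16.0],
                [16.0, 16.0]])
S_W = np.array([[3.0, 0.0],
                [0.0, 2.0]])

def rayleigh(w):
    """Generalized Rayleigh quotient w^T S_B w / w^T S_W w."""
    return (w @ S_B @ w) / (w @ S_W @ w)

# Generalized eigenvalue problem S_B w = lambda S_W w.
eigvals, eigvecs = eig(S_B, S_W)

# Pick the eigenvector belonging to the largest eigenvalue.
top = np.argmax(eigvals.real)
w = eigvecs[:, top].real

# Normalize to unit length and fix the sign so the first component is positive.
w = w / np.linalg.norm(w)
if w[0] < 0:
    w = -w

print(w)            # direction of the discriminant
print(rayleigh(w))  # equals the largest generalized eigenvalue
```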