I am reading about the Hotelling $T^2$ test (*A Primer of Multivariate Statistics* by Richard J. Harris). It says there that the test can be seen as forming a linear combination of your variables and then finding the set of coefficients that maximizes the 'combined t ratio'.

So I start on the math and I see that the t statistic is defined as:

$t(a) = \frac{a^T(\overline{X}-\mu_0)}{\sqrt{a^TSa/N}}$

Squaring this to get rid of the square root gives:

$t^2(a) = \frac{N[a^T(\overline{X}-\mu_0)]^2}{a^TSa}$
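As a quick numerical sanity check, the ratio above can be computed directly. This is only an illustrative sketch with simulated data; all names (`X`, `mu0`, `t_ratio`, the sample sizes) are my own assumptions, not from the book:

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 50, 3
X = rng.normal(loc=0.3, size=(N, p))   # N observations of p variables
mu0 = np.zeros(p)                      # hypothesized mean vector

xbar = X.mean(axis=0)
S = np.cov(X, rowvar=False)            # unbiased sample covariance

def t_ratio(a):
    """t(a) = a^T(xbar - mu0) / sqrt(a^T S a / N) for coefficient vector a."""
    d = xbar - mu0
    return (a @ d) / np.sqrt(a @ S @ a / N)

a = np.ones(p)
print(t_ratio(a) ** 2)   # t^2(a) = N [a^T(xbar - mu0)]^2 / (a^T S a)
```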

So far so good. Now we formulate an optimization function like this:

$h(a)= N[a^T(\overline{X}-\mu_0)]^2 - \lambda(a^TSa-1)$

So basically the idea here is to maximize $t^2(a)$ subject to the condition that $a^TSa = 1$, which is what $h(a)$ encodes. *The $\lambda$ term appears because, later on, the idea is to show that the characteristic root of $NS^{-1}(\overline{X}-\mu_0)(\overline{X}-\mu_0)^T$ is the solution of the optimization problem.*
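That claim can be checked numerically. The matrix $NS^{-1}(\overline{X}-\mu_0)(\overline{X}-\mu_0)^T$ has rank one, so its single nonzero characteristic root should equal the maximized $t^2(a)$, namely Hotelling's $T^2 = N(\overline{X}-\mu_0)^TS^{-1}(\overline{X}-\mu_0)$, attained at $a \propto S^{-1}(\overline{X}-\mu_0)$. A sketch with simulated data (variable names are mine, not the book's):

```python
import numpy as np

rng = np.random.default_rng(1)
N, p = 40, 3
X = rng.normal(loc=0.5, size=(N, p))
mu0 = np.zeros(p)
xbar = X.mean(axis=0)
S = np.cov(X, rowvar=False)
d = xbar - mu0

# the matrix whose characteristic root solves the optimization problem
M = N * np.linalg.solve(S, np.outer(d, d))   # N S^{-1} d d^T
root = np.linalg.eigvals(M).real.max()       # its single nonzero eigenvalue

T2 = N * d @ np.linalg.solve(S, d)           # Hotelling's T^2
a = np.linalg.solve(S, d)                    # maximizing coefficients (up to scale)
t2_at_a = N * (a @ d) ** 2 / (a @ S @ a)

print(root, T2, t2_at_a)   # all three coincide up to floating-point error
```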

**Now here comes my question/problem:**

The next thing that is done in this book is to derive $dh(a)/da$. When I try to work this out, knowing that for a symmetric matrix $A$:

$\frac{\partial x^TAx}{\partial x} = 2x^TA$

I obtain:

$dh(a)/da = 2Na^T(\overline{X}-\mu_0)(\overline{X}-\mu_0)^T - 2\lambda a^TS$

However, the book and pretty much everywhere else shows this derivative as:

$dh(a)/da = 2N(\overline{X}-\mu_0)(\overline{X}-\mu_0)^Ta - 2\lambda Sa$

What am I doing wrong? I know that $(\overline{X}-\mu_0)(\overline{X}-\mu_0)^T$ is symmetric (the product of a matrix and its own transpose always is), and also that $S$ is symmetric because it is the estimated covariance matrix.

## Best Answer

Both solutions are equivalent; only the presentation differs. Your identity $\partial(x^TAx)/\partial x = 2x^TA$ uses the numerator (row-vector) layout convention, in which the gradient of a scalar is a row vector, while the book uses the denominator (column-vector) layout, where it is a column vector. One expression is simply the transpose of the other.
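A quick numerical check of the equivalence (a sketch with simulated data; the names are mine): the two expressions give the same numbers, one laid out as a row and the other as a column, and both agree with a finite-difference gradient of $h(a)$:

```python
import numpy as np

rng = np.random.default_rng(2)
N, p, lam = 30, 3, 1.7
d = rng.normal(size=p)                   # stands in for xbar - mu0
S = np.cov(rng.normal(size=(N, p)), rowvar=False)
a = rng.normal(size=p)

row_form = 2 * N * (a @ d) * d - 2 * lam * (a @ S)        # your a^T... layout
col_form = 2 * N * np.outer(d, d) @ a - 2 * lam * S @ a   # the book's ...a layout

def h(a):
    return N * (a @ d) ** 2 - lam * (a @ S @ a - 1)

# central finite differences, one coordinate at a time
eps = 1e-6
fd_grad = np.array([(h(a + eps * e) - h(a - eps * e)) / (2 * eps)
                    for e in np.eye(p)])

print(np.allclose(row_form, col_form), np.allclose(fd_grad, col_form))
```

Note that a 1-D NumPy array does not distinguish a row from a column vector, which makes the equivalence immediate here; on paper the two results are transposes of one another.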