Weighted inner product with arbitrary matrix

inner-products, linear-algebra, multilinear-algebra, positive-definite, tensor-products

An inner product can be written in the Hermitian form
$$
\langle x,y \rangle = y^*Mx,
$$

which requires $M$ to be a Hermitian positive definite matrix.

I have read that using a Hermitian positive definite matrix lets the inner product be interpreted as scaling the space by the eigenvalues of $M$ along its eigenvector directions. But I have also found out about indefinite inner products, where $M$ may be indefinite. It then seems natural to relax $M$ further, e.g., to an arbitrary square matrix.

What is the significance of the forms of $M$? What are their reasons and effects?

Even the first, Hermitian positive definite, form of $M$ is a bit puzzling. Why do we still use the notation $\langle x,y \rangle$ after introducing $M$? Does it mean $M$ will not change the value of $\langle x,y \rangle$?

I would be curious about the intuition and the mathematical consequences of relaxing these requirements. I would especially appreciate testable predictions about the properties of the resulting inner product.

I work on numerical applications, where I can usually constrain only the structure, not the values, of $M$, so degenerate cases arise naturally. I would like insight into the properties of these degenerate instances of $M$: what properties do I sacrifice or gain with each relaxation?

Best Answer

> Even the first Hermitian positive definite form of $M$ is a bit puzzling. Why do we still use the notation $\langle x,y \rangle$ after introducing $M$? Does it mean $M$ will not change the value of $\langle x,y \rangle$?

This is a characterisation result. For any inner product on $\Bbb{C}^n$, there is a unique Hermitian $n \times n$ matrix $M$ such that $\langle x, y \rangle = y^* M x$. Finding such a matrix $M$ doesn't change the value of $\langle x, y \rangle$; it just gives another way to express the same inner product, this time in terms of matrix multiplication. In the same way, writing the dot product as $x \cdot y = y^\top I x$ doesn't change the dot product; it just gives another way to express the same function.
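A quick numerical sketch of this characterisation, using NumPy and a hypothetical Hermitian positive definite matrix $M$ (not from the post):

```python
import numpy as np

# A hypothetical Hermitian (here real symmetric) positive definite M:
M = np.array([[2.0, 1.0],
              [1.0, 3.0]])  # symmetric, both eigenvalues positive

def ip(x, y, M):
    """Inner product <x, y> = y* M x (conjugate on the second argument)."""
    return np.conj(y) @ M @ x

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

# With M = I, the formula reduces to the ordinary dot product:
assert np.isclose(ip(x, y, np.eye(2)), np.dot(x, y))

# A different Hermitian PD matrix gives a different, but still valid,
# inner product: conjugate-symmetric and positive on nonzero vectors.
assert np.isclose(ip(x, y, M), np.conj(ip(y, x, M)))
assert ip(x, x, M) > 0
```

The point of the assertions: $M$ changes *which* inner product you are computing, not whether the expression *is* an inner product.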

> But I also found out about indefinite inner products, where $M$ could be indefinite. Then it is natural to relax $M$ further, e.g., to an arbitrary square matrix.

There's another relaxation: pseudo-inner products, where $M$ is Hermitian and invertible. Axiomatically (so that it works on more than just $\Bbb{R}^n$ and $\Bbb{C}^n$), this corresponds to replacing the positive-definite axiom with a non-degeneracy requirement, stating that $\langle v, w \rangle = 0 \; \forall w \implies v = 0$, in other words, only the $0$ vector is orthogonal to everything.

Pseudo-orthogonal spaces have historical importance in special relativity.
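As a sketch of how a pseudo-inner product behaves, here is a Minkowski-style metric (a hypothetical, illustrative choice of Hermitian invertible but indefinite $M$):

```python
import numpy as np

# A Minkowski-like metric: Hermitian and invertible, but indefinite.
eta = np.diag([1.0, 1.0, 1.0, -1.0])

def pip(x, y, M):
    """Pseudo-inner product <x, y> = y* M x."""
    return np.conj(y) @ M @ x

# A "timelike" direction has negative self-product...
t = np.array([0.0, 0.0, 0.0, 1.0])
assert pip(t, t, eta) == -1.0

# ...and a nonzero "lightlike" vector is orthogonal to itself:
l = np.array([1.0, 0.0, 0.0, 1.0])
assert pip(l, l, eta) == 0.0

# Non-degeneracy still holds, because eta is invertible:
# eta @ v = 0 forces v = 0.
assert np.linalg.matrix_rank(eta) == 4
```

So positive-definiteness is gone, but only the zero vector is orthogonal to *everything*.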

> I would be curious about the intuition and the mathematical consequences when we relax these requirements. I would especially appreciate testable predictions about the properties of the inner product.

Losing the Hermitian requirement corresponds to losing conjugate symmetry, and hopefully you've now seen what losing positive-definiteness does.

If you're studying $\langle x,y \rangle$ with $M$ Hermitian, you're studying quadratic forms. If you relax $M$ to any matrix, you're studying multilinear maps of two variables. Add in more variables (or not!), and you have a (covariant) tensor.

EDIT: In answer to some of the questions in the comments below:

> When you say unique $M$, do you mean there exists only one matrix $M$? Wouldn't that make two matrices, $M$ and the identity matrix $I$, that write $\langle x,y \rangle$?

I mean, if you have two matrices $M$ and $N$ such that $y^* M x = y^* N x$ for all $x, y$ (i.e. they generate the same covariant tensor/multilinear map), then $M = N$. It's pretty easy to show this: just consider $x$ and $y$ as they range over the standard basis. Letting $y = e_j$ and $x = e_i$, the value $y^* M x$ is the entry of $M$ in the $j$th row and $i$th column. Since the form gives the same value with $N$ in place of $M$, the corresponding entry of $N$ agrees.
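The basis-vector argument above can be checked numerically: the form $f(x, y) = y^* M x$ determines every entry of $M$, which is exactly why $M$ is unique. (The matrix here is an arbitrary, hypothetical example.)

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3))  # an arbitrary hypothetical matrix

def f(x, y):
    """The form f(x, y) = y* M x."""
    return np.conj(y) @ M @ x

# Plug in standard basis vectors: f(e_i, e_j) = e_j^* M e_i = M[j, i],
# i.e. the entry in row j, column i. Rebuild M entry by entry:
e = np.eye(3)
recovered = np.array([[f(e[i], e[j]) for i in range(3)] for j in range(3)])
assert np.allclose(recovered, M)
```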

Moreover, $y^* M x$ is always an inner product (respectively, a multilinear map of two variables) when $M$ is Hermitian and positive definite (respectively, just a square matrix), so inner products correspond bijectively to Hermitian positive definite matrices. It just so happens that the dot product (or, in the complex case, the complex Euclidean inner product) corresponds to the identity matrix.

> Now I see that all forms of inner product seem to have a kind of non-degeneracy requirement. A multilinear map does not have that requirement, so it seems to be what I want. Is there a mathematical connection between them? The inner product comes with very nice intuition about the vector space; do we have similar intuition for multilinear maps?

A multilinear map is far more general! It includes all real inner products. It actually doesn't include complex inner products (I was a little wrong earlier): multilinear maps must satisfy $f(x, \lambda y) = \lambda f(x, y)$, without the conjugate. The natural generalisation is the sesquilinear form, which satisfies $f(x, \lambda y) = \overline{\lambda} f(x, y)$.

In either case, there isn't really a geometric intuition any more, now that symmetry is gone. You can no longer really think in terms of angles, when the "angle" from $x$ to $y$ is different from the "angle" from $y$ to $x$. There are plenty of other ways to think about these forms, given that they are just equivalent to square matrices, but thinking about them like inner products is not an option.
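To see the loss of symmetry concretely, here is a sketch with a hypothetical non-Hermitian $M$, where the form gives different values depending on which argument comes first:

```python
import numpy as np

# A hypothetical non-Hermitian M: the form y* M x loses conjugate symmetry.
M = np.array([[0.0, 1.0],
              [0.0, 0.0]])

def f(x, y):
    """The form f(x, y) = y* M x."""
    return np.conj(y) @ M @ x

x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])

# f(x, y) = M[1, 0] = 0, but f(y, x) = M[0, 1] = 1:
assert f(x, y) == 0.0
assert f(y, x) == 1.0
```

With $f(x, y) \ne \overline{f(y, x)}$, the "angle" from $x$ to $y$ genuinely differs from the "angle" from $y$ to $x$.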

> As general guidance, can you tell me what properties I sacrifice or gain with each relaxation?

If you have a pseudo-inner product space, you lose the fact that $\langle x, x \rangle \ge 0$ for all $x$. It's possible to have $\langle x, x \rangle < 0$, or $\langle x, x \rangle = 0$ for nonzero $x$ (i.e. self-orthogonal vectors). It means that, in order to define length, you need to take absolute values: $\|x\| := \sqrt{|\langle x, x \rangle|}$.

There's no Cauchy-Schwarz, and no triangle inequality, in general, so this distance is not even a metric.

What you do keep is Gram-Schmidt. The non-degeneracy requirement is just strong enough to make sure Gram-Schmidt can be completed. Your orthonormal bases will have $\langle e_i, e_i \rangle = \pm 1$ for all $i$, and $\langle e_i, e_j \rangle = 0$ for $i \neq j$. The number of $+1$s and $-1$s is known as the signature of the space, and is independent of the specific orthonormal basis (it corresponds to the number of positive and negative eigenvalues of $M$).
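Since the signature equals the count of positive and negative eigenvalues of $M$, it can be computed directly. A sketch with a hypothetical Hermitian, invertible, indefinite $M$ (chosen non-diagonal so the signature isn't visible by inspection):

```python
import numpy as np

# Hypothetical Hermitian invertible M with eigenvalues +1 and -1:
M = np.array([[0.0, 1.0],
              [1.0, 0.0]])

# The signature counts positive and negative eigenvalues; since M is
# invertible, no eigenvalue is zero and the counts sum to the dimension.
eigvals = np.linalg.eigvalsh(M)
n_plus = int(np.sum(eigvals > 0))
n_minus = int(np.sum(eigvals < 0))
assert (n_plus, n_minus) == (1, 1)  # signature: one +1, one -1
```

The same counts come out no matter which orthonormal basis you run Gram-Schmidt in; that basis-independence is Sylvester's law of inertia.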

For example, special relativity takes place in an affine space, whose corresponding space of displacements is a $4$-dimensional space whose signature consists of $3$ pluses and $1$ minus.

With a semidefinite inner product (i.e. $M$ positive semidefinite), all we add is the possibility of zero-length vectors that are not the $0$ vector. You keep Cauchy-Schwarz and the triangle inequality. All that happens is that you get a subspace of vectors whose length is $0$, which may or may not be the trivial subspace.

This subspace will be closed, and quotienting it out will yield a traditional inner product space. In fact, if $\|x - y\| = 0$, then Cauchy-Schwarz shows that $\langle z, x - y \rangle = 0$, i.e. $\langle z, x \rangle = \langle z, y\rangle$. So, if you form an equivalence relation of points that are $0$ distance apart, the inner product remains consistent.

So, you really don't lose much at all, in that case!
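A closing sketch of the semidefinite case, with a hypothetical rank-deficient PSD matrix: the null space supplies the zero-length vectors, and the Cauchy-Schwarz consequence above (a zero-length vector is orthogonal to everything) is visible directly:

```python
import numpy as np

# A hypothetical positive semidefinite M of rank 1:
M = np.array([[1.0, 0.0],
              [0.0, 0.0]])

def semi_ip(x, y):
    """Semidefinite inner product <x, y> = y* M x."""
    return np.conj(y) @ M @ x

n = np.array([0.0, 1.0])   # nonzero vector with zero length
z = np.array([2.0, -3.0])  # arbitrary test vector

assert semi_ip(n, n) == 0.0
# Cauchy-Schwarz forces the zero-length vector to be orthogonal to all z:
assert semi_ip(z, n) == 0.0 and semi_ip(n, z) == 0.0
```

Quotienting out the span of such vectors (here, the second coordinate axis) leaves an honest inner product space, as described above.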