I think this book is suitable for further study of this question: "Representation Theory of Finite Groups: An Introductory Approach" by Benjamin Steinberg. It has several chapters related to your question.
The derivative can be found via implicit differentiation. That is,
$$
\frac{\mathrm{d}\operatorname{vec}\left(Y\right)}{\mathrm{d}\operatorname{vec}\left(X\right)} = \left(\frac{\mathrm{d} \operatorname{vec}\left(X\right)}{\mathrm{d}\operatorname{vec}\left(Y\right)}\right)^{-1}.$$
It is relatively easy to compute the derivative of $A$ with respect to $f(A)$, since $A = f(A)f(A)^{\top}$. The only tricky part is restricting $f(A)$ to be lower triangular.
For general $X$, we have
$$
\frac{\mathrm{d} \operatorname{vec}\left(XX^{\top}\right)}{\mathrm{d} \operatorname{vec}\left(X\right)} = \left(I + K\right)\left(X\otimes I\right),$$
where $K$ is the Commutation Matrix.
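As a sanity check (not part of the original answer), the Jacobian formula above can be verified numerically. The following Python/NumPy sketch builds the commutation matrix by hand and compares $(I + K)(X \otimes I)$ against a finite-difference approximation:

```python
import numpy as np

def commutation_matrix(n):
    # K satisfies K @ vec(A) = vec(A.T) for n-by-n A (column-major vec)
    K = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            K[j * n + i, i * n + j] = 1.0
    return K

def vec(M):
    # column-major (Fortran-order) vectorization
    return M.flatten(order="F")

rng = np.random.default_rng(0)
n = 4
X = rng.standard_normal((n, n))
K = commutation_matrix(n)

# analytic Jacobian of vec(X X^T) with respect to vec(X)
J = (np.eye(n * n) + K) @ np.kron(X, np.eye(n))

# finite-difference approximation, one coordinate of vec(X) at a time
eps = 1e-6
J_num = np.zeros_like(J)
base = vec(X @ X.T)
for k in range(n * n):
    Xp = X.copy()
    Xp[k % n, k // n] += eps  # entry (i, j) of X, with k = j*n + i
    J_num[:, k] = (vec(Xp @ Xp.T) - base) / eps

err = np.max(np.abs(J - J_num))
print(err)  # small, on the order of eps
```

Since $X \mapsto XX^{\top}$ is quadratic, the forward-difference error is of order $\varepsilon$, matching what is printed.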
Now, obtaining the derivative with respect to the $\operatorname{vech}$ requires the chain rule. This gives
$$
\frac{\mathrm{d} \operatorname{vech}\left(XX^{\top}\right)}{\mathrm{d} \operatorname{vech}_{\Delta}\left(X\right)} = L \left(I + K\right)\left(X\otimes I\right) D,$$
where here $L$ is the elimination matrix, and $D$ is the "lower triangular duplication matrix" which has the property that $D \operatorname{vech}\left(M\right) = \operatorname{vec}\left(M\right)$ for lower triangular matrices $M$. The sought derivative is the matrix inverse of the above expression.
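Again as an illustration rather than part of the original answer, here is a small Python/NumPy sketch of the elimination matrix $L$ and the "lower triangular duplication matrix" $D = L^{\top}$, checking the property $D\operatorname{vech}(M) = \operatorname{vec}(M)$ for lower-triangular $M$:

```python
import numpy as np

def elimination_matrix(n):
    # L satisfies L @ vec(M) = vech(M): it keeps the on-and-below-diagonal
    # entries of a column-major vec of an n-by-n matrix
    keep = [j * n + i for j in range(n) for i in range(j, n)]
    L = np.zeros((len(keep), n * n))
    for row, col in enumerate(keep):
        L[row, col] = 1.0
    return L

n = 3
L = elimination_matrix(n)
M = np.tril(np.arange(1.0, n * n + 1).reshape(n, n))  # lower-triangular example

vecM = M.flatten(order="F")
vechM = L @ vecM   # half-vectorization: [1, 4, 7, 5, 8, 9]
D = L.T            # scatters vech back into vec, with zeros above the diagonal
print(np.allclose(D @ vechM, vecM))  # True for lower-triangular M
```

Note that $L L^{\top} = I$, so $D = L^{\top}$ is a right inverse of $L$; it reproduces $\operatorname{vec}(M)$ exactly only when $M$ has no entries above the diagonal.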
Numerical confirmation: here is a numerical confirmation in R. (Note that the chol function in R maps positive definite matrices to upper triangular matrices, i.e., it returns the upper Cholesky factor, thus some mucking about with transposes.)
require(matrixcalc)
set.seed(2349024)
n <- 6
X <- cov(matrix(rnorm(1000*n),ncol=n))
fnc <- function(X) t(chol(X))
Y <- fnc(X)
d0 <- (diag(1,nrow=n^2) + commutation.matrix(r=n)) %*% (Y %x% diag(1,nrow=n))
L <- elimination.matrix(n)
d1 <- L %*% d0 %*% t(L)
dfin <- solve(d1)
# now compute the approximate derivative
apx.d <- matrix(rep(NA,length(dfin)),nrow=dim(dfin)[1])
my.eps <- 1e-6
low.idx <- which(lower.tri(diag(1,n),diag=TRUE))
for (iii in c(1:length(low.idx))) {
Xalt <- X
tweak <- low.idx[iii]
Xalt[tweak] <- Xalt[tweak] + my.eps
# "Note that only the upper triangular part of 'x' is used..."
Yalt <- fnc(t(Xalt))
dY <- (Yalt - Y) / my.eps
apx.d[,iii] <- dY[low.idx]
}
apx.error <- apx.d - dfin
max(abs(apx.error))
apx.error
The maximum absolute error I get is 5.606e-07, on the order of the perturbation in the input variable, 1e-06.
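For readers without R, here is a Python/NumPy translation of the same check (a sketch, not the original code). NumPy's np.linalg.cholesky returns the lower-triangular factor directly and reads only the lower triangle of its argument, so the transpose games above are not needed:

```python
import numpy as np

rng = np.random.default_rng(2349024)
n = 6
X = np.cov(rng.standard_normal((1000, n)), rowvar=False)  # random covariance
Y = np.linalg.cholesky(X)  # lower-triangular factor: X = Y @ Y.T

def vec(M):
    return M.flatten(order="F")

# commutation matrix: K @ vec(A) = vec(A.T)
K = np.zeros((n * n, n * n))
for i in range(n):
    for j in range(n):
        K[j * n + i, i * n + j] = 1.0

# elimination matrix: L @ vec = vech (on-and-below-diagonal entries)
low = [j * n + i for j in range(n) for i in range(j, n)]
L = np.zeros((len(low), n * n))
for row, col in enumerate(low):
    L[row, col] = 1.0

# analytic d vech(X) / d vech(Y); its inverse is the sought derivative
d1 = L @ (np.eye(n * n) + K) @ np.kron(Y, np.eye(n)) @ L.T
dfin = np.linalg.inv(d1)

# finite-difference approximation: perturb each lower-triangular entry of X
eps = 1e-6
apx = np.zeros_like(dfin)
for col, idx in enumerate(low):
    Xp = X.copy()
    Xp[idx % n, idx // n] += eps  # lower-triangle entry (i, j), idx = j*n + i
    Yp = np.linalg.cholesky(Xp)   # only the lower triangle of Xp is read
    apx[:, col] = ((vec(Yp) - vec(Y)) / eps)[low]

err = np.max(np.abs(apx - dfin))
print(err)  # on the order of eps, as in the R run
```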
Best Answer
$\newcommand\E{\mathsf E}$You want to find $$E:=\E\frac{X^T BX}{X^T AX},$$ where $X$ is a random vector uniformly distributed on the unit sphere $S^2$, so that $X$ equals $$\frac{[G_1,G_2,G_3]^T}{\sqrt{G_1^2+G_2^2+G_3^2}}$$ in distribution, where $G_1,G_2,G_3$ are independent standard normal random variables.
Also, by spherical symmetry, we may without loss of generality take the matrix $A$ to be diagonal. So, for some real $b_{j,k}$'s and some positive real $a_i$'s, $$E=\E\frac{\sum_{j,k=1}^3 b_{j,k}G_jG_k}{\sum_{i=1}^3 a_i G_i^2}=\sum_{j,k=1}^3 b_{j,k}\E R_{j,k},$$ where $$R_{j,k}:=\frac{G_jG_k}{\sum_{i=1}^3 a_i G_i^2}.$$ Note that $R_{j,k}$ turns into $-R_{j,k}$ if $j\ne k$ and $G_j$ is replaced by $-G_j$. So, if $j\ne k$, then the distribution of $R_{j,k}$ is symmetric and hence $\E R_{j,k}=0$. Therefore, $$E=\sum_{j=1}^3 b_jE_j,$$ where $b_j:=b_{j,j}$ and $$E_j:=\E R_{j,j}=\E\frac{G_j^2}{\sum_{i=1}^3 a_i G_i^2}.$$
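As a quick Monte Carlo sanity check (a Python/NumPy sketch, not part of the original answer, with example values for the $a_i$'s): the off-diagonal terms do average to zero, and the $E_j$ satisfy the exact identity $\sum_j a_j E_j = 1$, since the numerators then sum to the denominator.

```python
import numpy as np

rng = np.random.default_rng(1)
a = np.array([1.0, 2.0, 3.0])           # example positive a_i's (an assumption)
G = rng.standard_normal((500_000, 3))   # samples of (G_1, G_2, G_3)
denom = (a * G**2).sum(axis=1)          # sum_i a_i G_i^2

# off-diagonal term: vanishes by the sign-flip symmetry argument above
R12 = np.mean(G[:, 0] * G[:, 1] / denom)

# diagonal terms E_j, which must satisfy sum_j a_j E_j = 1
E = [np.mean(G[:, j]**2 / denom) for j in range(3)]
total = sum(a[j] * E[j] for j in range(3))

print(R12)    # close to 0
print(total)  # close to 1 (exact up to floating-point rounding)
```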
So, the problem is equivalent to computing $E_j$. However, it is highly unlikely that $E_j$ can be expressed in terms of elementary or even special functions, even for specific values of the $a_i$'s. For instance, Mathematica cannot do anything with $E_j$ even when $j=1$, $a_1=1$, $a_2=2$, and $a_3=3$.
However, we have this: