[Math] Bounding the von Neumann entropy of a density matrix with the Hilbert-Schmidt norm

Tags: it.information-theory, linear-algebra, quantum-mechanics, quantum-computation

Question

Suppose I have a $D$-dimensional density matrix $\rho_0$

$\rho_0^\dagger = \rho_0 \quad, \quad \mathrm{Tr} \rho_0 = 1 \quad, \quad \rho_0 > 0,$

with a known spectrum $\{\lambda_i^0\}$ and von Neumann entropy

$H_0 = - \sum_{i=1}^D \lambda_i^0 \ln \lambda_i^0 $.

Now we look at the perturbed density matrix $\rho = \rho_0 + \sigma$, where $\sigma$ need not be positive. Suppose we have a bound on the size of the Hilbert-Schmidt norm of the perturbation

$\|\sigma\|_{\mathrm{HS}} = \|\rho - \rho_0 \|_{\mathrm{HS}} \le \epsilon $

where

$ \|\sigma \|_{\mathrm{HS}}^2 = \sum_k \sum_{k'} | \sigma_{k,k'} |^2 = \sum_k \sum_{k'} | \langle e_k , \sigma \; e_{k'} \rangle |^2$

for any orthonormal basis $\{e_k\}$.

What bound can we put on the perturbation in entropy

$\Delta H = |H - H_0|$

in terms of both $\epsilon$ and the spectrum $\{\lambda_i^0\}$?
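For concreteness, here is a minimal numpy sketch of the quantities in play (natural logarithms throughout); the spectrum of $\rho_0$ and the perturbation are illustrative only, with $\sigma$ a random traceless Hermitian matrix scaled so that $\|\sigma\|_{\mathrm{HS}} = \epsilon$:

```python
import numpy as np

def von_neumann_entropy(rho):
    """H(rho) = -sum_i lambda_i ln(lambda_i), in nats; zero eigenvalues dropped."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-15]
    return -np.sum(lam * np.log(lam))

def hs_norm(sigma):
    """Hilbert-Schmidt norm: sqrt(sum_{k,k'} |sigma_{k,k'}|^2), i.e. the Frobenius norm."""
    return np.linalg.norm(sigma)

# Illustrative example: D = 4, a fixed spectrum for rho_0, and a random
# traceless Hermitian perturbation scaled so that ||sigma||_HS = eps.
rng = np.random.default_rng(0)
D, eps = 4, 1e-3
lam0 = np.array([0.4, 0.3, 0.2, 0.1])
rho0 = np.diag(lam0)

A = rng.normal(size=(D, D))
sigma = (A + A.T) / 2
sigma -= (np.trace(sigma) / D) * np.eye(D)   # traceless, so Tr(rho) stays 1
sigma *= eps / hs_norm(sigma)

rho = rho0 + sigma
print(abs(von_neumann_entropy(rho) - von_neumann_entropy(rho0)))  # Delta H
```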


Prior Art

To demonstrate the continuity of the entropy, Fannes established an upper bound on the entropy perturbation in terms of the trace norm

$ T = \frac{1}{2}\| \sigma \|_1 = \frac{1}{2} \mathrm{Tr}\, \sqrt{\sigma^\dagger \sigma} = \frac{1}{2} \sum_i | s_i |$, where $\{s_i\}$ are the eigenvalues of the Hermitian perturbation $\sigma$.

Importantly, the bound held for two arbitrary density matrices, in the sense that it did not depend on a known spectrum of $\rho_0$ (just on $T$ and $D$). This was subsequently improved to the optimal inequality by Audenaert:

$|H - H_0| \le T \; \log (D-1) + H_2\; [T,1-T], $

where

$H_2\; [T,1-T] = -T \; \log T - (1-T) \log (1-T)$

is the binary entropy. (See [Wikipedia][1].)
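For reference, a small numpy sketch of the trace distance and the Audenaert bound (in nats; the $2\times 2$ example is illustrative):

```python
import numpy as np

def trace_distance(rho, rho0):
    """T = (1/2) * sum of |eigenvalues| of the Hermitian difference rho - rho0."""
    return 0.5 * np.sum(np.abs(np.linalg.eigvalsh(rho - rho0)))

def audenaert_bound(T, D):
    """Audenaert's bound |H - H0| <= T ln(D-1) + H2[T, 1-T], in nats; D >= 2."""
    if T <= 0.0:
        return 0.0
    if T >= 1.0:
        return np.log(D - 1)
    H2 = -T * np.log(T) - (1 - T) * np.log(1 - T)
    return T * np.log(D - 1) + H2

# Tiny check on a pair of 2x2 density matrices:
rho0 = np.diag([0.5, 0.5])
rho = np.diag([0.9, 0.1])
T = trace_distance(rho, rho0)          # here T = 0.4
print(audenaert_bound(T, 2))           # upper bound on |H - H0| in nats
```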

However, both Fannes and Audenaert's proofs involve breaking the perturbation into positive and negative parts

$\sigma = \sigma_+ - \sigma_- , $

where $\sigma_+, \sigma_- \ge 0$. (Actually, Audenaert first reduces the problem to classical probability distributions, and then breaks the probability perturbations into positive and negative parts, which is the same thing.) As far as I can tell, this is only useful when working with a 1-norm, not a 2-norm, so the two proofs don't offer me much guidance. In addition, neither takes advantage of the fact that we're working from a known matrix $\rho_0$; they only depend on the trace distance $T$ and the dimension $D$.
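For concreteness, that splitting is just the Jordan decomposition of the Hermitian perturbation; a minimal sketch:

```python
import numpy as np

def jordan_decomposition(sigma):
    """Split a Hermitian sigma into PSD parts with sigma = plus - minus."""
    w, V = np.linalg.eigh(sigma)
    plus = (V * np.clip(w, 0.0, None)) @ V.conj().T    # keep positive eigenvalues
    minus = (V * np.clip(-w, 0.0, None)) @ V.conj().T  # keep negative eigenvalues
    return plus, minus

rng = np.random.default_rng(3)
A = rng.normal(size=(4, 4))
sigma = (A + A.T) / 2
plus, minus = jordan_decomposition(sigma)
print(np.allclose(plus - minus, sigma))   # True
print(0.5 * np.trace(plus + minus))       # (1/2) ||sigma||_1, the quantity T
```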

Now, one can just naively use the worst-case bound between the 1-norm and the 2-norm,

$T = \frac{1}{2}\| \sigma \|_1 \le \frac{1}{2} \sqrt{D}\, \| \sigma \| _{\mathrm{HS}}.$

It turns out that this is sufficient for my purposes when $\rho_0$ is the maximally mixed state $I_D / D$, but I need a tighter bound for other $\rho_0$. In other words, I need a bound which depends on the spectrum of $\rho_0$, growing tighter as $\rho_0$ becomes less mixed.
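The norm comparison is easy to check numerically; a sketch with a random traceless Hermitian $\sigma$ (the inequality is Cauchy-Schwarz applied to the vector of eigenvalues):

```python
import numpy as np

rng = np.random.default_rng(2)
D = 16
A = rng.normal(size=(D, D))
sigma = (A + A.T) / 2
sigma -= (np.trace(sigma) / D) * np.eye(D)   # traceless, like rho - rho0

s = np.linalg.eigvalsh(sigma)
T = 0.5 * np.sum(np.abs(s))                  # trace distance
hs = np.linalg.norm(sigma)                   # Hilbert-Schmidt norm
print(T <= 0.5 * np.sqrt(D) * hs)            # True: Cauchy-Schwarz on eigenvalues
```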

Probably Unnecessary Details

If it matters, the density matrix $\rho_0$ that I am working with can be expressed as

$\rho_0 = \eta^{\otimes N}$

where $\eta$ is two-dimensional and has eigenvalues $\{a, 1-a\}$. This means that $D= 2^N$ and $\rho_0$ has a spectrum of the form

$\mathrm{spec}(\rho_0) = \{a,1-a\}^{\times N} = \{a^N, a^{N-1}(1-a), \ldots, (1-a)^N \}$.
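The spectrum and the additivity identity $H_0 = N\, H_2[a,1-a]$ can be sanity-checked in a few lines (the values of $a$ and $N$ are illustrative):

```python
import numpy as np
from itertools import product

def tensor_power_spectrum(a, N):
    """All 2^N eigenvalues of eta^{tensor N}, eta having spectrum {a, 1-a}."""
    return np.array([np.prod(p) for p in product((a, 1.0 - a), repeat=N)])

a, N = 0.3, 8                       # illustrative values
lam = tensor_power_spectrum(a, N)
H0 = -np.sum(lam * np.log(lam))
H2 = -a * np.log(a) - (1 - a) * np.log(1 - a)
print(np.isclose(lam.sum(), 1.0), np.isclose(H0, N * H2))  # True True
```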

The bound I need must decrease with $N$:

$\lim_{N\to \infty} |\Delta H| = 0$

If it does, it will almost surely decrease exponentially in $N$. Currently, I am able to show that the Hilbert-Schmidt norm of my perturbation falls like

$\|\sigma\|_{\mathrm{HS}}^2 \sim [a^2 + (1-a)^2]^{(1+\delta)N} $

for small $\delta > 0$. If $a=1/2$, then $\rho_0$ is maximally mixed and

$\|\sigma\|_{\mathrm{HS}}^2 \sim \frac{1}{2^{(1+\delta)N}} $

so

$|\Delta H| \sim T \; \log(D-1) + T - T \; \log T \sim \frac{\sqrt{D} \ln{D}}{\sqrt{2^{(1+\delta) N}}} = \frac{N \ln 2}{\sqrt{2^{\delta N}}} \to 0,$

using $H_2[T,1-T] \approx T - T \log T$ for small $T$.

But if $a < 1/2$, the bound on $\|\sigma\|_{\mathrm{HS}}^2$ falls more slowly with $N$, and the naive application of the Fannes–Audenaert inequality gives a bound on the entropy perturbation which grows with $N$ (for sufficiently small $\delta$).
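To see the two regimes concretely, here is a sketch that pushes the assumed scaling of $\|\sigma\|_{\mathrm{HS}}^2$ through the naive chain above (the value $\delta = 0.2$ is illustrative):

```python
import numpy as np

def naive_bound(a, N, delta):
    """Push ||sigma||_HS^2 ~ [a^2 + (1-a)^2]^((1+delta)N) through
    T <= (1/2) sqrt(D) ||sigma||_HS with D = 2^N, then Fannes-Audenaert."""
    D = 2.0 ** N
    hs = (a**2 + (1.0 - a) ** 2) ** (0.5 * (1.0 + delta) * N)
    T = min(0.5 * np.sqrt(D) * hs, 1.0)
    if T >= 1.0:
        return np.log(D - 1.0)
    return T * np.log(D - 1.0) - T * np.log(T) - (1.0 - T) * np.log(1.0 - T)

for N in (10, 20, 40, 80):
    print(N, naive_bound(0.5, N, 0.2), naive_bound(0.3, N, 0.2))
# a = 1/2: the bound decays with N; a = 0.3: it saturates and grows.
```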

Best Answer

I can give you a simplification of the problem and a precise answer in the limit as $\epsilon \to 0$. This precise answer also yields a good upper bound in the upward direction $H > H_0$, using the fact that entropy is a concave function. In the downward direction $H < H_0$ things are more annoying, and a good answer depends on the size of $\epsilon$ and what type of estimate you want.

There is a convex body $B$ of density matrices $\rho$ and a map from it to the much simpler convex body $S$ of unordered spectra $\vec{\lambda}$. The simplification of your problem is that the Hilbert-Schmidt metric on density matrices descends to the Euclidean metric (or $\ell^2$ metric or Hilbert-Schmidt metric) on spectra. $S$ is a simplex, in fact the quotient of a regular simplex $T$ by its isometries. Since $H(\rho)$ depends only on the spectrum of $\rho$, you might as well work in $S$ or $T$ rather than in $B$. In fact $T$ is exactly the convex body of classical states on $D$ configurations rather than quantum states. So the question is not really quantum at all; it is a question about the Shannon entropy $H(\vec{\lambda})$ of distributions $\vec{\lambda}$ on a set with $D$ elements.
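The metric claim, that passing from a density matrix to its sorted spectrum can only shrink Hilbert-Schmidt distances, is an instance of the Hoffman-Wielandt inequality for Hermitian matrices and is easy to check numerically; a sketch:

```python
import numpy as np

def random_density_matrix(D, rng):
    """rho = G G^dagger / Tr(G G^dagger) for a complex Gaussian G."""
    G = rng.normal(size=(D, D)) + 1j * rng.normal(size=(D, D))
    M = G @ G.conj().T
    return M / np.trace(M).real

rng = np.random.default_rng(1)
D = 8
rho, tau = random_density_matrix(D, rng), random_density_matrix(D, rng)

hs = np.linalg.norm(rho - tau)                 # HS distance between matrices
spec = np.linalg.norm(np.sort(np.linalg.eigvalsh(rho))
                      - np.sort(np.linalg.eigvalsh(tau)))  # Euclidean distance in S
print(spec <= hs + 1e-12)  # True: passing to spectra can only shrink distances
```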

It is easy to take the gradient of $H(\vec{\lambda})$. The differential is just $$\sum_k -(\ln(\lambda_k) + 1)\,d\lambda_k.$$ Since the sum of the coordinates is constant, you can drop the second term of the summand. So to maximize the deviation of $H(\vec{\lambda})$, you should push $\vec{\lambda}$ in the direction $$\delta \lambda_k = -\ln(\lambda_k)+\frac1D \sum_{j=1}^D \ln(\lambda_j).$$ The norm of the gradient is then an optimal upper bound as $\epsilon \to 0$. As mentioned at the beginning, if the entropy is increasing, this derivative bound holds for any $\epsilon$, because entropy is concave. In the other direction you are "falling off a cliff" instead of "climbing to the top of the dome", and I would have to understand more about what sort of bound you want. (Maybe only because I haven't grasped all of the later details of your question.)
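A quick numerical check of this first-order statement, $|\Delta H| \approx \epsilon\, \|\nabla H\|$ along the steepest direction (the spectrum is illustrative):

```python
import numpy as np

def entropy(lam):
    return -np.sum(lam * np.log(lam))

lam0 = np.array([0.4, 0.3, 0.2, 0.1])   # illustrative spectrum

# Projected gradient: delta_k = -ln(lam_k) + (1/D) sum_j ln(lam_j).
g = -np.log(lam0)
g -= g.mean()                 # stays tangent to {sum lam_k = 1}
grad_norm = np.linalg.norm(g)

eps = 1e-4
dH = entropy(lam0 + eps * g / grad_norm) - entropy(lam0)
print(dH, eps * grad_norm)    # nearly equal: |Delta H| ~ eps * ||gradient||
```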

For the uniform state (the maximally mixed state) I can conjecture a precise answer in the down direction. Then you are in the center of the simplex $T$. I conjecture that the way to decrease entropy as much as possible is to run straight for a corner, i.e., increase one $\lambda_k$ and keep the others equal.