Is the type $(1,1)$ Kronecker delta tensor, $\delta_a^{\,\,b}$, equal to the trace of the identity matrix, or always $1$ when $a=b$ and zero otherwise?

inner-products, kronecker-delta, matrices, tensors

I'll ask this question using very simple examples, working in flat Cartesian space (just $2$ spatial dimensions). I'll be using the Einstein summation convention throughout this question, but since I'm very new to this I will explicitly write the summation symbol at times.

According to this article on raising and lowering indices on Wikipedia, the identity matrix can be represented as a Kronecker delta metric tensor (of type $(0,2)$), $$\delta_{ij}=\begin{pmatrix}1&0\\0&1\end{pmatrix}=\begin{cases}1 & \text{if} \, i=j \\ 0 &\text{if}\, i\ne j\end{cases}\tag{A}$$
and its inverse of type $(2,0)$,
$$\delta^{ij}=\left({\delta_{ij}}\right)^{-1}=\begin{pmatrix}1&0\\0&1\end{pmatrix}=\begin{cases}1 & \text{if} \, i=j \\ 0 &\text{if}\, i\ne j\end{cases}\tag{B}$$
are just the $2$d identity matrices.

But how does one interpret a Kronecker delta tensor of type $(1,1)$, $\delta_a^{\,\,b}$?


Here are some examples to put this question into context. Suppose we have a matrix $A_{ij}=\begin{pmatrix}A_{11}&A_{12}\\A_{21}&A_{22}\end{pmatrix}$.

Some of the following are correct expressions for $A_{ij}$:

$$\delta_{ik}A^{k}_{\,\,j}\tag{1}$$
$$\delta_{ik}\delta_{j\ell}A^{k\ell}\tag{2}$$
$$\delta_{k\ell}\delta^{k\ell}A^{ij}\tag{3}$$

$(1)$ is correct (equal to $A_{ij}$), as the dummy index $k$ on the matrix is 'lowered' using the metric. Then, according to the definition $(\mathrm{A})$, $\delta_{ii}=1$ and $k$ is summed over.

$(2)$ is also correct as

$$\delta_{ik}\delta_{j\ell}A^{k\ell}=\delta_{ik}A^k_{\,\,j}=A_{ij}$$

$(3)$ is not correct as
$$\delta_{k\ell}\delta^{k\ell}A^{ij}=\delta_k^{\,\,k}A^{ij}=\sum_{k=1}^2 \delta_k^{\,\,k}A^{ij}=2A^{ij}\ne A_{ij}\tag{C}$$

In the first equality of $(\mathrm{C})$, I think of this as 'raising' the second index of the first Kronecker metric in $\delta_k^{\,\,k}\delta^{kk}$. Since for a non-zero contribution $\ell=k$, we have $\delta^{kk}=1$, which is fine, as this is what equation $(\mathrm{A})$ tells me to do. From this it seems that $\delta_k^{\,\,k}$ is the trace of the matrix given in eqn $(\mathrm{A})$ (or $(\mathrm{B})$).
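As a numerical sanity check (a sketch only; the matrix entries are arbitrary), `numpy.einsum` reproduces the factor of $2$ in $(\mathrm{C})$, since in flat Cartesian coordinates every flavour of delta is represented by the identity matrix:

```python
import numpy as np

n = 2
delta = np.eye(n)                       # in Cartesian coordinates, every delta is the identity
A = np.array([[1.0, 2.0], [3.0, 4.0]])  # an arbitrary matrix A^{ij}

# Equation (3): delta_{kl} delta^{kl} A^{ij}.  The two deltas contract
# fully with each other, leaving a scalar factor times A^{ij}.
scalar = np.einsum('kl,kl->', delta, delta)  # = delta_k^k = trace(I_2) = 2
result = scalar * A

print(scalar)                      # 2.0
print(np.allclose(result, 2 * A))  # True
```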


But I have another two examples that seem to contradict this. Suppose we have two column vectors, $U^i$ and $V^i$, along with their respective row vectors,

$$U^i=\begin{pmatrix}u_1\\u_2\end{pmatrix}\,\,\text{so that}\,\,\,\,U_i=\delta_{ij}U^j=\begin{pmatrix}u_1&u_2\end{pmatrix}$$
$$V^i=\begin{pmatrix}v_1\\v_2\end{pmatrix}\,\,\text{so that}\,\,\,\,V_i=\delta_{ij}V^j=\begin{pmatrix}v_1&v_2\end{pmatrix}$$
Now suppose we want to compute the inner product, $U\cdot V$. There are obviously many ways of doing this, and one could simply write $U\cdot V=U_iV^i$, but I want to purposely use the Kronecker delta metric to make a point. Here are some possible expressions for the inner product, $U\cdot V$:

$$\delta_{ij}U^iV^j\tag{4}$$
$$V^j\delta_{j\ell}U^{\ell}\tag{5}$$
$$U^aV_b\delta^b_{\,\,a}\tag{6}$$
$$U_iV^a\delta_a^{\,\,b}\delta_b^{\,\,i}\tag{7}$$

$(4)$ is correct as $\delta_{ij}U^iV^j=U_i\delta_{ij}V^j$, and the way I've understood this is that the $j$ index on the $V^j$ has been 'lowered' using the metric and the only non-zero contribution is when $i=j$, so writing these steps out explicitly,
$$U_i\delta_{ij}V^j=U_i\delta_{ii}V^i=U_iV^i$$ since $\delta_{ii}=1$ according to the prescription in $(\mathrm{A})$.

$(5)$ is also correct for the same reasons as $(4)$.
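As a sanity check (a minimal sketch, with arbitrary component values), both $(4)$ and $(5)$ can be evaluated with `numpy.einsum` and compared against the ordinary dot product:

```python
import numpy as np

delta = np.eye(2)         # delta_{ij}: the 2d identity
U = np.array([1.0, 2.0])  # components U^i (values arbitrary)
V = np.array([3.0, 4.0])  # components V^i

eq4 = np.einsum('ij,i,j->', delta, U, V)  # (4): delta_{ij} U^i V^j
eq5 = np.einsum('j,jl,l->', V, delta, U)  # (5): V^j delta_{jl} U^l

print(eq4, eq5, U @ V)  # all three agree: 11.0 11.0 11.0
```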

Now for eqn $(6)$: this is where the problem starts for me. Since I am not sure what $\delta^b_{\,\,a}$ actually means, I can only guess that it is non-zero only when $a=b$, so from this I conclude that

$$U^aV_b\delta^b_{\,\,a}=U^aV_a\delta^a_{\,\,a}\stackrel{\color{red}{?}}{=}2U^aV_a\ne U^aV_a\tag{D}$$
For $(\mathrm{D})$, since we are working in $2$d flat Cartesian space, this $\delta_{\,\,a}^a=\sum_{a=1}^2\delta_{\,\,a}^a=2$ is the trace or sum of the diagonal elements of $(\mathrm{A})$, (as $a$ is a dummy index and hence summed over).

In a similar way, I think eqn. $(7)$ should be
$$U_iV^a\delta_a^{\,\,b}\delta_b^{\,\,i}=U_iV^a\delta_a^{\,\,i}\delta_i^{\,\,i}$$$$\stackrel{\color{red}{?}}{=}2U_iV^a\delta_a^{\,\,i}\stackrel{\color{red}{?}}{=}2U_iV^i\delta_i^{\,\,i}\stackrel{\color{red}{?}}{=}4U_iV^i\ne U_iV^i\tag{E}$$
In the first equality of $(\mathrm{E})$ there is a contraction of the $b$ index, where it gets set equal to $i$, since this is the only way to get a non-zero contribution out of the expression. I interpret this as the trace, and that is where the factor of $2$ comes from in the second equality (marked with a red question mark above it).

I have then done exactly the same thing for the third equality, and by my logic there is another trace, so there should be another factor of $2$, which appears in the fourth equality (these equalities are marked with red question marks as I'm not sure they are true).

Now here is the problem, eqns $(4-7)$ are all correct expressions for the inner product, $U\cdot V$. So, the factors of $2$ I have introduced should not be there.

But the question is: why are these manipulations in $(6)$ and $(7)$ wrong, when eqn. $(\mathrm{C})$ used the trace $\delta_k^{\,\,k}=2$?

Best Answer

Let $\{e_i\}$ be the canonical basis of $\Bbb R^n$. As a $(0,2)$ tensor, $(\delta_{ij})$ can be thought of as the bilinear map $(x,y) \mapsto x^Ty$, which reads, in tensor notation, as $\sum_{i=1}^n {e_i}^*\otimes {e_i}^*$. As a $(2,0)$ tensor, $(\delta^{ij})$ can be thought of as the bivector $\sum_{i=1}^n e_i\otimes e_i$. Finally, as a $(1,1)$ tensor, $({\delta_i}^j)$ can be thought of as an endomorphism (a linear map), in this case the identity map, which reads $\sum_{i=1}^n {e_i}^*\otimes e_i$.

These notations become coherent once you define $e^i = {e_i}^*$. For example, a $(0,2)$ tensor $A=(A_{ij})$ is equal to $A=\sum_{ij} A_{ij} e^i\otimes e^j$, while a $(1,1)$ tensor $B=({B_i}^j)$ is equal to $B=\sum_{ij} {B_i}^j e^i\otimes e_j$, and a bivector $V=(V^{ij})$ is equal to $V=\sum_{ij}V^{ij}e_i\otimes e_j$.

In these last expressions, each index appears exactly twice: once as an upper index and once as a lower index. The summation convention says that as long as an index appears in this specific configuration, we can forget about the $\sum$ sign. Therefore, we would write, with the above notation, $A=A_{ij} e^i\otimes e^j$, $B={B_i}^je^i\otimes e_j$ and $V = V^{ij}e_i\otimes e_j$. It appears that with this convention, ${\delta_i}^i$ is the trace of the identity matrix $({\delta_i}^k)$, since the summation convention implies $$ {\delta_i}^i= \sum_{i=1}^n {\delta_i}^i = n = \operatorname{trace}I_n. $$

But it is not true (or at least, it is very confusing and misleading) that ${\delta_a}^b{\delta_b}^i = {\delta_a}^i{\delta_i}^i$, since this latter expression implies a summation over $i$ while the first does not. In fact, one should avoid expressions of this kind, where an index appears three times. When $n=2$, the left hand side is equal to ${\delta_a}^1{\delta_1}^i + {\delta_a}^2{\delta_2}^i = {\delta_a}^i$, and still depends on $i$, while the right hand side is equal to ${\delta_a}^1{\delta_1}^1 + {\delta_a}^2{\delta_2}^2 = {\delta_a}^1 + {\delta_a}^2$, which only depends on $a$.
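This distinction is easy to verify numerically. Here is a minimal sketch with `numpy.einsum`, in which `delta` plays the role of $({\delta_i}^j)$:

```python
import numpy as np

n = 2
delta = np.eye(n)  # delta_i^j: the identity endomorphism

# delta_i^i repeats i, so the summation convention sums it: the trace of I_n.
trace = np.einsum('ii->', delta)
print(trace)  # 2.0

# delta_a^b delta_b^i contracts only b; a and i stay free, so the result
# is still a matrix (namely delta_a^i), not a number.
lhs = np.einsum('ab,bi->ai', delta, delta)
print(np.allclose(lhs, delta))  # True

# By contrast, delta_a^i delta_i^i also sums over i, leaving only a free:
rhs = np.einsum('ai,ii->a', delta, delta)  # = delta_a^1 + delta_a^2
print(rhs)  # [1. 1.]
```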

This specific problem appears, for instance, at (E), where you write $$ U_iV^a {\delta_a}^b{\delta_b}^i = U_iV^a{\delta_a}^i{\delta_i}^i, $$ whereas one should instead write $$ U_iV^a {\delta_a}^b{\delta_b}^i = U_iV^a{\delta_a}^i. $$ Indeed, there is no need to add the last term ${\delta_i}^i$ in this expression, since you are using the equality ${\delta_a}^b{\delta_b}^i = {\delta_a}^i$. I think this is the main confusion in your question and it appears several times (you write $V_b{\delta_a}^b=V_a{\delta_a}^a$ instead of $V_b{\delta_a}^b= V_a$ in (D), etc.) Note that you have the following equalities \begin{align} U_iV^a{\delta_a}^b{\delta_b}^i &= (U_i{\delta_b}^i)(V^a{\delta_a}^b) = U_bV^b,\\ U_iV^a{\delta_a}^b{\delta_b}^i&=U_iV^a({\delta_b}^i{\delta_a}^b) = U_iV^a {\delta_a}^i = (U_i{\delta_a}^i)V^a = U_aV^a,\\ U_iV^a{\delta_a}^b{\delta_b}^i&=U_iV^a({\delta_b}^i{\delta_a}^b) =U_iV^a {\delta_a}^i= U_i(V^a{\delta_a}^i) = U_iV^i. \end{align} and thus, the result does not depend on the order of the different contractions.
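These three contraction chains can also be sketched numerically (component values arbitrary), confirming that the order of contraction does not matter:

```python
import numpy as np

delta = np.eye(2)
U = np.array([1.0, 2.0])  # components U_i (values arbitrary)
V = np.array([3.0, 4.0])  # components V^a

# U_i V^a delta_a^b delta_b^i, contracted in three different orders:
r1 = np.einsum('i,bi->b', U, delta) @ np.einsum('a,ab->b', V, delta)  # U_b V^b
d_ai = np.einsum('ab,bi->ai', delta, delta)                           # delta_a^i
r2 = np.einsum('i,ai->a', U, d_ai) @ V                                # U_a V^a
r3 = U @ np.einsum('a,ai->i', V, d_ai)                                # U_i V^i

print(r1, r2, r3)  # all equal: 11.0 11.0 11.0
```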

For what it's worth, let me add that I personally do not use this convention, and more generally, do not use computations in coordinates. This is a matter of taste (and also a cultural thing), but as you can see, it can sometimes be misleading. When completely mastered, this way of doing computations is really powerful, but if you're not comfortable with it (like me), you are more likely to make a lot of errors.