Difference between the symmetrically normalized Laplacian matrix and the graph Laplacian matrix

conv-neural-network, graph-theory, matrix, symmetry

I am trying to understand the graph Laplacian matrix in Graph Convolution Networks.

[image: the Laplacian definition used in the Graph Convolution Network material]

To get a basic understanding of the graph Laplacian matrix, I am referring to this post:

https://mbernste.github.io/posts/laplacian_matrix/

However, the two definitions are different. What is the difference? Are they the same? What does symmetric normalization mean here?

$L \neq D - A$

The graphs we are dealing with here are taxi demand at different regions of a city at different points in time.

Thanks.

Best Answer

In spectral graph theory, there are several different types of Laplacian matrices.

The Laplacian $$ L^u = D - A $$ is also called the unnormalized graph Laplacian; here $A$ is the adjacency matrix of the graph and $D$ is the diagonal degree matrix.

On the other hand, the Laplacian $$ L^s = \mathbf 1 - D^{-1/2}AD^{-1/2} $$ (where $\mathbf 1$ denotes the identity matrix) is often called the symmetric normalized graph Laplacian.
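For concreteness, here is a minimal NumPy sketch that builds both Laplacians from an adjacency matrix. The three-node path graph and all variable names are my own illustrative choices, not from the question:

```python
import numpy as np

# Small undirected example graph: a path 0 -- 1 -- 2
# (an illustration only, not the taxi-demand graph from the question).
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])

# Degree matrix D: diagonal matrix of the row sums of A.
D = np.diag(A.sum(axis=1))

# Unnormalized graph Laplacian: L^u = D - A.
L_u = D - A

# Symmetric normalized graph Laplacian: L^s = I - D^{-1/2} A D^{-1/2}.
# This assumes every node has degree >= 1; otherwise 1/sqrt(deg) is undefined.
D_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(D)))
L_s = np.eye(A.shape[0]) - D_inv_sqrt @ A @ D_inv_sqrt

print(L_u)  # [[ 1. -1.  0.], [-1.  2. -1.], [ 0. -1.  1.]]
print(L_s)  # diagonal entries 1, off-diagonal entries -1/sqrt(2)
```

Already on this tiny graph the two matrices differ, which is the point made next.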

Those two matrices are usually not the same. $L^s$ is called *symmetric* because it is a symmetric matrix, i.e. $L^s_{ij} = L^s_{ji}$. This can easily be seen by showing that it is its own transpose, $L^s = (L^s)^t$, using the facts that $A^t = A$ (the graph is undirected) and that $D^{-1/2}$ is diagonal, hence $(D^{-1/2})^t = D^{-1/2}$:
$$
\begin{align}
(L^s)^t &= \left(\mathbf 1 - D^{-1/2}AD^{-1/2}\right)^t\\
&= \mathbf 1^t - \left(D^{-1/2}AD^{-1/2}\right)^t\\
&= \mathbf 1 - \left(D^{-1/2}\right)^t A^t \left(D^{-1/2}\right)^t\\
&= \mathbf 1 - D^{-1/2}AD^{-1/2}\\
&= L^s.
\end{align}
$$

Furthermore, it is called *normalized* in the following sense: different nodes have different degrees (the diagonal entries of the matrix $D$), and those with large degrees "dominate" the matrix $A$, which is undesirable in certain situations. One therefore wants to reduce this dominance, and this reduction is the "normalization". It works as follows. First, $L^s = D^{-1/2}\,L^u\,D^{-1/2}$, because:
$$
\begin{align}
L^s &= \mathbf 1 - D^{-1/2}AD^{-1/2}\\
&= D^{-1/2}DD^{-1/2} - D^{-1/2}AD^{-1/2}\\
&= D^{-1/2}\left(D - A\right)D^{-1/2}\\
&= D^{-1/2} L^u D^{-1/2}.
\end{align}
$$
This transformation of multiplying by $D^{-1/2}$ from the left and from the right is exactly the normalization we want: every row and column of the adjacency matrix $A$ that involves a high-degree node $x$, say $\deg(x) = d$ with $d \gg 1$, gets multiplied by the small factor $\frac{1}{\sqrt d}$, which damps the entries associated with that node.
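As a quick numerical sanity check, here is a sketch (under the same assumptions as above: the toy graph is my own, and every node must have degree at least one so that $D^{-1/2}$ exists) that verifies both the symmetry and the normalization identity:

```python
import numpy as np

# Same toy path graph as above; no isolated nodes, so D^{-1/2} is well defined.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
D = np.diag(A.sum(axis=1))
D_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(D)))

L_u = D - A
L_s = np.eye(A.shape[0]) - D_inv_sqrt @ A @ D_inv_sqrt

# Symmetry: L^s equals its own transpose.
assert np.allclose(L_s, L_s.T)

# Normalization identity: L^s = D^{-1/2} L^u D^{-1/2}.
assert np.allclose(L_s, D_inv_sqrt @ L_u @ D_inv_sqrt)

# And the two Laplacians are indeed different matrices here.
assert not np.allclose(L_u, L_s)
```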