The unnormalized Laplacian is used to approximately minimize RatioCut, while the normalized Laplacian is used to approximately minimize NCut.
Basically, in the unnormalized case you optimize an objective that is normalized by the number of nodes in each cluster, while in the normalized case the objective is normalized by the volume of each cluster.
The square root comes from the identity $f^\top L f = f^\top D(I-D^{-1}A)f = g^\top(I-D^{-1/2}AD^{-1/2})g$, where $g=D^{1/2}f$ and $L=D-A$.
So, if your optimization method relies on a symmetric PSD matrix, you work with the symmetric normalized Laplacian, and then undo the rescaling by computing $f = D^{-1/2}g$.
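As a quick numerical sanity check (a minimal sketch; the 4-node weighted graph below is a made-up example), the two quadratic forms agree and $f$ is recovered by the $D^{-1/2}$ rescaling:

```python
import numpy as np

# Hypothetical 4-node weighted graph (any symmetric adjacency matrix works).
A = np.array([[0., 2., 1., 0.],
              [2., 0., 0., 1.],
              [1., 0., 0., 3.],
              [0., 1., 3., 0.]])
d = A.sum(axis=1)                                 # degree vector
L = np.diag(d) - A                                # unnormalized Laplacian
L_sym = np.diag(d**-0.5) @ L @ np.diag(d**-0.5)   # symmetric normalized Laplacian

f = np.array([1., -2., 0.5, 3.])                  # arbitrary test vector
g = np.sqrt(d) * f                                # g = D^{1/2} f

print(np.isclose(f @ L @ f, g @ L_sym @ g))       # quadratic forms agree: True
print(np.allclose(g / np.sqrt(d), f))             # f = D^{-1/2} g: True
```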
More insight into RatioCut and NCut:
Let $G=(V,E)$ be a graph with vertex set $V=\{v_1,\dots,v_n\}$ and edge set $E$, and let $w_{ij}$ denote the positive weight on the edge between $v_i$ and $v_j$. Let $\{A_i:1 \le i \le k\}$ be a partition of $V$ into disjoint sets. Define
$$
Cut(A_i:1\le i\le k)\triangleq\frac{1}{2}\sum_{c=1}^k \sum_{v_i \in A_c,\,v_j\in \bar A_c}w_{ij}
$$
$$
RatioCut(A_i:1\le i \le k)\triangleq\sum_{i=1}^k\frac{Cut(A_i,\bar A_i)}{|A_i|}
$$
$$
NCut(A_i:1\le i \le k)\triangleq\sum_{i=1}^k\frac{Cut(A_i,\bar A_i)}{vol(A_i)}
$$
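These quantities are easy to compute directly from a weight matrix; here is a small sketch (the matrix `W` and the two-way partition are made-up examples for illustration):

```python
import numpy as np

# Hypothetical weight matrix: a triangle {0,1,2} and a pair {3,4},
# joined by a single crossing edge of weight 1.
W = np.array([[0., 2., 2., 1., 0.],
              [2., 0., 2., 0., 0.],
              [2., 2., 0., 0., 0.],
              [1., 0., 0., 0., 3.],
              [0., 0., 0., 3., 0.]])
deg = W.sum(axis=1)                    # vol(A) = sum of degrees over A

def cut(S, T):
    """Total weight of the edges running between vertex sets S and T."""
    return W[np.ix_(S, T)].sum()

A, Abar = [0, 1, 2], [3, 4]
c = cut(A, Abar)
ratio_cut = c / len(A) + c / len(Abar)
n_cut = c / deg[A].sum() + c / deg[Abar].sum()
print(c, ratio_cut, n_cut)             # 1.0, 5/6, 20/91
```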
RatioCut:
Let
$$\tag{1}\label{1}
f_i=\begin{cases}\sqrt{|\bar A|/|A|} & \text{if } v_i\in A\\-\sqrt{|A|/|\bar A|} & \text{if } v_i \in \bar A\end{cases}
$$
\begin{align*}
f^\top L f & = \frac{1}{2}\sum_{i,j=1}^n w_{ij}(f_i-f_j)^2 \\
& = \frac{1}{2}\sum_{i\in A,j\in \bar A}w_{ij}\left(\sqrt{\frac{|\bar A|}{|A|}}+\sqrt{\frac{|A|}{|\bar A|}}\right)^2 + \frac{1}{2}\sum_{i\in \bar A,j\in A}w_{ij}\left(-\sqrt{\frac{|\bar A|}{|A|}}-\sqrt{\frac{|A|}{|\bar A|}}\right)^2 \\
&=Cut(A,\bar A)\left(\frac{|\bar A|}{|A|}+\frac{|A|}{|\bar A|}+2\right)\\
&=Cut(A,\bar A)\left(\frac{|A|+|\bar A|}{|A|}+\frac{|A|+|\bar A|}{|\bar A|}\right)\\
&=|V| RatioCut(A,\bar A)
\end{align*}
You can also check that $\sum_{i=1}^n f_i=0$, so $f\perp \mathbb 1$, and that $\|f\|^2=n$.
So minimizing RatioCut is equivalent to the following problem:
$$
\min_{A\subset V}f^\top L f\text{ subject to $f\perp\mathbb 1$, $\|f\|^2=n$, $f_i$ as defined in Eq.\eqref{1}}
$$
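This equivalence can be verified numerically; the graph and partition below are made-up, and `f` is the two-valued indicator (positive on $A$, negative on $\bar A$):

```python
import numpy as np

# Hypothetical weighted graph; A = {0,1,2}, Abar = {3,4}.
W = np.array([[0., 2., 2., 1., 0.],
              [2., 0., 2., 0., 0.],
              [2., 2., 0., 0., 0.],
              [1., 0., 0., 0., 3.],
              [0., 0., 0., 3., 0.]])
L = np.diag(W.sum(axis=1)) - W
A, Abar = [0, 1, 2], [3, 4]
a, b = len(A), len(Abar)

f = np.empty(5)
f[A] = np.sqrt(b / a)                  # sqrt(|Abar|/|A|) on A
f[Abar] = -np.sqrt(a / b)              # -sqrt(|A|/|Abar|) on Abar

c = W[np.ix_(A, Abar)].sum()
ratio_cut = c / a + c / b

print(np.isclose(f.sum(), 0.0))                     # f perp 1: True
print(np.isclose(f @ f, a + b))                     # ||f||^2 = n: True
print(np.isclose(f @ L @ f, (a + b) * ratio_cut))   # f^T L f = |V| RatioCut: True
```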
NCut:
For this case, define
$$\tag{2}\label{2}
f_i=\begin{cases}
\sqrt{\frac{vol(\bar A)}{vol(A)}} &\text{if $v_i\in A$}\\
-\sqrt{\frac{vol(A)}{vol(\bar A)}} &\text{if $v_i\in \bar A$}\\
\end{cases}
$$
Similar to above, we have $Df\perp \mathbb 1$, $f^\top D f=vol(V)$ and $f^\top L f=vol(V)NCut(A,\bar A)$.
Minimizing NCut is equivalent to
$$
\min_{A\subset V} f^\top L f \text{ subject to $f$ as in Eq.\eqref{2}, $Df\perp \mathbb 1$ and $f^\top Df=vol(V)$}
$$
Substituting $g=D^{1/2}f$ we get
$$
\min_{A\subset V} g^\top D^{-1/2}LD^{-1/2} g \text{ subject to $g=D^{1/2}f$, $f$ as in Eq.\eqref{2}, $g\perp D^{1/2}\mathbb 1$ and $\|g\|^2=vol(V)$}
$$
Then observe that $D^{-1/2}LD^{-1/2}$ is the symmetric normalized Laplacian.
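The NCut identities can be checked the same way (again on a made-up graph); note that $g^\top D^{-1/2}LD^{-1/2} g = f^\top L f$ by construction:

```python
import numpy as np

# Hypothetical weighted graph; A = {0,1,2}, Abar = {3,4}.
W = np.array([[0., 2., 2., 1., 0.],
              [2., 0., 2., 0., 0.],
              [2., 2., 0., 0., 0.],
              [1., 0., 0., 0., 3.],
              [0., 0., 0., 3., 0.]])
d = W.sum(axis=1)
L = np.diag(d) - W
A, Abar = [0, 1, 2], [3, 4]
volA, volAbar = d[A].sum(), d[Abar].sum()

f = np.empty(5)
f[A] = np.sqrt(volAbar / volA)         # Eq. (2), positive branch on A
f[Abar] = -np.sqrt(volA / volAbar)     # negative branch on Abar

c = W[np.ix_(A, Abar)].sum()
n_cut = c / volA + c / volAbar
g = np.sqrt(d) * f
L_sym = np.diag(d**-0.5) @ L @ np.diag(d**-0.5)

print(np.isclose(d @ f, 0.0))                      # Df perp 1: True
print(np.isclose(f @ (d * f), d.sum()))            # f^T D f = vol(V): True
print(np.isclose(g @ L_sym @ g, d.sum() * n_cut))  # g^T L_sym g = vol(V) NCut: True
```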
I can make two somewhat hand-wavy observations. From basic linear algebra we know that for any square matrix $ \mathbf{L} $
\begin{equation}
\sum_{i}d_{i}= \sum_{i}\lambda_{i}
\end{equation}
where the $ \lambda_{i} $ are the eigenvalues and the $ d_{i} $ are the diagonal elements of $ \mathbf{L} $ (the trace equals the sum of the eigenvalues).
The elements on the diagonal of the Laplacian are the degrees of the vertices. Assuming the graph is simple (i.e., weights are either $0$ or $1$) and has no isolated vertices, $ d_{i} \geq 1 $.
If most of your eigenvalues are equal to $1$, their sum is close to $n$. Since that sum equals the sum of the degrees, i.e. $2|E|$, the graph has roughly $n/2$ edges, which means it is sparsely connected.
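A tiny sketch of the trace argument, on a made-up sparse graph (a path):

```python
import numpy as np

# Hypothetical sparse simple graph: a path on 6 vertices (5 edges).
n = 6
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

eig = np.linalg.eigvalsh(L)
# trace(L) = sum of degrees = 2|E| = sum of the eigenvalues
print(np.isclose(np.trace(L), eig.sum()))   # True
print(np.trace(L))                          # 10.0, i.e. 2 * 5 edges
```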
The next observation has to do with star graphs. In general, the complete bipartite graph with $n$ and $m$ vertices in its two groups has Laplacian eigenvalues:
- $n$ with multiplicity $m-1$
- $m$ with multiplicity $n-1$
- $0$ with multiplicity $1$
- $m+n$ with multiplicity 1
Consider the star graph $K_{1,8}$, i.e., a complete bipartite graph with $n=1$ and $m=8$.
Its eigenvalues are $1$ with multiplicity $7$, $0$ with multiplicity $1$ and $9$ with multiplicity $1$.
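We can confirm the star-graph spectrum numerically (a sketch, with `m = 8` leaves as in the example):

```python
import numpy as np

# Star graph K_{1,8}: center vertex 0 joined to 8 leaves.
m = 8
A = np.zeros((m + 1, m + 1))
A[0, 1:] = A[1:, 0] = 1.0
L = np.diag(A.sum(axis=1)) - A

eig = np.round(np.linalg.eigvalsh(L), 8) + 0.0   # round off floating-point noise
vals, counts = np.unique(eig, return_counts=True)
print({float(v): int(c) for v, c in zip(vals, counts)})  # {0.0: 1, 1.0: 7, 9.0: 1}
```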
So, if your graph contains starlike subgraphs, then your Laplacian is going to have a bunch of $1$-eigenvalues. I include two examples below (the random isolated component belongs to the graph on the right).
*(figure: the two example graphs)*
Both of these graphs have eigenvalue $1$ with multiplicity $9$.
Best Answer
They're all the same thing; different communities just use different names. In graph theory it's always called the "adjacency matrix" for unweighted graphs, and sometimes the "weight matrix" when the graph is weighted. "Affinity" and "similarity" are sometimes used in data science when the weights are computed from some similarity score between the points of a point-cloud data set.
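A sketch of that last usage (the point cloud, bandwidth, and the Gaussian/RBF similarity score are assumed examples, not prescribed by the answer):

```python
import numpy as np

# Hypothetical 2-D point cloud: two tight pairs far apart.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
sigma = 1.0                                   # kernel bandwidth (assumed)

# Pairwise squared distances via broadcasting, then a Gaussian similarity.
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
S = np.exp(-sq / (2 * sigma**2))              # affinity/similarity matrix
np.fill_diagonal(S, 0.0)                      # zero diagonal, like an adjacency matrix

print(np.round(S, 3))
```

Nearby points get affinity close to $1$, distant points close to $0$, so `S` plays the role of a weighted adjacency matrix for the methods above.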