Weighting functions in the local polar system of coordinates.

geometric-topology, geometry, graph theory, manifolds

While I was reading the paper "Geometric deep learning on graphs and manifolds using mixture model CNNs", I did not understand the figure of "patch operator weighting functions". Can someone explain clearly how these red curves relate to graphs and manifolds? Thank you.

[Figure: patch operator weighting functions from the paper]

Best Answer

Any differentiable manifold is locally homeomorphic to Euclidean space. In other words, if we select a point on the manifold, then over very small distances the manifold can be approximated by Euclidean space. It is then possible to parameterise the manifold with local polar coordinates $(\rho,\theta)$, which behave like ordinary polar coordinates in a small neighbourhood of the selected point.
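For intuition, here is a minimal NumPy sketch (mine, not from the paper) of how small tangent-plane offsets around a selected point could be converted into local polar coordinates $(\rho,\theta)$:

```python
import numpy as np

def local_polar_coordinates(offsets):
    """Convert small 2D tangent-plane offsets (dx, dy) around a chosen point
    into local polar coordinates (rho, theta).

    offsets : array of shape (n, 2), displacements from the selected point.
    Returns rho (radial distance) and theta (angle in [0, 2*pi)).
    """
    dx, dy = offsets[:, 0], offsets[:, 1]
    rho = np.hypot(dx, dy)                        # locally, geodesic distance ~ Euclidean distance
    theta = np.mod(np.arctan2(dy, dx), 2 * np.pi)
    return rho, theta

# Example: three neighbours of a point, expressed as small offsets
neigh = np.array([[0.1, 0.0], [0.0, 0.2], [-0.1, -0.1]])
rho, theta = local_polar_coordinates(neigh)
print(rho, theta)
```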

The models GCNN, ACNN and MoNet each use a differentiable manifold parameterised by local polar coordinates. Each has a family of weighting functions, called the patch operator weighting functions $w_i(\rho,\theta)$. Table $1$ in the paper gives $w_i(\rho,\theta)$ for GCNN and ACNN.

The red curves are $0.5$ level sets. That is to say, $w_i(\rho,\theta)=0.5$ along the red curves.
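To make this concrete, here is a small illustrative sketch (my own, using a simple Gaussian bump in $(\rho,\theta)$ rather than the exact kernels from Table 1) that draws the curve where the weighting function equals $0.5$:

```python
import numpy as np
import matplotlib.pyplot as plt

# A toy Gaussian-style weighting function centred at (rho_0, theta_0);
# the hand-crafted GCNN/ACNN kernels have a similar bump shape.
rho_0, theta_0 = 0.5, np.pi / 2
sigma_rho, sigma_theta = 0.1, 0.4

rho, theta = np.meshgrid(np.linspace(0, 1, 200), np.linspace(0, 2 * np.pi, 200))
w = np.exp(-((rho - rho_0) ** 2) / (2 * sigma_rho ** 2)
           - ((theta - theta_0) ** 2) / (2 * sigma_theta ** 2))

# Plot in the plane, drawing the 0.5 level set in red --
# this is what the red curves in the figure represent.
x, y = rho * np.cos(theta), rho * np.sin(theta)
plt.pcolormesh(x, y, w, shading="auto")
plt.contour(x, y, w, levels=[0.5], colors="red")
plt.gca().set_aspect("equal")
plt.show()
```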


Edit: The OP asked about the definition of MoNet

In section 4, the paper mentions using a weighting function of the form $w_j(\mathbf{u})=\exp\left(-\frac{1}{2}(\mathbf{u}-\boldsymbol{\mu}_j)^{\top}\boldsymbol{\Sigma}_j^{-1}(\mathbf{u}-\boldsymbol{\mu}_j)\right)$ with $\boldsymbol{\Sigma}_j$ and $\boldsymbol{\mu}_j$ learnable (formula 11 in the paper), where $\boldsymbol{\Sigma}_j$ is restricted to being a diagonal matrix.
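As a quick numerical sketch (mine, assuming a diagonal $\boldsymbol{\Sigma}_j$ as in the paper), the kernel can be evaluated like this:

```python
import numpy as np

def monet_kernel(u, mu_j, sigma_j_diag):
    """Evaluate w_j(u) = exp(-1/2 (u - mu_j)^T Sigma_j^{-1} (u - mu_j))
    for a diagonal covariance Sigma_j = diag(sigma_j_diag).

    u            : pseudo-coordinates, shape (..., d), e.g. d = 2 for (rho, theta)
    mu_j         : mean, shape (d,)
    sigma_j_diag : diagonal entries of Sigma_j, shape (d,)
    """
    diff = u - mu_j
    return np.exp(-0.5 * np.sum(diff ** 2 / sigma_j_diag, axis=-1))

# Example: a kernel centred at rho = 0.5, theta = pi/2, evaluated at two points
u = np.array([[0.5, np.pi / 2], [0.9, 0.0]])
print(monet_kernel(u, mu_j=np.array([0.5, np.pi / 2]),
                   sigma_j_diag=np.array([0.01, 0.25])))
```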

The paper then describes the neural network used to learn $\boldsymbol{\Sigma}_j$ and $\boldsymbol{\mu}_j$ and the procedure used to train it. The Adam method is explained in the following paper: https://arxiv.org/abs/1412.6980. The relevant experimental details, quoted from the paper, are:

LeNet used 2×2 max-pooling; in ChebNet and MoNet we used three convolutional layers, interleaved with pooling layers based on the Graclus method [16] to coarsen the graph by a factor of four.

For MoNet, we used polar coordinates $\mathbf{u} = (\rho,\theta)$ of pixels (respectively, of superpixel barycenters) to produce the patch operator; as the weighting functions of the patch operator, 25 Gaussian kernels (initialized with random means and variances) were used. Training was done with 350K iterations of Adam method [25], initial learning rate $10^{-4}$, regularization factor $10^{-4}$, dropout probability 0.5, and batch size of 10.
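For concreteness, the quoted setup could be sketched roughly as follows in PyTorch; this is a hypothetical illustration based only on the hyperparameters above, not the authors' code:

```python
import torch
import torch.nn as nn

class GaussianPatchOperator(nn.Module):
    """25 Gaussian kernels over pseudo-coordinates u = (rho, theta),
    with learnable means and diagonal (co)variances, as in formula (11)."""
    def __init__(self, n_kernels=25, dim=2):
        super().__init__()
        self.mu = nn.Parameter(torch.rand(n_kernels, dim))       # random initial means
        self.log_var = nn.Parameter(torch.rand(n_kernels, dim))  # random initial (log-)variances

    def forward(self, u):
        # u: (n_edges, dim) pseudo-coordinates; returns (n_edges, n_kernels) weights
        diff = u.unsqueeze(1) - self.mu.unsqueeze(0)              # (n_edges, n_kernels, dim)
        inv_var = torch.exp(-self.log_var).unsqueeze(0)           # diagonal of Sigma_j^{-1}
        return torch.exp(-0.5 * (diff ** 2 * inv_var).sum(-1))

model = GaussianPatchOperator()
# Adam with the quoted learning rate and regularization (weight decay) factors;
# dropout 0.5 and batch size 10 would apply to the full network, omitted here.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)
```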
