Solved – Graph convolutional neural networks: how to determine the feature matrix

conv-neural-network, graph-theory, graph-neural-network

I've recently been studying graph convolutional networks (GCNs). Given a triplet (A,F,C), where A is the network adjacency matrix, F is the network feature matrix and C is the matrix of classes corresponding to each node, GCNs can learn node representations for classifying the nodes.

The original paper (Kipf and Welling, 2016; code at https://github.com/tkipf/gcn) uses the Cora dataset and some others, where the features are predefined. My question is: how does one define the feature matrix? Where does it come from?

Thank you!

Best Answer

The feature matrix is defined by the features (variables) of the dataset you're using. Each row of your feature matrix corresponds to a node in your graph, with one column per feature. Let's use an example from the paper:

Per your Cora dataset example, the paper states they used the "sparse bag-of-words feature vectors for each document" to create their feature matrix. So each individual research paper is represented by a $1 \times V$ vector, where $V$ is the size of the vocabulary. Each entry in the bag-of-words vector is either $0$ or $1$, indicating whether the word encoded at that index appears in the research paper.
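To make the bag-of-words idea concrete, here is a minimal sketch in plain Python. The three toy "documents" and the whitespace tokenization are assumptions for illustration only; the actual Cora vocabulary and preprocessing come with the dataset and are not reproduced here.

```python
# Toy corpus (made up): each string stands in for one research paper,
# i.e. one node of the graph.
documents = [
    "graph neural networks learn node representations",
    "convolutional networks classify images",
    "graph convolutional networks classify nodes",
]

# Vocabulary: every distinct word, sorted so column indices are reproducible.
vocabulary = sorted({word for doc in documents for word in doc.split()})
word_index = {word: i for i, word in enumerate(vocabulary)}


def bag_of_words(doc):
    """Return a binary 1 x V row: 1 if the word at that index occurs in doc."""
    row = [0] * len(vocabulary)
    for word in doc.split():
        row[word_index[word]] = 1
    return row


# Feature matrix F: one row per document (node), one column per vocabulary word.
F = [bag_of_words(doc) for doc in documents]

for doc, row in zip(documents, F):
    print(row, "<-", doc)
```

Stacking these rows gives a matrix with one row per node and $V$ columns. With a realistic vocabulary most entries are $0$, which is why such features are usually stored in a sparse format rather than as nested lists.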

With that in mind, the example GCN matrix triplet $\text{(A,F,C)}$ for the Cora dataset would be:

  • $\text{A}$ would be the $N \times N$ adjacency matrix of citation links between the $N$ research papers (the paper treats citations as undirected edges, so $\text{A}$ is symmetric)
  • $\text{F}$ would be the $N \times V$ matrix whose rows are the research papers' bag-of-words vectors
  • $\text{C}$ would be the matrix of each research paper's class encoding (Cora has 7 mutually exclusive labels)
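The triplet described above can be sketched on a tiny graph. Everything in this example is hypothetical: the four papers, the citation pairs, the feature values and the three classes are made up, and a one-hot layout for the class matrix is an assumption about the encoding. Only the shapes and the way each matrix is built mirror the Cora setup.

```python
# Hypothetical toy graph: 4 papers, citation pairs given as (citing, cited).
num_nodes = 4
citations = [(0, 1), (1, 2), (2, 3), (0, 3)]

# A: N x N adjacency matrix. Citations are treated as undirected edges,
# so both A[i][j] and A[j][i] are set and A ends up symmetric.
A = [[0] * num_nodes for _ in range(num_nodes)]
for i, j in citations:
    A[i][j] = 1
    A[j][i] = 1

# F: N x V binary bag-of-words matrix (V = 5 here; values are made up).
F = [
    [1, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [1, 0, 0, 0, 1],
]

# C: N x K one-hot class matrix built from integer labels
# (K = 3 here; Cora itself has 7 classes).
num_classes = 3
labels = [0, 2, 1, 0]
C = [[1 if k == label else 0 for k in range(num_classes)] for label in labels]

# All three matrices share the same row order: row i always describes paper i.
print("A:", len(A), "x", len(A[0]))
print("F:", len(F), "x", len(F[0]))
print("C:", len(C), "x", len(C[0]))
```

The key point is that the row index means the same thing in all three matrices, so row $i$ of $\text{A}$, $\text{F}$ and $\text{C}$ all refer to the same paper.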

Hope this helps!
