This blog post summarizes and reviews the paper "Robust Graph Convolutional Networks Against Adversarial Attacks" by Dingyuan Zhu, Ziwei Zhang, Peng Cui, and Wenwu Zhu from 2019 [1].

Introduction 

Motivation

A lot of real-world data has an underlying graph structure. In contrast to classical Convolutional Networks, which need Euclidean data as input, Graph Neural Networks (GNNs) are applied to graph data. They can be used, for example, to classify molecules based on their structure, classify nodes in social or financial networks, or determine the role of a protein in a biological interaction graph [2]. Graph Convolutional Networks (GCNs), a special type of GNN, have achieved strong performance in graph-learning tasks. 


Fig. 1: Partial citation network showing PubMed publications on microRNAs from 2001-2016 [3]. 

Problem Statement

Great, so what's the problem? It has been shown that GCNs are sensitive to adversarial attacks [6,7]. In the context of graph data, adversarial attacks are small perturbations that aim to reduce the network's performance. This can be especially problematic in applications that require reliable results, such as the medical or financial sector.

In this paper the model's task is to classify nodes in citation networks, where nodes represent documents and (undirected) edges represent citations between documents. The initial node features are given as bag-of-words vectors indicating the presence of certain keywords in the corresponding document. The predicted output class indicates the overall subject a document belongs to (for example 'probabilistic methods'). A possible adversarial attack could then be to make small changes to the graph structure such that a certain document (= target node) is misclassified (see Fig. 2). We can differentiate between several types of attack [4]: 

  • Poisoning: The attack happens before training, i.e., the training data is changed in a way that leads to errors in the later classification task.
  • Evasion: The attack happens after training. We provide fake samples to the trained model, which lead to decreased performance/misclassifications. 
  • Targeted: The goal is to misclassify a target node. This can be achieved either by manipulating the target node directly (edges, features) or indirectly by manipulating other nodes but not the target node itself ("targeted influence attack").
  • Non-targeted: In this case, we do not have a certain target, but want to decrease the overall performance of the model.


Fig. 2: Example of a targeted indirect poisoning attack that leads to misclassification [4]

But how can we make GCNs more robust against adversarial attacks?

Other Works

In the past two years a variety of works has been released regarding defense mechanisms for GNNs/GCNs, most of them focusing on node classification. Common "attack-agnostic" strategies (robust against a variety of possible attacks, like the strategy in this paper) comprise adversarial training and heuristic assumptions on attack strategies [8]. A well-structured literature overview of adversarial attacks on graph data/graph adversarial learning and different defense approaches can be found in [8] as well as in a regularly updated GitHub repository by the same authors. 

The Authors' Approach

The authors propose a GCN framework with the goal of gaining robustness against adversarial attacks without compromising the model's performance. They take two major measures to make the model less sensitive to adversarial attacks: 

  1. They use Gaussian distributions as feature representations of nodes instead of plain vectors.
  2. They implement an attention mechanism that decreases the influence of attacked nodes on other nodes in the graph. The underlying assumption is that the feature representations of attacked nodes (Gaussian distributions) have higher prediction uncertainties and therefore higher variances. Accordingly, the weights of nodes with higher variances are decreased.    

How this looks in detail is shown in the following section. 


Methodology 

Framework of Robust Graph Convolutional Network (RGCN)

Let us forget about the new framework for a second and have a look at the forward pass of a well-known GCN using a message-passing framework and feature vectors [4]. From there we can easily see what the new framework does differently. If you are already familiar with this, you can skip to the following paragraph.

A simple layer-wise forward pass (in matrix notation) in a graph convolutional network with message-passing can be defined as

(1) \[ \mathbf{H}^{(l+1)}=\rho\left(\tilde{\mathbf{D}}^{-\frac{1}{2}} \tilde{\mathbf{A}} \tilde{\mathbf{D}}^{-\frac{1}{2}} \mathbf{H}^{(l)} \mathbf{W}^{(l)}\right) \]


with A being the adjacency matrix (containing the information about connections between nodes) and H^{(l)} being the output of the previous layer. The multiplication of A and the input H^{(l)} can be interpreted as information aggregation from node neighbours (hence "message-passing") and simply means summing up the feature vectors of all neighbouring nodes. To include each node's own features we add the identity matrix to A (→ self-connections), yielding \tilde{A}. We multiply our inputs with some learnable weights W^{(l)} and feed everything into a non-linear activation function \rho (for example ReLU). To avoid unwanted scaling of the aggregated features we normalize the adjacency matrix symmetrically: \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}. Otherwise feature values would lie in different ranges depending on a node's connectivity, such that better connected nodes would have larger feature values. \tilde{D} denotes the diagonal node degree matrix (self-connections included) containing the number of each node's connections. If you wonder why we use symmetric normalization instead of just multiplying by \tilde{D}^{-1}, check out this blog.
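To make the message-passing step more tangible, here is a minimal NumPy sketch of equation (1) for a single layer on a tiny toy graph. The function and variable names as well as the toy dimensions are my own choices for illustration, not from the paper.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer: H_next = ReLU(D~^-1/2 (A+I) D~^-1/2 H W), cf. equation (1)."""
    A_tilde = A + np.eye(A.shape[0])            # add self-connections
    d = A_tilde.sum(axis=1)                     # node degrees (self-loops included)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))      # \tilde{D}^{-1/2}
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt   # symmetrically normalized adjacency
    return np.maximum(0, A_hat @ H @ W)         # aggregate neighbours, transform, ReLU

# toy example: 4 nodes, 3 input features, 2 hidden units
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
H0 = np.random.rand(4, 3)   # bag-of-words-like node features
W0 = np.random.rand(3, 2)   # learnable weights (random here, trained in practice)
H1 = gcn_layer(A, H0, W0)   # shape (4, 2)
```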

Well, but what's new? 

Gaussian-based Graph Convolutional Layers (GGCL)

Fig. 3: Framework of proposed RGCN model [1]

Let us check out the first difference. Instead of using feature vectors in the hidden layers as usual, features are represented as Gaussian distributions after the first layer (until sampling). Unlike the following layers, the first layer is a fully connected layer which takes feature vectors as input and outputs Gaussian distributions, i.e. the matrices \[ \mathbf{M}^{(1)}=\rho\left(\mathbf{H}^{(0)} \mathbf{W}_{\mu}^{(0)}\right), \quad \Sigma^{(1)}=\rho\left(\mathbf{H}^{(0)} \mathbf{W}_{\sigma}^{(0)}\right). \] These matrices consist of the mean and variance vectors of all nodes. Looking at Fig. 3, after the first layer each feature is represented by one yellow (mean) plus one red (variance) circle, together forming one Gaussian distribution per feature. When sampling from these distributions we would get similar feature vectors most of the time (depending on the variances). The latent representation \mathbf{h}_{i}^{(l)} of one node v_i is then denoted as

(2) \(\mathbf{h}_{i}^{(l)}=\mathcal{N}\left(\boldsymbol{\mu}_{i}^{(l)}, \operatorname{diag}\left(\boldsymbol{\sigma}_{i}^{(l)}\right)\right)\)

with \boldsymbol{\mu}_{i}^{(l)} being the mean vector and \operatorname{diag}\left(\boldsymbol{\sigma}_{i}^{(l)}\right) the diagonal variance matrix. For all nodes we get \mathbf{M}^{(l)}=\left[\boldsymbol{\mu}_{1}^{(l)}, \ldots, \boldsymbol{\mu}_{N}^{(l)}\right] and \Sigma^{(l)}=\left[\boldsymbol{\sigma}_{1}^{(l)}, \ldots, \boldsymbol{\sigma}_{N}^{(l)}\right].
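As a rough sketch of the input layer described above (all names, dimensions, and the choice of ReLU to keep the variances non-negative are assumptions on my part), the first fully connected layer could look like this:

```python
import numpy as np

def first_ggcl_layer(H0, W_mu, W_sigma):
    """Fully connected input layer producing Gaussian node representations.

    Returns M1 (mean vectors) and Sigma1 (variance vectors), one row per node,
    following M^(1) = rho(H^(0) W_mu^(0)) and Sigma^(1) = rho(H^(0) W_sigma^(0)).
    ReLU is used for rho, which also keeps the variances non-negative.
    """
    M1 = np.maximum(0, H0 @ W_mu)
    Sigma1 = np.maximum(0, H0 @ W_sigma)
    return M1, Sigma1

# toy example: 4 nodes with 3 binary bag-of-words features, 2 latent dimensions
H0 = np.random.randint(0, 2, size=(4, 3)).astype(float)
W_mu, W_sigma = np.random.rand(3, 2), np.random.rand(3, 2)
M1, Sigma1 = first_ggcl_layer(H0, W_mu, W_sigma)
```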

Given this representation the forward pass of a GGCL looks like this: 

(3) \[ \begin{gathered} \mathbf{M}^{(l+1)}=\rho\left(\tilde{\mathbf{D}}^{-\frac{1}{2}} \tilde{\mathbf{A}} \tilde{\mathbf{D}}^{-\frac{1}{2}}\left(\mathbf{M}^{(l)} \odot \mathcal{A}^{(l)}\right) \mathbf{W}_{\mu}^{(l)}\right) \\ \Sigma^{(l+1)}=\rho\left(\tilde{\mathbf{D}}^{-1} \tilde{\mathbf{A}} \tilde{\mathbf{D}}^{-1}\left(\Sigma^{(l)} \odot \mathcal{A}^{(l)} \odot \mathcal{A}^{(l)}\right) \mathbf{W}_{\sigma}^{(l)}\right) \end{gathered} \]

Since it is mathematically not possible to apply the forward pass shown in (1) directly to a Gaussian distribution, it is applied to the mean matrix and the variance matrix separately. Note that the variances are propagated with squared coefficients (the \tilde{\mathbf{D}}^{-1} \tilde{\mathbf{A}} \tilde{\mathbf{D}}^{-1} normalization and the squared attention term), mirroring the fact that scaling a random variable by a factor a scales its variance by a^2. And what is \mathcal{A}^{(l)}? \mathcal{A}^{(l)} implements the proposed variance-based attention mechanism. 

Variance-based Attention Mechanism

Message-passing between nodes is normally a desirable property, since it takes dependencies between nodes into account. Under adversarial attacks, however, it becomes a problem, because attacked nodes can influence their neighbours. Therefore we want to minimize the negative effects on the other nodes in the graph. But how do we know a node has been attacked? Attacked nodes tend to have larger variances in their hidden representations. As a logical consequence, the impact of nodes with larger variances should be reduced accordingly. This is achieved by element-wise multiplication with the matrix \mathcal{A}^{(l)}, defined as

(4) \[ \mathcal{A}^{(l)}=\exp \left(-\gamma \Sigma^{(l)}\right) \]

You can easily see that the entries of \mathcal{A}^{(l)}, and thus the impact of attacked node features, become smaller for higher variances.  
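Putting equations (3) and (4) together, a single GGCL step might look like the following sketch; the normalization matrices are built as in the plain GCN example above, and all names and defaults (e.g. gamma) are illustrative, not taken from the paper's implementation.

```python
import numpy as np

def ggcl_layer(A, M, Sigma, W_mu, W_sigma, gamma=1.0):
    """One Gaussian-based graph convolutional layer with variance-based attention.

    Means are propagated with the D~^-1/2 A~ D~^-1/2 normalization, variances
    with D~^-1 A~ D~^-1 and squared attention weights, cf. equations (3) and (4).
    """
    A_tilde = A + np.eye(A.shape[0])
    d = A_tilde.sum(axis=1)
    A_mean = np.diag(d ** -0.5) @ A_tilde @ np.diag(d ** -0.5)  # normalization for means
    A_var = np.diag(d ** -1.0) @ A_tilde @ np.diag(d ** -1.0)   # normalization for variances

    Att = np.exp(-gamma * Sigma)                        # equation (4): high variance -> small weight
    M_next = np.maximum(0, A_mean @ (M * Att) @ W_mu)
    Sigma_next = np.maximum(0, A_var @ (Sigma * Att * Att) @ W_sigma)
    return M_next, Sigma_next
```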

Fig. 4: Variances of attacked nodes, plot from the results section [1]

Loss Function

In order to calculate the loss we first need to sample feature vectors from our last Gaussian feature representation, i.e. for one node we have

(5) \[ \mathrm{z}_{i} \sim \mathcal{N}\left(\mu_{i}^{(L)}, \operatorname{diag}\left(\sigma_{i}^{(L)}\right)\right) \]

We pass our samples \mathbf{Z}=\left[\mathbf{z}_{1}, \ldots, \mathbf{z}_{N}\right] to a softmax function and calculate the predicted classes \hat{\mathbf{Y}} (= subjects assigned to documents). The sample \(\epsilon \sim \mathcal{N}(0, \mathbf{I})\) is used for the reparameterization trick.
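As a sketch of this sampling step (my reading of the equations above; in particular, taking the square root of the variance vector is an assumption about how \sigma enters the sample), the reparameterization and prediction could be written as:

```python
import numpy as np

def sample_and_predict(M_L, Sigma_L, rng=np.random.default_rng(0)):
    """Reparameterized sampling from the final Gaussian layer, followed by a softmax."""
    eps = rng.standard_normal(M_L.shape)            # epsilon ~ N(0, I)
    Z = M_L + eps * np.sqrt(Sigma_L)                # z_i = mu_i + eps * sqrt(sigma_i)  (assumption)
    exp = np.exp(Z - Z.max(axis=1, keepdims=True))  # numerically stable row-wise softmax
    probs = exp / exp.sum(axis=1, keepdims=True)
    return probs.argmax(axis=1), probs              # predicted classes Y_hat and probabilities
```

Because the noise \epsilon is drawn independently of the means and variances, gradients can flow through \mu and \sigma during backpropagation, which is the point of the reparameterization trick.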

Now we calculate our loss: 

(6) \[ \mathcal{L}=\mathcal{L}_{\text {cls }}+\beta_{1} \mathcal{L}_{\text {reg } 1}+\beta_{2} \mathcal{L}_{\text {reg } 2} \]

The loss function consists of the cross-entropy loss \mathcal{L}_{\text{cls}} (in our case for semi-supervised multiclass classification) and two weighted regularization terms.

The first regularization term is needed to make sure that the representations learned in the first layer really behave like Gaussian distributions; it penalizes their divergence from the prior \mathcal{N}(\mathbf{0}, \mathbf{I}): 

(7) \[ \mathcal{L}_{\text {reg } 1}=\sum_{i=1}^{N} K L\left(\mathcal{N}\left(\boldsymbol{\mu}_{i}^{(1)}, \operatorname{diag}\left(\boldsymbol{\sigma}_{i}^{(1)}\right)\right) \| \mathcal{N}(\mathbf{0}, \mathbf{I})\right) \]

The KL-divergence \(KL(\cdot \| \cdot)\) measures the difference between two distributions, here between the learned Gaussian and the prior, such that minimizing the loss keeps the first-layer representations close to well-behaved Gaussian distributions. We only need this regularization for the first layer, since the following layers are Gaussian distributions by definition. Furthermore, L2-regularization is applied to the first-layer weight matrices as follows: 

(8) \[ \mathcal{L}_{\text {reg } 2}=\left\|\mathbf{W}_{\mu}^{(0)}\right\|_{2}^{2}+\left\|\mathbf{W}_{\sigma}^{(0)}\right\|_{2}^{2} \]
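A minimal sketch of the full objective (6), assuming the standard closed form of the KL divergence between a diagonal Gaussian and \mathcal{N}(\mathbf{0}, \mathbf{I}); the weighting factors below are placeholders, not the values used in the paper:

```python
import numpy as np

def rgcn_loss(probs, labels, labelled_idx, M1, Sigma1, W_mu0, W_sigma0,
              beta1=5e-4, beta2=5e-4, eps=1e-8):
    """Total loss L = L_cls + beta1 * L_reg1 + beta2 * L_reg2, cf. equation (6)."""
    # cross-entropy over the labelled nodes (semi-supervised setting)
    L_cls = -np.mean(np.log(probs[labelled_idx, labels[labelled_idx]] + eps))

    # KL( N(mu, diag(sigma)) || N(0, I) ) in closed form, summed over nodes and dimensions
    L_reg1 = 0.5 * np.sum(Sigma1 + M1 ** 2 - 1.0 - np.log(Sigma1 + eps))

    # L2 regularization of the first-layer weight matrices
    L_reg2 = np.sum(W_mu0 ** 2) + np.sum(W_sigma0 ** 2)

    return L_cls + beta1 * L_reg1 + beta2 * L_reg2
```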


Experimental Setup 

Baselines and Attack Methods

In order to test the framework and evaluate its effectiveness against adversarial attacks, the following experimental setup is used. Two state-of-the-art models serve as baselines for comparison, both on clean datasets (no attacks, to show that effectiveness is not compromised) and under adversarial attacks: 

  • GCN from [4]: the well-known GCN architecture
  • GAT [5]: an improved GCN that assigns different attention weights to neighbouring nodes 

To evaluate the robustness against adversarial attacks the following attack methods have been chosen: 

  • Random Attack: randomly inserts fake edges into the graph; non-targeted
  • RL-S2V: a targeted evasion attack; performs direct attacks
  • NETTACK: targeted attack, either as an influence attack (NETTACK-In) or a direct attack (NETTACK-Di); here used for poisoning

The general focus lies on changing the graph structure. 

Datasets 

The authors apply their model to the three datasets (citation networks) shown below: 

\begin{tabular}{|c|c|c|c|} \hline & Cora & Citeseer & Pubmed \\ \hline \#nodes & 2,708 & 3,327 & 19,717 \\ \hline \#edges & 5,429 & 4,732 & 44,338 \\ \hline \#classes & 7 & 6 & 3 \\ \hline \#features & 1,433 & 3,703 & 500 \\ \hline \end{tabular}

Fig. 5: Properties of used datasets [1]

The number of layers is set to two for all experiments. 


Results

Performance on Clean Data

Node classification on clean (non-attacked) datasets shows that the RGCN achieves performance comparable to the baseline models.


\begin{tabular}{|c|c|c|c|} \hline & Cora & Citeseer & Pubmed \\ \hline GCN & 81.5 \pm 0.5 & 70.9 \pm 0.5 & 79.0 \pm 0.3 \\ \hline GAT & 83.0 \pm 0.7 & 72.5 \pm 0.7 & 79.0 \pm 0.3 \\ \hline RGCN & 82.8 \pm 0.6& 71.2 \pm 0.5 & 79.1 \pm 0.3 \\ \hline \end{tabular}

Fig. 6: Classification accuracies in percent [1]

Performance against Adversarial Attacks

The classification accuracy serves as performance metric and is reported against the noise-edge ratio (number of inserted noise edges relative to the number of clean edges). 

Random Attack

The RGCN outperforms the other GCNs on all datasets. GAT seems to be especially sensitive to adversarial attacks; its attention mechanism apparently cannot adapt well enough to counter them.

Fig. 7: Classification accuracy of different GCNs vs. noise edge ratio [1]

Targeted Attacks

Fig. 8 shows how the classification accuracy changes under the RL-S2V attack, NETTACK-Di, and NETTACK-In, using the Cora dataset as an example. The RGCN appears to be consistently more robust against all performed targeted attacks than the baseline models.


Fig. 8: Classification accuracy on Cora dataset vs. number of perturbations per node for RL-S2V (left), NETTACK-Di (middle) and NETTACK-In (right) [1]

Fig. 9 demonstrates that the attention mechanism actually increases robustness. However, when \gamma (see equation (4)) is too large, message-passing along "real" (clean) edges may also be blocked, which decreases performance.

Fig. 9: Accuracy without the attention mechanism (\gamma = 0) vs. with the attention mechanism (\gamma > 0) [1]


Review

I think that the authors present a very interesting defense approach overall. I believe its potential lies in its generality: it works against different adversarial attacks and, according to the authors, could be extended to other graph models and more complicated graph structures such as heterogeneous graphs or graphs with edge attributes [1]. 

In my opinion there could be more explanation of the math behind the approach to make it easier for the reader to understand what happens; some examples would have helped (e.g., relating the theory more closely to the actual document classification task). It would also have been interesting to get access to the actual implementation, which unfortunately is not available; only the code of the baseline models and the attack methods can be found online (see links in [1]). The number of layers in the experiments is always two, as suggested for the baseline models; varying the number of layers could have shown interesting results. Furthermore, the results section of the paper is written in a rather general way; more details and measures of why the results are significantly better than those of the baseline methods would have helped to verify them. 

Since 2019 a variety of research papers about attack-agnostic defense strategies for GCNs has been published (see this GitHub repository). For further research it would be interesting to see how this method compares to those strategies.  


References

[1] Zhu, D., Zhang, Z., Cui, P., & Zhu, W. (2019). Robust Graph Convolutional Networks Against Adversarial Attacks. In A. Teredesai (Ed.), ACM Digital Library, Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 1399–1407). Association for Computing Machinery. https://doi.org/10.1145/3292500.3330851

[2] Hamilton, W. L., Ying, R., & Leskovec, J. (2017, June 7). Inductive Representation Learning on Large Graphs. https://arxiv.org/pdf/1706.02216 

[3] Kawalec, P. (2018). Transformations in Breakthrough Research: The Emergence of Mirnas as a Research Routine in Molecular Biology. Open Information Science, 2(1), 127–146. https://doi.org/10.1515/opis-2018-0010

[4] Kipf, T. N., & Welling, M. (2016, September 9). Semi-Supervised Classification with Graph Convolutional Networks. https://arxiv.org/pdf/1609.02907 

[5] Veličković, P., Cucurull, G., Casanova, A., Romero, A., Liò, P., & Bengio, Y. (2017, October 30). Graph Attention Networks. https://arxiv.org/pdf/1710.10903 

[6] Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., & Fergus, R. (2013, December 21). Intriguing properties of neural networks. https://arxiv.org/pdf/1312.6199 

[7] Goodfellow, I. J., Shlens, J., & Szegedy, C. (2014, December 20). Explaining and Harnessing Adversarial Examples. https://arxiv.org/pdf/1412.6572 

[8] Sun, L., Dou, Y., Yang, C., Wang, J., Yu, P. S., He, L., & Li, B. (2018, December 26). Adversarial Attack and Defense on Graph Data: A Survey. https://arxiv.org/pdf/1812.10528