[Deep Learning Paper Notes][Adversarial Examples] Explaining and Harnessing Adversarial Examples


Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy. “Explaining and harnessing adversarial examples.” arXiv preprint arXiv:1412.6572 (2014). (Citations: 129).


10.3.1 Fast Gradient Sign Method
Suppose we want to perturb X a little, producing X + η with ‖η‖∞ ≤ ε. Because the model behaves almost linearly, many infinitesimal changes to the input can add up to one large change to the output.

Goal

Linearize the cost function J(θ, x, y) around the current value of θ and take the optimal max-norm-constrained perturbation η = ε · sign(∇_x J(θ, x, y)).

See the figure in the paper.
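A minimal PyTorch sketch of this one-step attack, assuming `model` is a classifier trained with cross-entropy and inputs are scaled to [0, 1]; the function name `fgsm_attack` and the clamping range are illustrative choices, not from the paper:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon):
    """Return x + epsilon * sign(grad_x J(theta, x, y)).

    Assumes `model` outputs logits and `x` lives in [0, 1];
    both are assumptions of this sketch, not requirements of the paper.
    """
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)      # J(theta, x, y)
    grad_x, = torch.autograd.grad(loss, x)   # gradient w.r.t. the input only
    x_adv = x + epsilon * grad_x.sign()      # one step along the gradient sign
    return x_adv.clamp(0.0, 1.0).detach()
```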



2 Analysis
CNNs work well on naturally occurring data, but they are exposed as shallow approximations when one visits points in input space that do not have high probability under the data distribution.

Adversarial examples can be explained as a property of high-dimensional dot products. They are a result of models being too linear, rather than too nonlinear. Generic regularization strategies such as dropout, pretraining, and model averaging do not confer a significant reduction in a model’s vulnerability to adversarial examples, but changing to nonlinear model families such as RBF networks can do so.
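A small NumPy sketch of the dot-product argument, under the assumption of a purely linear score w·x: a perturbation η = ε · sign(w), bounded by ε in every coordinate, shifts the activation by ε · ‖w‖₁, which grows with the dimensionality.

```python
import numpy as np

# Sketch: a per-coordinate change of size epsilon, aligned with sign(w),
# changes the linear activation w.x by epsilon * ||w||_1, which grows with n.
rng = np.random.default_rng(0)
epsilon = 0.01
for n in [100, 1_000, 10_000, 100_000]:
    w = rng.normal(size=n)            # weight vector of a linear model
    x = rng.normal(size=n)            # input
    eta = epsilon * np.sign(w)        # max-norm-bounded, sign-aligned perturbation
    delta = w @ (x + eta) - w @ x     # change in the activation
    print(f"n={n:>6}  |delta| = {abs(delta):.2f}"
          f"  (= eps * ||w||_1 = {epsilon * np.abs(w).sum():.2f})")
```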


The generalization of adversarial examples across different models can be explained as a result of adversarial perturbations being highly aligned with the weight vectors of a model, and different models learning similar functions when trained to perform the same task. 


By training on a mixture of adversarial and clean examples, a neural network could be regularized somewhat.
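A minimal PyTorch sketch of such a mixed objective, assuming a cross-entropy classifier; the helper name `adversarial_training_loss` and the values epsilon=0.25, alpha=0.5 echo the paper's MNIST experiments but should be treated as assumptions to tune:

```python
import torch
import torch.nn.functional as F

def adversarial_training_loss(model, x, y, epsilon=0.25, alpha=0.5):
    """Mix the clean loss with the loss on FGSM-perturbed inputs:
    alpha * J(x, y) + (1 - alpha) * J(x + eps * sign(grad_x J), y).
    Sketch only; assumes inputs in [0, 1] and a logit-output classifier.
    """
    clean_loss = F.cross_entropy(model(x), y)

    # Build the FGSM example from the gradient of the loss w.r.t. the input.
    x_req = x.clone().detach().requires_grad_(True)
    attack_loss = F.cross_entropy(model(x_req), y)
    grad_x, = torch.autograd.grad(attack_loss, x_req)
    x_adv = (x_req + epsilon * grad_x.sign()).clamp(0.0, 1.0).detach()

    adv_loss = F.cross_entropy(model(x_adv), y)
    return alpha * clean_loss + (1.0 - alpha) * adv_loss
```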

In particular, this is not a problem specific to deep learning, and it has little to do with ConvNets in particular; the same issue would come up with neural networks applied to any other modality.
