The earliest and simplest method for generating adversarial examples is the Fast Gradient Sign Method (FGSM), introduced in Explaining and Harnessing Adversarial Examples by Goodfellow et al. This non-iterative method generates adversarial examples in a single step: it computes the gradient of the cost function with respect to the input and takes one step of magnitude $\epsilon$ in the direction of the sign of this gradient:

$$\tag{1.1} \widetilde{X} = X + \eta$$

$$\tag{1.2} \eta = \epsilon \, \operatorname{sign}(\nabla_{x} J(\Theta, x, y))$$

Essentially, FGSM takes a single step that increases the cost function for the correct label, in the hope that this is enough to change the top prediction. The main benefit of this technique is that it takes relatively little computation time to create adversarial images.

One downside of FGSM is that the manipulated images are often perceptible to humans for anything but the smallest changes in pixel values. This may be because the method can only shift each pixel value up or down by a constant amount, rather than by a seemingly random one. Additionally, manipulations produced with this technique are particularly noticeable in the darker areas of an image, because of the magnitude of the perturbation relative to the original pixel values. This can be improved by using iterative methods.
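One common iterative refinement (the Basic Iterative Method, not covered in this post) repeats small FGSM steps while keeping the total perturbation inside an $\epsilon$-ball around the original image. A sketch, assuming a PyTorch classifier `model` trained with cross-entropy (hypothetical names, and `alpha` is an assumed per-step size):

```python
import torch
import torch.nn.functional as F

def iterative_fgsm(model, x, y, epsilon, alpha, steps):
    """Repeat small FGSM steps of size alpha, projecting the total
    perturbation back into an epsilon-ball around the original input."""
    x_orig = x.clone().detach()
    x_adv = x_orig.clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        with torch.no_grad():
            # One small step in the direction that increases the loss.
            x_adv = x_adv + alpha * x_adv.grad.sign()
            # Keep the cumulative perturbation within [-epsilon, epsilon].
            x_adv = x_orig + (x_adv - x_orig).clamp(-epsilon, epsilon)
        x_adv = x_adv.detach()
    return x_adv
```

Because each step is small and the result is re-projected, the perturbation can adapt across iterations instead of committing to one constant-magnitude move.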

The notebook is available here.

We first load and preprocess the data as previously explained. The attack is implemented as:
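The original listing is not reproduced here; a minimal sketch in PyTorch (an assumed framework, with hypothetical names `model` for a trained classifier and cross-entropy as the cost $J$) might look like:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon):
    """Compute x_adv = x + epsilon * sign(grad_x J(theta, x, y))."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Step in the direction that increases the loss for the true label y.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.detach()
```

Note that this is gradient *ascent* on the cost: a single step whose per-pixel magnitude is at most $\epsilon$.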

In the data preparation step, the data is both normalized and standardized, so $\epsilon$ has to be as well: it is normalized by dividing it by 255 and standardized by dividing it by the standard deviation of the data (line 17). The adversarial images are then clipped to ensure that their pixel values remain between 0 and 255 after destandardization and denormalization.
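A sketch of this scaling and clipping, assuming the preprocessing described above (division by 255, then by a standard deviation `std` computed on the training data; both names are illustrative):

```python
import numpy as np

def scale_epsilon(epsilon_pixels, std):
    # Epsilon is given in raw pixel units, so it must undergo the same
    # preprocessing as the images: normalize (/255), then standardize (/std).
    return epsilon_pixels / 255.0 / std

def clip_adversarial(x_adv, std):
    # Clip in preprocessed space so pixel values land in [0, 255]
    # after destandardization (* std) and denormalization (* 255).
    lo, hi = 0.0, 1.0 / std  # preprocessed values of pixels 0 and 255
    return np.clip(x_adv, lo, hi)
```

Clipping in preprocessed space avoids a round trip through raw pixel values on every attack step.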

We investigate how the FGSM attack performs in the Results section.
