Discussion - Fully Convolutional Network vs. Autoencoder

1 minute read

I don’t really think this is a fully convolutional network (FCN), but rather an auto-encoder for denoising (it’s true that an auto-encoder doesn’t necessarily need dense layers). An FCN should be able to take images of any size and aims to detect objects in an image.

From the video Developing and Training Fully Convolutional Network in Tensorflow

Discussion

Thanks for your comment. I think you are right to emphasize the task for which the network is used.

In the example, I use it for a denoising task, so the description of an autoencoder matches: “An autoencoder is a neural network that is trained to attempt to copy its input to its output,” according to the Deep Learning book, p. 502.

In the original paper, FCNs are used for segmentation. They are described as an architecture that does not use any dense layers: “While a general deep net computes a general nonlinear function, a net with only layers of this form computes a nonlinear filter, which we call a deep filter or fully convolutional network.” “This form” refers to using only locally connected layers. From the absence of dense layers it also follows that the input can be of arbitrary size.
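To make the point concrete, here is a minimal sketch (assuming TensorFlow 2.x / Keras; the layer sizes are arbitrary choices for illustration) of a network that is both fully convolutional in the architectural sense (no dense layers, so the spatial input size is arbitrary) and an autoencoder in the task sense (it could be trained to map noisy images back to clean ones):

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_fcn_autoencoder():
    # Height and width are None: any image size is accepted,
    # because only locally connected (convolutional) layers are used.
    inputs = tf.keras.Input(shape=(None, None, 1))
    x = layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
    x = layers.Conv2D(16, 3, padding="same", activation="relu")(x)
    # Final convolution maps back to one channel; with "same" padding,
    # the output keeps the spatial size of the input.
    outputs = layers.Conv2D(1, 3, padding="same", activation="sigmoid")(x)
    return tf.keras.Model(inputs, outputs)


model = build_fcn_autoencoder()
# The same model runs on inputs of different sizes:
a = model(tf.zeros((1, 28, 28, 1)))
b = model(tf.zeros((1, 64, 48, 1)))
```

Replacing any layer with a `Dense` layer would fix the input size and break this property, which is exactly the distinction the paper draws.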

In conclusion, FCN and autoencoder both describe network architectures and are not mutually exclusive terms. You are right, however, that based on how the terms are commonly used, the network from the example should be called an autoencoder rather than an FCN.

References

Deeplearningbook: Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning, 2016. https://www.deeplearningbook.org.

Paper: Long, Jonathan, Evan Shelhamer, and Trevor Darrell. “Fully Convolutional Networks for Semantic Segmentation.” ArXiv:1411.4038 [Cs], March 8, 2015. http://arxiv.org/abs/1411.4038.