Understanding Convolutions by hand vs TensorFlow

Do you think we can match TensorFlow by hand? You bet!

Steven Smiley

Published in Towards Data Science, May 23, 2021 (6 min read)

1. Purpose

TensorFlow and various other open-source machine learning libraries, like SciPy, provide nice built-in functions for performing convolutions. However, as nice as these functions are, it is worth opening the hood to see what powers the code. In my opinion, without the convolutional layer, computer vision would be as blind as a bat. So I hope you enjoy this article, because we will dig into the convolutions that make up convolutional layers and see the big picture together.

The Jupyter Notebooks I made for this are on my GitHub.

2. Background

Convolutions did not originate with deep learning for computer vision; they are a long-standing technique in image processing. In image processing, a convolution is the process of adding each element of the input image to its local neighbors, weighted by the kernel.¹ The size of the output then depends on four things: the input filter, the input volume, the zero-padding, and the stride, each covered in turn below.

3. Input Filter

Kernels can come in all shapes and sizes. Stacked across the input channels, kernels make up filters, and the filters are the parameters of a convolutional layer. For this article, I will stick with a common filter based on edge detection¹:
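As a minimal sketch, the snippet below assumes the filter is the 8-neighbor edge detection kernel from the Wikipedia article cited above; the article's own filter may be a different variant. The names `edge_kernel_2d` and `edge_filter` are illustrative only. The 3x3 kernel is repeated along a third axis so that it matches an input with three channels.

```python
import numpy as np

# Assumed kernel: the 8-neighbor edge detection kernel from the Wikipedia
# "Kernel (image processing)" article; the article's own filter may differ.
edge_kernel_2d = np.array([[-1, -1, -1],
                           [-1,  8, -1],
                           [-1, -1, -1]], dtype=np.float32)

# Repeat the 3x3 kernel along a third axis so the filter has shape
# (height, width, channels) = (3, 3, 3), matching a depth-3 input volume.
edge_filter = np.repeat(edge_kernel_2d[:, :, np.newaxis], 3, axis=2)

print(edge_filter.shape)  # (3, 3, 3)
print(edge_filter.sum())  # 0.0
```

The weights sum to zero, so a perfectly flat region of an image produces no response; only changes in intensity do.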

4. Input Volume

To keep the math in this example easy to check, I will use a random input volume of 5x5x3:
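One way such a volume could be generated is sketched below. The seed, the value range (0 to 2), and the name `input_volume` are assumptions made for illustration, so the numbers will not match the article's own example.

```python
import numpy as np

# Hypothetical stand-in for the article's input: a seeded random 5x5x3
# volume of small integers, laid out as (height, width, channels).
rng = np.random.default_rng(seed=0)
input_volume = rng.integers(low=0, high=3, size=(5, 5, 3)).astype(np.float32)

print(input_volume.shape)     # (5, 5, 3)
print(input_volume[:, :, 0])  # first channel as a 5x5 grid
```

Small integer values keep every product in the later hand calculation easy to verify.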

5. Zero-Padding

Zero-padding is usually added around input images. So in this example, I will add zero-padding of P=1, which turns the 5x5x3 input into a 7x7x3 volume:
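A short sketch of that padding step with `np.pad` follows. It uses a placeholder volume of ones (an assumption, chosen so the zero border is easy to spot) rather than the article's random input.

```python
import numpy as np

P = 1  # zero-padding width from the text

# Placeholder 5x5x3 volume of ones so the zero border is easy to see.
input_volume = np.ones((5, 5, 3), dtype=np.float32)

# Pad height and width by P on each side; leave the channel axis alone.
padded = np.pad(input_volume, pad_width=((P, P), (P, P), (0, 0)),
                mode="constant", constant_values=0)

print(padded.shape)     # (7, 7, 3)
print(padded[:, :, 0])  # a ring of zeros around a 5x5 block of ones
```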

6. Stride

For this example, the stride will be S=2.
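With the filter size, input size, padding, and stride all fixed, the spatial size of the output follows from the standard formula (W - F + 2P) / S + 1. The helper below is a small sketch of that arithmetic; the function name `conv_output_size` is made up for this example.

```python
def conv_output_size(w: int, f: int, p: int, s: int) -> int:
    """Spatial output size for input width w, filter width f, padding p, stride s.

    Uses floor division, so a filter position that would run past the
    padded edge is simply dropped.
    """
    return (w - f + 2 * p) // s + 1

# Values from this article: 5x5 input, 3x3 filter, P=1, S=2.
print(conv_output_size(w=5, f=3, p=1, s=2))  # 3
```

So a single 3x3x3 filter applied to the padded 7x7x3 volume with S=2 should give a 3x3 output.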

7. Hand Calculate the Output Volume

Passing the filter over the input volume’s local receptive field at the top left:

Moving by the stride, S=2:
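These two steps can be sketched as follows. The snippet assumes a 5x5x3 input of all ones (chosen so every product is easy to verify by eye) and the 8-neighbor edge kernel from the Wikipedia article; neither is necessarily what the article used.

```python
import numpy as np

S = 2  # stride from the text

# Assumed inputs: an all-ones 5x5x3 volume padded with P=1, and the
# 8-neighbor edge kernel copied across three channels.
padded = np.pad(np.ones((5, 5, 3), dtype=np.float32),
                ((1, 1), (1, 1), (0, 0)))
kernel_2d = np.array([[-1, -1, -1],
                      [-1,  8, -1],
                      [-1, -1, -1]], dtype=np.float32)
edge_filter = np.repeat(kernel_2d[:, :, np.newaxis], 3, axis=2)

# Top-left receptive field: rows 0..2, columns 0..2, all three channels.
# Multiply element by element with the filter, then sum everything.
window = padded[0:3, 0:3, :]
top_left = np.sum(window * edge_filter)

# Move right by the stride: same rows, columns S..S+2.
window_next = padded[0:3, S:S + 3, :]
next_value = np.sum(window_next * edge_filter)

print(top_left)    # 15.0  (per channel: 8 - 3 neighbors = 5, times 3 channels)
print(next_value)  # 9.0   (per channel: 8 - 5 neighbors = 3, times 3 channels)
```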

Putting this into a function by hand:

Passing the filter over the entire input volume now:
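A minimal sketch of such a function, and of a full pass over the input, is shown below. The name `conv2d_by_hand`, its signature, and the all-ones input are assumptions for illustration, not the article's original code. Like `tf.nn.conv2d`, the sketch does not flip the kernel, which makes no difference for a symmetric kernel like this one.

```python
import numpy as np

def conv2d_by_hand(volume, kernel, stride=1, pad=0):
    """Slide one (kh, kw, C) filter over an (H, W, C) volume.

    Returns a 2-D feature map. The kernel is not flipped.
    """
    padded = np.pad(volume, ((pad, pad), (pad, pad), (0, 0)))
    k_h, k_w, _ = kernel.shape
    out_h = (padded.shape[0] - k_h) // stride + 1
    out_w = (padded.shape[1] - k_w) // stride + 1
    out = np.zeros((out_h, out_w), dtype=np.float32)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            window = padded[r:r + k_h, c:c + k_w, :]
            out[i, j] = np.sum(window * kernel)  # weighted sum over the window
    return out

# Assumed inputs: all-ones 5x5x3 volume and the 8-neighbor edge kernel.
volume = np.ones((5, 5, 3), dtype=np.float32)
kernel_2d = np.array([[-1, -1, -1],
                      [-1,  8, -1],
                      [-1, -1, -1]], dtype=np.float32)
edge_filter = np.repeat(kernel_2d[:, :, np.newaxis], 3, axis=2)

feature_map = conv2d_by_hand(volume, edge_filter, stride=2, pad=1)
print(feature_map)
# [[15.  9. 15.]
#  [ 9.  0.  9.]
#  [15.  9. 15.]]
```

The center of the map is zero because that window sits entirely inside the flat region; the corners and edges respond because the window overlaps the zero-padding.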

8. Using TensorFlow
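The comparison can be sketched with `tf.nn.conv2d`², which expects the input as [batch, height, width, channels] and the filters as [height, width, in_channels, out_channels]. The random input, the seed, and the 8-neighbor edge kernel below are assumptions for illustration. The input is padded with NumPy first and TensorFlow is then called with "VALID" padding, so both paths see exactly the same 7x7x3 volume.

```python
import numpy as np
import tensorflow as tf

S = 2  # stride
P = 1  # zero-padding

# Assumed inputs: seeded random 5x5x3 volume and the 8-neighbor edge kernel.
rng = np.random.default_rng(seed=0)
volume = rng.integers(0, 3, size=(5, 5, 3)).astype(np.float32)
kernel_2d = np.array([[-1, -1, -1],
                      [-1,  8, -1],
                      [-1, -1, -1]], dtype=np.float32)
kernel = np.repeat(kernel_2d[:, :, np.newaxis], 3, axis=2)

padded = np.pad(volume, ((P, P), (P, P), (0, 0)))

# By hand: weighted sum over each 3x3x3 window, stepping by the stride.
by_hand = np.zeros((3, 3), dtype=np.float32)
for i in range(3):
    for j in range(3):
        window = padded[i * S:i * S + 3, j * S:j * S + 3, :]
        by_hand[i, j] = np.sum(window * kernel)

# TensorFlow: add a batch axis to the input and an output-channel axis
# to the filter, then drop both again from the result.
tf_out = tf.nn.conv2d(
    input=padded[np.newaxis, ...],    # shape (1, 7, 7, 3)
    filters=kernel[..., np.newaxis],  # shape (3, 3, 3, 1)
    strides=[1, S, S, 1],
    padding="VALID",
).numpy()[0, :, :, 0]

print(np.allclose(by_hand, tf_out))  # True
```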

Awesome! The TensorFlow output matches what I did by hand in the previous section. Now let's try passing an image through it with a stride of S=1.

And with TensorFlow:

As expected! TensorFlow and the custom by-hand function produce the same convolved image. The edge detection filter really makes more sense now, since you can see the lines it draws around the toy reindeer.
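The same effect shows up on a tiny synthetic image. As a stand-in for the photo, the sketch below assumes an 8x8 single-channel image with a bright square in the middle and the 8-neighbor edge kernel, using S=1 and P=1 so the output keeps the input's size.

```python
import numpy as np

# Hypothetical 8x8 single-channel image: a bright 4x4 square on black.
image = np.zeros((8, 8, 1), dtype=np.float32)
image[2:6, 2:6, 0] = 1.0

# Assumed 8-neighbor edge kernel with a single channel.
kernel = np.array([[-1, -1, -1],
                   [-1,  8, -1],
                   [-1, -1, -1]], dtype=np.float32)[:, :, np.newaxis]

# S=1 and P=1 keep the output the same size as the input.
padded = np.pad(image, ((1, 1), (1, 1), (0, 0)))
edges = np.zeros((8, 8), dtype=np.float32)
for i in range(8):
    for j in range(8):
        edges[i, j] = np.sum(padded[i:i + 3, j:j + 3, :] * kernel)

# The flat background and the flat interior of the square come out as 0;
# only pixels on or next to the square's outline respond.
print(edges.astype(int))
```

Pixels on the square's border come out positive, pixels just outside it come out negative, and everything else is zero, which is why the filter appears to trace outlines.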

This is one of the things that make Convolutional Neural Networks (CNNs) so great. As the convolutional layers are trained, the weights (kernels) are updated, and hopefully the resulting feature maps become meaningful enough for the trained model to perform your computer vision task.

Thank you for reading! I hope you learned something new about convolutions and can now see the big picture, if you couldn't already.

9. References

1. Kernel (image processing) [https://en.wikipedia.org/wiki/Kernel_(image_processing)]

2. tf.nn.conv2d [https://www.tensorflow.org/api_docs/python/tf/nn/conv2d]

3. Python. a) Travis E. Oliphant. Python for Scientific Computing, Computing in Science & Engineering, 9, 10–20 (2007) b) K. Jarrod Millman and Michael Aivazis. Python for Scientists and Engineers, Computing in Science & Engineering, 13, 9–12 (2011)

4. TensorFlow. Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Rafal Jozefowicz, Yangqing Jia, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Mike Schuster, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org.

5. SciPy. Pauli Virtanen, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, Evgeni Burovski, Pearu Peterson, Warren Weckesser, Jonathan Bright, Stéfan J. van der Walt, Matthew Brett, Joshua Wilson, K. Jarrod Millman, Nikolay Mayorov, Andrew R. J. Nelson, Eric Jones, Robert Kern, Eric Larson, CJ Carey, İlhan Polat, Yu Feng, Eric W. Moore, Jake VanderPlas, Denis Laxalde, Josef Perktold, Robert Cimrman, Ian Henriksen, E.A. Quintero, Charles R Harris, Anne M. Archibald, Antônio H. Ribeiro, Fabian Pedregosa, Paul van Mulbregt, and SciPy 1.0 Contributors. (2019) SciPy 1.0–Fundamental Algorithms for Scientific Computing in Python. preprint arXiv:1907.10121

6. NumPy. a) Travis E. Oliphant. A guide to NumPy, USA: Trelgol Publishing, (2006). b) Stéfan van der Walt, S. Chris Colbert and Gaël Varoquaux. The NumPy Array: A Structure for Efficient Numerical Computation, Computing in Science & Engineering, 13, 22–30 (2011)

7. IPython. Fernando Pérez and Brian E. Granger. IPython: A System for Interactive Scientific Computing, Computing in Science & Engineering, 9, 21–29 (2007)

8. Matplotlib. J. D. Hunter, “Matplotlib: A 2D Graphics Environment”, Computing in Science & Engineering, vol. 9, no. 3, pp. 90–95, 2007.

9. Pandas. Wes McKinney. Data Structures for Statistical Computing in Python, Proceedings of the 9th Python in Science Conference, 51–56 (2010)

10. Scikit-Learn. Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Édouard Duchesnay. Scikit-learn: Machine Learning in Python, Journal of Machine Learning Research, 12, 2825–2830 (2011)

11. Scikit-Image. Stéfan van der Walt, Johannes L. Schönberger, Juan Nunez-Iglesias, François Boulogne, Joshua D. Warner, Neil Yager, Emmanuelle Gouillart, Tony Yu and the scikit-image contributors. scikit-image: Image processing in Python, PeerJ 2:e453 (2014)