Convolutional Neural Network
A convolutional neural network (CNN) is a type of artificial neural network used primarily for image recognition and processing, owing to its ability to recognize spatial patterns in images. CNNs are powerful but typically require large amounts of labelled data for training.
Data enters the CNN through the input layer and passes through several hidden layers before reaching the output layer. The network's output is compared to the actual labels to compute a loss (error). The partial derivatives of this loss with respect to the trainable weights are calculated via backpropagation, and the weights are then updated by one of several optimization methods, such as gradient descent.
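To make the forward-pass/backpropagation/update cycle concrete, here is a minimal sketch of one training step for a single linear neuron (the function name `train_step` and the toy numbers are my own; a real CNN applies the same idea to every trainable weight in every layer):

```python
# One gradient-descent training step for a single weight w, assuming a
# squared-error loss on a scalar prediction y_pred = w * x.
def train_step(w, x, y_true, lr=0.1):
    y_pred = w * x                      # forward pass
    loss = (y_pred - y_true) ** 2       # squared-error loss
    grad = 2 * (y_pred - y_true) * x    # dLoss/dw via the chain rule
    w_new = w - lr * grad               # gradient-descent weight update
    return w_new, loss

w_new, loss = train_step(w=0.0, x=1.0, y_true=2.0)
```

Repeating this step over many labelled examples is what "training" means: the loss is measured, its gradient is propagated backward, and every weight moves a small step downhill.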
CNN Template
Most commonly used hidden layers (though not all) follow this pattern:
- Layer Function
- Pooling
- Normalization
- Activation
- Loss Function
1. Layer Function
A basic transforming function, such as a convolutional or fully connected layer.
a. Fully Connected
A linear (affine) map between the input and the output: every input unit is connected to every output unit through a trainable weight.
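As an illustrative sketch (the shapes and values here are arbitrary), a fully connected layer is just a matrix-vector product plus a bias:

```python
import numpy as np

# A fully connected layer is an affine map: y = W @ x + b.
# Shapes are illustrative: 3 inputs -> 2 outputs.
def fully_connected(x, W, b):
    return W @ x + b

W = np.array([[1.0, 0.0, -1.0],
              [0.5, 0.5, 0.5]])   # trainable weights, shape (2, 3)
b = np.array([0.0, 1.0])          # trainable bias, shape (2,)
x = np.array([2.0, 3.0, 4.0])     # input vector

y = fully_connected(x, W, b)      # output vector, shape (2,)
```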
b. Convolutional Layers
These layers are applied to 2D (3D) input feature maps. The trainable weights are a 2D (3D) kernel/filter that moves across the input feature map, generating dot products with the overlapping region of the input feature map.
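The sliding dot product can be sketched in a few lines. This is a minimal "valid" convolution with stride 1 (strictly a cross-correlation, as in most deep-learning libraries); the function name and the toy input are my own:

```python
import numpy as np

# Minimal 2D "valid" convolution: slide the kernel over the input and
# take the dot product with each overlapping region.
def conv2d(x, k):
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i+kh, j:j+kw] * k)
    return out

x = np.arange(16, dtype=float).reshape(4, 4)  # 4x4 input feature map
k = np.ones((3, 3))                           # 3x3 kernel (trainable in a CNN)
y = conv2d(x, k)                              # 2x2 output feature map
```

Note how a 3x3 kernel on a 4x4 input yields a 2x2 output: without padding, the spatial size shrinks by (kernel size - 1).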
c. Transposed Convolutional (DeConvolutional) Layer
This is usually used to increase the size of the output feature map (upsampling). The idea behind the transposed convolutional layer is to reverse the spatial transformation of a convolutional layer, though it is not a true mathematical inverse.
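A sketch of the stride-1 case (function name and values are illustrative): each input pixel "stamps" a scaled copy of the kernel into the larger output, and overlapping stamps are summed.

```python
import numpy as np

# Stride-1 transposed convolution: every input pixel scatters a scaled
# copy of the kernel into the output; overlaps accumulate by addition.
def transposed_conv2d(x, k):
    kh, kw = k.shape
    oh, ow = x.shape[0] + kh - 1, x.shape[1] + kw - 1
    out = np.zeros((oh, ow))
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i:i+kh, j:j+kw] += x[i, j] * k
    return out

x = np.array([[1.0, 2.0],
              [3.0, 4.0]])        # 2x2 input
k = np.ones((2, 2))               # 2x2 kernel
y = transposed_conv2d(x, k)      # 3x3 output: larger than the input
```

The shape arithmetic is the reverse of the convolution above: a 2x2 input and 2x2 kernel produce a 3x3 output.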
2. Pooling
It is a non-trainable layer used to change the spatial size of the feature map.
a. Max/Average Pooling
This decreases the spatial size of the input by selecting the maximum/average value in the receptive field defined by the kernel.
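For example, 2x2 max pooling with stride 2 halves each spatial dimension by keeping only the largest value in each window (a minimal sketch; the function name and input are illustrative):

```python
import numpy as np

# 2x2 max pooling with stride 2: keep the maximum of each non-overlapping
# receptive field, halving the spatial dimensions.
def max_pool(x, size=2):
    h, w = x.shape[0] // size, x.shape[1] // size
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = x[i*size:(i+1)*size, j*size:(j+1)*size].max()
    return out

x = np.array([[1., 2., 5., 6.],
              [3., 4., 7., 8.],
              [9., 2., 1., 0.],
              [5., 6., 3., 4.]])
y = max_pool(x)   # 2x2 output
```

Average pooling is identical except that `.max()` becomes `.mean()`.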
b. Unpooling
It is a non-trainable layer used to increase the spatial size of the input by placing each input pixel at a certain index in the receptive field of the output, as defined by the kernel.
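A minimal sketch of the simplest variant, assuming each input pixel is placed at a fixed position (here the top-left corner) of its output receptive field, with zeros elsewhere; real max-unpooling instead uses the indices remembered from the matching pooling layer:

```python
import numpy as np

# Unpooling sketch: place each input pixel at the top-left corner of its
# receptive field in the larger output; all other positions stay zero.
def unpool(x, size=2):
    out = np.zeros((x.shape[0] * size, x.shape[1] * size))
    out[::size, ::size] = x
    return out

x = np.array([[4., 8.],
              [9., 6.]])
y = unpool(x)   # 4x4 output, mostly zeros
```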
3. Normalization
It is usually applied just before the activation function, to keep unbounded activations from driving the output values too high.
a. Local Response Normalization (LRN)
A non-trainable layer that square-normalizes the pixel values in a feature map within a local neighborhood.
b. Batch Normalization
A trainable approach to normalizing the data by learning scale and shift variables during training.
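A minimal sketch of the batch-norm computation (training-mode statistics only; the variable names follow the usual gamma/beta convention, and the running-average bookkeeping used at inference time is omitted):

```python
import numpy as np

# Batch normalization sketch: normalize each feature over the batch to
# zero mean / unit variance, then apply the learned scale (gamma) and
# shift (beta).
def batch_norm(x, gamma, beta, eps=1e-5):
    mean = x.mean(axis=0)                  # per-feature batch mean
    var = x.var(axis=0)                    # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])                 # batch of 3 samples, 2 features
y = batch_norm(x, gamma=1.0, beta=0.0)
```

With `gamma=1` and `beta=0` the output has (approximately) zero mean and unit variance per feature; during training the network learns gamma and beta, so it can undo the normalization wherever that helps.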
4. Activation
Introduces non-linearity so the CNN can efficiently learn complex non-linear mappings.
- Non-parametric/static functions: Linear, ReLU, tanh, sigmoid
- Parametric functions: ELU, Leaky ReLU (PReLU)
- Bounded functions: tanh, sigmoid
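A few of the functions from the list above, written out explicitly (a minimal sketch using NumPy):

```python
import numpy as np

# Common activation functions from the list above.
def relu(x):
    return np.maximum(0.0, x)              # unbounded above, zero below

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)   # small slope alpha for x < 0

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))        # bounded in (0, 1)

x = np.array([-2.0, 0.0, 2.0])
```

tanh (`np.tanh`) is the other bounded example, squashing its input into (-1, 1); in PReLU the slope `alpha` is learned rather than fixed.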
5. Loss Functions
Quantifies how far the CNN's predictions are from the actual labels.
- Regression Loss Functions: MAE, MSE, Huber Loss
- Classification Loss Functions: Cross-entropy, Hinge Loss
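Three of the losses named above, sketched minimally (the toy targets and predictions are my own; cross-entropy here assumes one-hot labels and already-normalized probabilities):

```python
import numpy as np

# Regression losses: mean absolute error and mean squared error.
def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

# Classification loss: cross-entropy between a one-hot label and a
# predicted probability distribution (eps guards against log(0)).
def cross_entropy(p_true, p_pred, eps=1e-12):
    return -np.sum(p_true * np.log(p_pred + eps))

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 2.0, 2.0])

p_true = np.array([0.0, 1.0, 0.0])   # one-hot label: class 1
p_pred = np.array([0.1, 0.8, 0.1])   # predicted probabilities
```

MSE penalizes large errors more heavily than MAE (the Huber loss interpolates between the two), which is why the choice of loss matters even for the same network.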
Thank you for reading!
Please leave a comment if you have any suggestions, would like to add a point, or noticed any mistakes/typos!
P.S. If you found this article helpful, clap! 👏👏👏 [It feels rewarding and motivates me to keep writing.]