Convolutional Neural Network
A convolutional neural network (CNN) is a type of artificial neural network used primarily for image recognition and processing, owing to its ability to recognize spatial patterns in images. CNNs are powerful but typically require large amounts of labelled data for training.
Data enters the CNN through the input layer and passes through several hidden layers before reaching the output layer. The network's output is compared to the actual labels to compute a loss (error). The partial derivatives of this loss with respect to the trainable weights are calculated via backpropagation, and the weights are then updated by one of several optimization methods, such as gradient descent.
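To make the forward-pass/backpropagation/update cycle concrete, here is a minimal sketch of one training step for a single linear neuron (the function name `train_step` and the toy numbers are my own; a real CNN applies the same idea to every trainable weight in every layer):

```python
# One gradient-descent training step for a single weight w, assuming a
# squared-error loss on a scalar prediction y_pred = w * x.
def train_step(w, x, y_true, lr=0.1):
    y_pred = w * x                      # forward pass
    loss = (y_pred - y_true) ** 2       # squared-error loss
    grad = 2 * (y_pred - y_true) * x    # dLoss/dw via the chain rule
    w_new = w - lr * grad               # gradient-descent weight update
    return w_new, loss

w_new, loss = train_step(w=0.0, x=1.0, y_true=2.0)
```

Repeating this step over many labelled examples is what "training" means: the loss is measured, its gradient is propagated backward, and every weight moves a small step downhill.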
CNN Template
Most commonly used hidden layers (though not all) follow this pattern:
- Layer Function
- Pooling
- Normalization
- Activation
- Loss Function
1. Layer Function
A basic transforming function, such as a convolutional or fully connected layer.
a. Fully Connected
A linear (affine) map between the input and the output: every input unit is connected to every output unit through a trainable weight.
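As an illustrative sketch (the shapes and values here are arbitrary), a fully connected layer is just a matrix-vector product plus a bias:

```python
import numpy as np

# A fully connected layer is an affine map: y = W @ x + b.
# Shapes are illustrative: 3 inputs -> 2 outputs.
def fully_connected(x, W, b):
    return W @ x + b

W = np.array([[1.0, 0.0, -1.0],
              [0.5, 0.5, 0.5]])   # trainable weights, shape (2, 3)
b = np.array([0.0, 1.0])          # trainable bias, shape (2,)
x = np.array([2.0, 3.0, 4.0])     # input vector

y = fully_connected(x, W, b)      # output vector, shape (2,)
```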
b. Convolutional Layers
These layers are applied to 2D (3D) input feature maps. The trainable weights are a 2D (3D) kernel/filter that moves across the input feature map, generating dot products with the overlapping region of the input feature map.
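The sliding dot product can be sketched in a few lines. This is a minimal "valid" convolution with stride 1 (strictly a cross-correlation, as in most deep-learning libraries); the function name and the toy input are my own:

```python
import numpy as np

# Minimal 2D "valid" convolution: slide the kernel over the input and
# take the dot product with each overlapping region.
def conv2d(x, k):
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i+kh, j:j+kw] * k)
    return out

x = np.arange(16, dtype=float).reshape(4, 4)  # 4x4 input feature map
k = np.ones((3, 3))                           # 3x3 kernel (trainable in a CNN)
y = conv2d(x, k)                              # 2x2 output feature map
```

Note how a 3x3 kernel on a 4x4 input yields a 2x2 output: without padding, the spatial size shrinks by (kernel size - 1).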
c. Transposed Convolutional (DeConvolutional) Layer
This is usually used to increase the size of the output feature map (upsampling). The idea behind the transposed convolutional layer is to reverse the spatial transformation of a convolutional layer, though it is not a true mathematical inverse.
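A sketch of the stride-1 case (function name and values are illustrative): each input pixel "stamps" a scaled copy of the kernel into the larger output, and overlapping stamps are summed.

```python
import numpy as np

# Stride-1 transposed convolution: every input pixel scatters a scaled
# copy of the kernel into the output; overlaps accumulate by addition.
def transposed_conv2d(x, k):
    kh, kw = k.shape
    oh, ow = x.shape[0] + kh - 1, x.shape[1] + kw - 1
    out = np.zeros((oh, ow))
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i:i+kh, j:j+kw] += x[i, j] * k
    return out

x = np.array([[1.0, 2.0],
              [3.0, 4.0]])        # 2x2 input
k = np.ones((2, 2))               # 2x2 kernel
y = transposed_conv2d(x, k)      # 3x3 output: larger than the input
```

The shape arithmetic is the reverse of the convolution above: a 2x2 input and 2x2 kernel produce a 3x3 output.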
2. Pooling
It is a non-trainable layer used to change the spatial size of the feature map.
a. Max/Average Pooling
This decreases the spatial size of the input by selecting the maximum/average value in the receptive field defined by the kernel.
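For example, 2x2 max pooling with stride 2 halves each spatial dimension by keeping only the largest value in each window (a minimal sketch; the function name and input are illustrative):

```python
import numpy as np

# 2x2 max pooling with stride 2: keep the maximum of each non-overlapping
# receptive field, halving the spatial dimensions.
def max_pool(x, size=2):
    h, w = x.shape[0] // size, x.shape[1] // size
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = x[i*size:(i+1)*size, j*size:(j+1)*size].max()
    return out

x = np.array([[1., 2., 5., 6.],
              [3., 4., 7., 8.],
              [9., 2., 1., 0.],
              [5., 6., 3., 4.]])
y = max_pool(x)   # 2x2 output
```

Average pooling is identical except that `.max()` becomes `.mean()`.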
b. Unpooling
It is a non-trainable layer used to increase the spatial size of the input by placing each input pixel at a certain index in the receptive field of the output, as defined by the kernel.
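A minimal sketch of the simplest variant, assuming each input pixel is placed at a fixed position (here the top-left corner) of its output receptive field, with zeros elsewhere; real max-unpooling instead uses the indices remembered from the matching pooling layer:

```python
import numpy as np

# Unpooling sketch: place each input pixel at the top-left corner of its
# receptive field in the larger output; all other positions stay zero.
def unpool(x, size=2):
    out = np.zeros((x.shape[0] * size, x.shape[1] * size))
    out[::size, ::size] = x
    return out

x = np.array([[4., 8.],
              [9., 6.]])
y = unpool(x)   # 4x4 output, mostly zeros
```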
3. Normalization
It is usually applied just before the activation function, to keep unbounded activations from driving the output values too high.
a. Local Response Normalization (LRN)
A non-trainable layer that square-normalizes the pixel values in a feature map within a local neighborhood.
b. Batch Normalization
A trainable approach to normalizing the data by learning scale and shift variables during training.
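A minimal sketch of the batch-norm computation (training-mode statistics only; the variable names follow the usual gamma/beta convention, and the running-average bookkeeping used at inference time is omitted):

```python
import numpy as np

# Batch normalization sketch: normalize each feature over the batch to
# zero mean / unit variance, then apply the learned scale (gamma) and
# shift (beta).
def batch_norm(x, gamma, beta, eps=1e-5):
    mean = x.mean(axis=0)                  # per-feature batch mean
    var = x.var(axis=0)                    # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])                 # batch of 3 samples, 2 features
y = batch_norm(x, gamma=1.0, beta=0.0)
```

With `gamma=1` and `beta=0` the output has (approximately) zero mean and unit variance per feature; during training the network learns gamma and beta, so it can undo the normalization wherever that helps.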
4. Activation
Introduces non-linearity so the CNN can efficiently learn complex non-linear mappings.
- Non-parametric/static functions: Linear, ReLU, tanh, sigmoid
- Parametric functions: ELU, Leaky ReLU (PReLU)
- Bounded functions: tanh, sigmoid
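A few of the functions from the list above, written out explicitly (a minimal sketch using NumPy):

```python
import numpy as np

# Common activation functions from the list above.
def relu(x):
    return np.maximum(0.0, x)              # unbounded above, zero below

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)   # small slope alpha for x < 0

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))        # bounded in (0, 1)

x = np.array([-2.0, 0.0, 2.0])
```

tanh (`np.tanh`) is the other bounded example, squashing its input into (-1, 1); in PReLU the slope `alpha` is learned rather than fixed.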
5. Loss Functions
Quantifies how far the CNN's predictions are from the actual labels.
- Regression Loss Functions: MAE, MSE, Huber Loss
- Classification Loss Functions: Cross-entropy, Hinge Loss
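Three of the losses named above, sketched minimally (the toy targets and predictions are my own; cross-entropy here assumes one-hot labels and already-normalized probabilities):

```python
import numpy as np

# Regression losses: mean absolute error and mean squared error.
def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

# Classification loss: cross-entropy between a one-hot label and a
# predicted probability distribution (eps guards against log(0)).
def cross_entropy(p_true, p_pred, eps=1e-12):
    return -np.sum(p_true * np.log(p_pred + eps))

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 2.0, 2.0])

p_true = np.array([0.0, 1.0, 0.0])   # one-hot label: class 1
p_pred = np.array([0.1, 0.8, 0.1])   # predicted probabilities
```

MSE penalizes large errors more heavily than MAE (the Huber loss interpolates between the two), which is why the choice of loss matters even for the same network.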
Thank you for reading!
Please leave a comment if you have any suggestions, would like to add a point, or noticed any mistakes/typos!
P.S. If you found this article helpful, clap! 👏👏👏 [It feels rewarding and motivates me to keep writing.]