
Semantic Segmentation from scratch in PyTorch.

In this blog, we will use the DeepLabv3+ architecture to build our person segmentation pipeline entirely from scratch. DeepLabv3+ Architecture: DeepLabv3 was introduced in the paper “Rethinking Atrous Convolution for Semantic Image Segmentation”. After DeepLabv1 and DeepLabv2, the authors set out to rethink and restructure the DeepLab architecture, and came up with the enhanced DeepLabv3. DeepLabv3+ was introduced in the paper “Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation”....

July 25, 2023 · 12 min · Rajan Ghimire

Vision Transformer (ViT)

Transformers were first developed for, and widely used in, Natural Language Processing (NLP). Following their success there, many researchers have begun applying the Transformer architecture to other domains, such as computer vision. One such architecture, the Vision Transformer (ViT), was developed by the Google Research, Brain Team to tackle the challenge of image classification. To grasp how ViT operates, you need prior knowledge of how Transformers function and the issues they addressed....

February 6, 2023 · 15 min · Rajan Ghimire