The key innovation is to have a Transformer decoder predict a set of binary masks and their corresponding classes in parallel. This idea was then improved upon in the MaskFormer paper, which showed that the "binary mask classification" paradigm also works very well for semantic segmentation.
Mask2Former extends this to instance segmentation by further improving the neural network architecture. Its key component is masked attention, which extracts localized features by constraining cross-attention to the predicted mask regions. As a result, we've evolved from separate, task-specific architectures to what researchers now call "universal image segmentation" architectures, capable of solving any image segmentation task. Interestingly, these universal models all adopt the "mask classification" paradigm, discarding the "per-pixel classification" paradigm entirely.
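The masked attention mentioned above can be sketched in a few lines: instead of letting each query attend to the whole feature map, attention logits are set to minus infinity wherever that query's previously predicted mask is background. This is a minimal numpy sketch of the idea (single head, no learned projections; the empty-mask fallback is an assumption for numerical safety, the paper handles this case similarly):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def masked_cross_attention(queries, keys, values, mask_probs, threshold=0.5):
    """Cross-attention restricted to predicted mask regions.

    queries:    (num_queries, d)      query embeddings
    keys:       (num_pixels, d)       pixel features
    values:     (num_pixels, d)       pixel features
    mask_probs: (num_queries, num_pixels) per-query mask predictions in [0, 1]
    Computes softmax(Q K^T / sqrt(d) + M) V, where M is 0 on predicted
    foreground pixels and -inf elsewhere.
    """
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)           # (num_queries, num_pixels)
    fg = mask_probs >= threshold
    # if a query's predicted mask is empty, fall back to attending everywhere
    empty = ~fg.any(axis=-1, keepdims=True)
    bias = np.where(fg | empty, 0.0, -np.inf)
    weights = softmax(scores + bias, axis=-1)        # zero outside the mask
    return weights @ values
```

A query whose mask covers a single pixel simply returns that pixel's value; queries with broader masks pool features only from within their mask.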
From the recommended Hugging Face blog post (all rights and credits with them):
https://huggingface.co/blog/mask2former
"We present Masked-attention Mask Transformer (Mask2Former), a new architecture capable of addressing any image segmentation task (panoptic, instance or semantic). Its key components include masked attention, which extracts localized features by constraining cross-attention within predicted mask regions. In addition to reducing the research effort by at least three times, it outperforms the best specialized architectures by a significant margin on four popular datasets. Most notably, Mask2Former sets a new state-of-the-art for panoptic segmentation (57.8 PQ on COCO), instance segmentation (50.1 AP on COCO) and semantic segmentation (57.7 mIoU on ADE20K). "
From https://huggingface.co/docs/transformers/main/model_doc/mask2former
arXiv preprint (all rights with the authors):
Masked-attention Mask Transformer for Universal Image Segmentation
Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Rohit Girdhar
https://arxiv.org/abs/2112.01527
#ai
#imagesegmentation
#transformers