By Yuanbo Wang, Invitae Corporation, USA, mcdy143@gmail.com | Unaiza Ahsan, The Home Depot, USA, unaiza_ahsan@homedepot.com | Hanyan Li, Indeed Inc., USA, f1annlee@gmail.com | Matthew Hagen, Amazon.com, Inc., USA, mathage@amazon.com
Image segmentation is the task of associating pixels in an image with their respective object class labels. It has a wide range of applications in many industries, including healthcare, transportation, robotics, fashion, home improvement, and tourism. Many deep learning-based approaches have been developed for image-level object recognition and pixel-level scene understanding, with the latter requiring a much denser annotation of scenes with a large set of objects. Extensions of image segmentation tasks include 3D and video segmentation, where units such as voxels, point clouds, and video frames are classified into different objects. We use “Object Segmentation” to refer to the union of these segmentation tasks. In this monograph, we investigate both traditional and modern object segmentation approaches, comparing their strengths, weaknesses, and utilities. We examine in detail the wide range of deep learning-based segmentation techniques developed in recent years, provide a review of the widely used datasets and evaluation metrics, and discuss potential future research directions.
Automated visual recognition tasks such as image classification, image captioning, object detection, and image segmentation are essential for image and video processing. Of these, image segmentation is the task of associating pixels in an image with their respective object class labels. It has a wide range of applications within many industries, including healthcare, transportation, robotics, fashion, home improvement, and tourism.
In this monograph, both traditional and modern object segmentation approaches are investigated, comparing their strengths, weaknesses, and utilities. The main focus is on deep learning-based techniques for the two most widely solved segmentation tasks: Semantic Segmentation and Instance Segmentation. A wide range of deep learning-based segmentation techniques developed in recent years are examined. Various themes emerge from these techniques that push machines to their limits and often deviate from human perception principles. In addition, an overview of the widely used benchmark datasets for each of these techniques, along with the respective evaluation metrics used to measure the models’ performance, is presented. Potential future research directions conclude the monograph.
This monograph serves as a good introduction to the automated visual recognition task of image segmentation and is intended for students and professionals.