Deep Learning with Python, Third Edition

11 Image segmentation

 

This chapter covers

  • The different branches of computer vision: image classification, image segmentation, and object detection
  • Building a segmentation model from scratch
  • Using the pretrained Segment Anything model

The previous chapter gave you a first introduction to deep learning for computer vision, via a simple use case: binary image classification. But there’s more to computer vision than image classification! This chapter dives deeper into another essential computer vision application – image segmentation.

11.1 Computer vision tasks

So far, we’ve focused on image classification models: an image goes in, a label comes out. “This image likely contains a cat; this other one likely contains a dog.” But image classification is only one of several possible applications of deep learning in computer vision. In general, there are three essential computer vision tasks you need to know about: image classification, image segmentation, and object detection.
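
To make the contrast concrete, here is a minimal sketch of the kind of output each of the three tasks named in the chapter overview (image classification, image segmentation, and object detection) typically produces. It uses plain NumPy arrays rather than a real model; the image size, the two classes, and all values are made up for illustration and are not taken from the book's code.

```python
import numpy as np

# Hypothetical input: one 64x64 RGB image and two classes (0 = "cat", 1 = "dog").
height, width, num_classes = 64, 64, 2
image = np.zeros((height, width, 3), dtype="float32")

# Image classification: one probability vector for the whole image.
classification_output = np.array([0.9, 0.1], dtype="float32")

# Image segmentation: one class index per pixel, so the output has the
# same height and width as the input image.
segmentation_output = np.zeros((height, width), dtype="int64")
segmentation_output[16:48, 16:48] = 1  # a square region labeled as class 1

# Object detection: one bounding box (x_min, y_min, x_max, y_max) and one
# class index per detected object.
detection_boxes = np.array([[16, 16, 48, 48]], dtype="float32")
detection_classes = np.array([1], dtype="int64")

print(classification_output.shape)  # (2,)
print(segmentation_output.shape)    # (64, 64)
print(detection_boxes.shape)        # (1, 4)
```

The point of the sketch is only the shapes: classification yields one answer per image, segmentation yields one answer per pixel, and detection yields one box and label per object.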

11.1.1 Types of image segmentation

11.2 Training a segmentation model from scratch

11.2.1 Downloading a segmentation dataset

11.2.2 Building and training the segmentation model

11.3 Using a pretrained segmentation model

11.3.1 Downloading the Segment Anything model

11.3.2 How Segment Anything works

11.3.3 Prepare a test image

11.3.4 Prompting the model with a target point

11.3.5 Prompting the model with a target box

11.4 Chapter summary