Deep Learning with Python, Third Edition

12 Object detection

 

This chapter covers

  • Understanding the object detection problem
  • Two-stage and single-stage object detectors
  • Training a simple single-stage detector from scratch
  • Using a pretrained object detector

Object detection is all about drawing boxes (called “bounding boxes”) around objects of interest in a picture (see Figure 12.1). This enables you to know not just which objects are in a picture, but also where they are. Some of its most common applications are:

  • Counting: Find out how many instances of an object are in an image.
  • Tracking: Track how objects move in a scene over time by performing object detection on every frame of a movie.
  • Cropping: Identify the area of an image that contains an object of interest, in order to crop it and send a higher-resolution version of the image patch to a classifier or an Optical Character Recognition (OCR) model.

Figure 12.1 Object detectors draw boxes around objects in an image and label them.
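
As a rough illustration of the counting and cropping applications listed above, here is a minimal sketch in Python with NumPy. The detection format (a list of `(label, score, box)` tuples with `box = (left, top, right, bottom)` in pixels), the helper names `count_objects` and `crop`, and the `min_score` threshold are all assumptions made for this example; they are not the output format of any particular detector.

```python
from collections import Counter

import numpy as np

# Hypothetical detector output: one (label, score, box) tuple per detection,
# with box = (left, top, right, bottom) in pixel coordinates.
detections = [
    ("cat", 0.92, (10, 20, 60, 80)),
    ("cat", 0.35, (5, 5, 15, 15)),
    ("dog", 0.88, (70, 30, 120, 90)),
]


def count_objects(detections, min_score=0.5):
    """Count detections per label, ignoring low-confidence ones."""
    return Counter(
        label for label, score, _ in detections if score >= min_score
    )


def crop(image, box):
    """Return the image patch inside a (left, top, right, bottom) box."""
    left, top, right, bottom = box
    # Image arrays are indexed as (row, column), that is, (y, x).
    return image[top:bottom, left:right]


# Placeholder image of height 100 and width 128 with 3 color channels.
image = np.zeros((100, 128, 3), dtype=np.uint8)

counts = count_objects(detections)
patch = crop(image, detections[0][2])
```

With this toy input, the low-scoring second detection is dropped from the count, and `patch` is the region of the image covered by the first box, ready to be passed to a downstream classifier or OCR model.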

You might be wondering: if you have a segmentation mask for an object instance, you can already compute the coordinates of the smallest box that contains the mask. So couldn’t you just use image segmentation all the time? Do you need object detection models at all?
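
To make the mask-to-box computation mentioned above concrete, here is a minimal NumPy sketch. The function name `mask_to_box`, the `(left, top, right, bottom)` ordering, and the choice of exclusive right and bottom edges are assumptions made for this example.

```python
import numpy as np


def mask_to_box(mask):
    """Return the smallest (left, top, right, bottom) box containing a mask.

    `mask` is a 2D boolean array of shape (height, width). The right and
    bottom edges are exclusive. Returns None if the mask is empty.
    """
    rows = np.any(mask, axis=1)  # True for each row that contains the object
    cols = np.any(mask, axis=0)  # True for each column that contains the object
    if not rows.any():
        return None
    row_indices = np.where(rows)[0]
    col_indices = np.where(cols)[0]
    top, bottom = row_indices[0], row_indices[-1]
    left, right = col_indices[0], col_indices[-1]
    return int(left), int(top), int(right) + 1, int(bottom) + 1


# Toy mask: a filled rectangle covering rows 2 to 4 and columns 3 to 6.
mask = np.zeros((6, 8), dtype=bool)
mask[2:5, 3:7] = True

box = mask_to_box(mask)
```

For this toy mask, `box` is `(3, 2, 7, 5)`.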

12.1 Single-stage vs two-stage object detectors

12.1.1 Two-stage R-CNN detectors

12.1.2 Single-stage detectors

12.2 Training a YOLO model from scratch

12.2.1 Downloading the COCO dataset

12.2.2 Creating a YOLO model

12.2.3 Readying the COCO data for the YOLO model

12.3 Training the YOLO model

12.4 Using a pretrained RetinaNet detector

12.5 Chapter summary