
18 Best practices for the real world


This chapter covers

  • Hyperparameter tuning
  • Model ensembling
  • Training Keras models on multiple GPUs or on a TPU
  • Mixed-precision training
  • Quantization

You’ve come quite far since the beginning of this book. You can now train image classification models, image segmentation models, models for classification or regression on vector data, time series forecasting models, text classification models, sequence-to-sequence models, and even generative models for text and images. You’ve got all the bases covered.

However, your models so far have all been trained at a small scale – on small datasets, with a single GPU – and they generally haven’t reached the best achievable performance on each dataset we looked at. This book is, after all, an introductory book. If you are to go out into the real world and achieve state-of-the-art results on brand-new problems, there’s still a bit of a chasm that you’ll need to cross.

This penultimate chapter is about bridging that gap and giving you the best practices you’ll need as you go from machine-learning student to fully fledged machine-learning engineer. We’ll review essential techniques for systematically improving model performance: hyperparameter tuning and model ensembling. Then we’ll look at how you can speed up and scale up model training with multi-GPU and TPU training, mixed precision, and quantization.

18.1 Getting the most out of your models

18.1.1 Hyperparameter optimization
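
As a minimal sketch of what the workflow looks like in practice – assuming the KerasTuner library (pip install keras-tuner) and placeholder data arrays x_train, y_train, x_val, and y_val that stand in for your own dataset:

import keras
from keras import layers
import keras_tuner as kt

def build_model(hp):
    # KerasTuner calls this function with a fresh hp object per trial.
    units = hp.Int("units", min_value=16, max_value=64, step=16)
    model = keras.Sequential([
        layers.Dense(units, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ])
    optimizer = hp.Choice("optimizer", ["rmsprop", "adam"])
    model.compile(
        optimizer=optimizer,
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

tuner = kt.BayesianOptimization(
    build_model,
    objective="val_accuracy",
    max_trials=20,
    directory="hp_search",  # where trial logs and checkpoints go
)
tuner.search(x_train, y_train,
             validation_data=(x_val, y_val), epochs=10)
best_model = tuner.get_best_models(num_models=1)[0]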

18.1.2 The art of crafting the right search space
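
To give a flavor of what designing a search space involves, here is a sketch using KerasTuner’s hyperparameter types (hp.Int, hp.Float, hp.Boolean); the specific ranges are illustrative guesses, not recommendations:

def build_model(hp):
    model = keras.Sequential()
    # Make depth itself a hyperparameter, and give each layer
    # its own tunable width.
    for i in range(hp.Int("num_layers", 1, 3)):
        model.add(layers.Dense(
            hp.Int(f"units_{i}", min_value=32, max_value=256, step=32),
            activation="relu"))
    if hp.Boolean("use_dropout"):
        model.add(layers.Dropout(0.5))
    model.add(layers.Dense(10, activation="softmax"))
    # Learning rates are best searched on a log scale: the difference
    # between 1e-4 and 1e-3 matters as much as between 1e-3 and 1e-2.
    lr = hp.Float("learning_rate", min_value=1e-4, max_value=1e-2,
                  sampling="log")
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=lr),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model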

18.1.3 Model ensembling
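
The essence of ensembling is to pool the predictions of several independently trained models. A minimal sketch, assuming three already-trained classifiers model_a, model_b, and model_c and validation inputs x_val:

import numpy as np

# The models should be trained separately -- different architectures
# or at least different random initializations -- so their errors
# are as decorrelated as possible.
preds_a = model_a.predict(x_val)
preds_b = model_b.predict(x_val)
preds_c = model_c.predict(x_val)

# Uniform average of predicted probabilities. A weighted average,
# with weights tuned on validation data, usually works better,
# provided the individual models are roughly comparable in quality.
ensemble_preds = (preds_a + preds_b + preds_c) / 3.0
ensemble_labels = np.argmax(ensemble_preds, axis=-1)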

18.2 Scaling up model training with multiple devices

18.2.1 Multi-GPU training
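
With the TensorFlow backend, the standard route to data-parallel multi-GPU training is tf.distribute.MirroredStrategy. A sketch, assuming a hypothetical get_compiled_model() factory and an unbatched tf.data train_dataset:

import tensorflow as tf

# MirroredStrategy replicates the model on every visible GPU and
# averages gradients across replicas after each training step.
strategy = tf.distribute.MirroredStrategy()
print("Number of replicas:", strategy.num_replicas_in_sync)

with strategy.scope():
    # All variable creation -- building and compiling the model --
    # must happen inside the strategy scope.
    model = get_compiled_model()

# Scale the global batch size with the replica count so that each
# GPU keeps the same per-device batch size.
batch_size = 64 * strategy.num_replicas_in_sync
model.fit(train_dataset.batch(batch_size), epochs=10)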

18.2.2 TPU training
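
TPU training with the TensorFlow backend follows the same strategy pattern, with some extra connection boilerplate. A sketch, assuming a Colab-style runtime-attached TPU and the same hypothetical get_compiled_model() factory:

import tensorflow as tf

# Locate and initialize the TPU. With no arguments,
# TPUClusterResolver finds a TPU attached to the runtime.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.TPUStrategy(resolver)

with strategy.scope():
    model = get_compiled_model()

model.fit(train_dataset, epochs=10)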

18.2.3 Leveraging step fusing to improve TPU utilization
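
In Keras, step fusing is exposed through the steps_per_execution argument of compile(): rather than dispatching one batch at a time to the device, Keras compiles a graph that runs several training steps per round trip, cutting host-to-TPU dispatch overhead. For example:

model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
    # Run 32 training steps per single graph execution. The
    # trade-off is that callbacks and progress logs only update
    # once per fused block of steps.
    steps_per_execution=32,
)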

18.3 Speeding up training and inference with lower-precision computation

18.3.1 Understanding floating-point precision
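
A few NumPy one-liners make the precision differences concrete:

import numpy as np

# float32 has 24 bits of mantissa (~7 decimal digits of precision);
# float16 has 11 bits (~3 decimal digits).
print(np.float32(1.0) + np.float32(1e-7))  # 1.0000001
print(np.float16(1.0) + np.float16(1e-3))  # 1.001
print(np.float16(1.0) + np.float16(1e-4))  # 1.0 -- the update is lost

# float16 also overflows early: its largest finite value is 65504.
print(np.float16(70000.0))  # inf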

18.3.2 Float16 inference
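
One way to run inference in half precision with the TensorFlow backend – shown here as a sketch, with build_model() as a hypothetical factory – is to set a float16 dtype policy before instantiating the inference copy of the model, then assign the trained float32 weights, which get cast down on assignment:

from tensorflow import keras

# Build and train (or load) the model in float32 as usual.
model = build_model()
float32_weights = model.get_weights()

# Rebuild under a float16 policy: both computation and variables
# use half precision, roughly halving memory use at inference time.
keras.mixed_precision.set_global_policy("float16")
inference_model = build_model()
inference_model.set_weights(float32_weights)  # cast to float16 on assign

predictions = inference_model.predict(test_images)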

18.3.3 Mixed-precision training
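
With the TensorFlow backend, mixed precision is a one-line global switch. A sketch: computations run in float16 while weights stay in float32, and it is prudent to keep the final softmax in float32:

from tensorflow import keras
from tensorflow.keras import layers

# Compute in float16, keep variables in float32 (the "mixed" part).
keras.mixed_precision.set_global_policy("mixed_float16")

inputs = keras.Input(shape=(784,))
x = layers.Dense(256, activation="relu")(inputs)  # runs in float16
x = layers.Dense(256, activation="relu")(x)
# Force the output layer to float32: small softmax probabilities
# can underflow in half precision.
outputs = layers.Dense(10, activation="softmax", dtype="float32")(x)
model = keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])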

18.3.4 Using loss scaling with mixed precision
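
When you train with fit() under the mixed_float16 policy, Keras applies loss scaling for you. In a custom training loop you wrap the optimizer yourself; a sketch using the tf.keras API, assuming an existing model and loss_fn:

import tensorflow as tf

optimizer = tf.keras.mixed_precision.LossScaleOptimizer(
    tf.keras.optimizers.Adam())

@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        predictions = model(x, training=True)
        loss = loss_fn(y, predictions)
        # Multiply the loss by a large scale factor so that small
        # float16 gradients don't underflow to zero...
        scaled_loss = optimizer.get_scaled_loss(loss)
    scaled_grads = tape.gradient(scaled_loss, model.trainable_variables)
    # ...then divide the gradients back down before applying them.
    grads = optimizer.get_unscaled_gradients(scaled_grads)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss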

18.3.5 Beyond mixed precision: float8 training
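
As a hedged sketch only: recent Keras 3 releases expose experimental float8 support through the quantize() method, for models built from layers that support it (Dense, EinsumDense); the exact API may differ across versions, and build_model() is again a hypothetical factory:

import keras

model = build_model()

# Experimental: switch supported layers to float8 matmuls, with
# scaling factors maintained during training to keep values in
# float8's very narrow representable range.
model.quantize("float8")

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_dataset, epochs=10)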

18.3.6 Faster inference with quantization
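
There are several quantization toolchains; one common post-training route, shown here as an assumption, is the TensorFlow Lite converter’s dynamic-range quantization, where weights are stored as int8 and dequantized on the fly at inference time:

import tensorflow as tf

# Post-training dynamic-range quantization: weights become int8,
# activations stay in floating point.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("model_quantized.tflite", "wb") as f:
    f.write(tflite_model)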

18.4 Chapter summary