8 Text generation with recurrent neural networks

This chapter covers

  • The idea behind RNNs and why they can handle sequential data
  • Character tokenization, word tokenization, and subword tokenization
  • How word embedding works
  • Building and training an RNN to generate text
  • Using temperature and top-K sampling to control the creativity of text generation (see the sketch after this list)
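
As a quick preview of the last bullet, the short sketch below shows how temperature scaling and top-K filtering are typically applied to a model's output logits in PyTorch before sampling the next token. This is a standalone illustration, not the chapter's own listing; the function name sample_next_token and the toy 10-token vocabulary are invented for the example.

import torch

def sample_next_token(logits, temperature=1.0, top_k=None):
    # Scale logits: temperature < 1 sharpens the distribution (more
    # predictable text); temperature > 1 flattens it (more creative text).
    logits = logits / temperature
    if top_k is not None:
        # Keep only the k most likely tokens and sample among them.
        top_values, top_indices = torch.topk(logits, top_k)
        probs = torch.softmax(top_values, dim=-1)
        choice = torch.multinomial(probs, num_samples=1)
        return top_indices[choice].item()
    # No top-K filtering: sample from the full distribution.
    probs = torch.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()

# Example with random logits over a hypothetical 10-token vocabulary
logits = torch.randn(10)
next_id = sample_next_token(logits, temperature=0.8, top_k=5)

Lower temperatures and smaller top_k values make the output more predictable; higher values make it more varied. That trade-off is explored in section 8.5.2.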

So far in this book, we have discussed how to generate shapes, numbers, and images. Starting from this chapter, we’ll focus mainly on text generation. Generating text is often considered the holy grail of generative AI for several compelling reasons. Human language is incredibly complex and nuanced. It involves understanding not only grammar and vocabulary but also context, tone, and cultural references. Successfully generating coherent and contextually appropriate text is a significant challenge that requires deep understanding and processing of language.

As humans, we primarily communicate through language. AI that can generate human-like text can interact more naturally with users, making technology more accessible and user-friendly. Text generation has many applications, from automating customer service responses to creating entire articles, scripting for games and movies, aiding in creative writing, and even building personal assistants. The potential effect across industries is enormous.

8.1 Introduction to RNNs

8.1.1 Challenges in generating text

8.1.2 How do RNNs work?

8.1.3 Steps in training an LSTM model

8.2 Fundamentals of NLP

8.2.1 Different tokenization methods

8.2.2 Word embedding

8.3 Preparing data to train the LSTM model

8.3.1 Downloading and cleaning up the text

8.3.2 Creating batches of training data

8.4 Building and training the LSTM model

8.4.1 Building an LSTM model

8.4.2 Training the LSTM model

8.5 Generating text with the trained LSTM model

8.5.1 Generating text by predicting the next token

8.5.2 Temperature and top-K sampling in text generation

Summary