
1 How AI works
This chapter covers
- The way LLMs process inputs and generate outputs
- The transformer architecture that powers LLMs
- Different types of machine learning
- How LLMs and other AI models learn from data
- How convolutional neural networks are used to process different types of media
- Combining different types of data (e.g., producing images from text)
This chapter explains how AI works by discussing many foundational AI topics. Since the latest AI boom, many of these topics (e.g., “embeddings” and “temperature”) are widely discussed, not just by AI practitioners but also by businesspeople and the general public. This chapter demystifies them.
Instead of just piling up definitions and textbook explanations, this chapter is a bit more opinionated. It points out common AI problems, misconceptions, and limitations based on my experience working in the field, and shares some interesting insights you might not be aware of. For example, we’ll discuss why language generation is more expensive in French than in English and how OpenAI hires armies of human workers to manually help train ChatGPT. So, even if you are already familiar with all the topics covered in this chapter, reading it might provide you with a different perspective.
The first part of this chapter is a high-level explanation of how large language models (LLMs) such as ChatGPT work. Its sections are ordered to roughly mimic how LLMs themselves turn inputs into outputs one step at a time.
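To make that step-at-a-time idea concrete before we get into the details, here is a minimal, illustrative Python sketch of the kind of loop an LLM runs when generating text. Everything in it is invented for illustration: the six-word vocabulary, the fake scoring function, and the temperature value stand in for what a real model computes with a trained transformer.

```python
# Toy sketch of autoregressive generation: an "LLM" produces its output
# one token at a time, each time scoring every word in its vocabulary
# and sampling the next token from those scores.
import math
import random

VOCAB = ["the", "cat", "sat", "on", "mat", "<end>"]  # made-up tiny vocabulary

def next_token_logits(context):
    # Placeholder for a trained model: returns one score per vocabulary
    # word. A real LLM computes these scores with a transformer.
    random.seed(len(context))  # keeps this toy example deterministic
    return [random.uniform(-1, 1) for _ in VOCAB]

def sample(logits, temperature=0.8):
    # Temperature rescales the scores before they become probabilities;
    # lower values make the choice of next token more predictable.
    scaled = [score / temperature for score in logits]
    total = sum(math.exp(s) for s in scaled)
    probs = [math.exp(s) / total for s in scaled]
    return random.choices(VOCAB, weights=probs, k=1)[0]

prompt = ["the", "cat"]
generated = list(prompt)
for _ in range(10):  # generate at most 10 more tokens
    token = sample(next_token_logits(generated))
    if token == "<end>":  # stop once the model "decides" it is done
        break
    generated.append(token)

print(" ".join(generated))
```

The loop appends one token at a time until a special end-of-text token is sampled; that basic pattern, repeated at enormous scale, is what the following sections unpack.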