AI-Powered Developer: Build software with ChatGPT and Copilot

9 GPT-ing on the go


This chapter covers

  • Running a large language model locally
  • Comparing the results of two locally hosted large language models against those of ChatGPT
  • Determining when using offline models is appropriate

Imagine you are on your way to an AI conference halfway around the world. You are on a plane, cruising at 35,000 feet, and you want to prototype a new feature for your application. The airplane’s Wi-Fi is prohibitively slow and expensive. What if, instead of paying all that money for a broken and borderline unusable GPT, you had one running right there on your laptop, offline? This chapter reviews developers’ options for running a large language model (LLM) locally.

9.1 Motivating theory

The introductory scenario is not much of a stretch. Although high-speed internet is increasingly ubiquitous, it has not yet achieved total coverage; you will still find yourself in areas without broadband, whether at home, on the road, at school, or in the office. Hopefully, this book has successfully made the case that you should be using LLMs as a tool in your developer toolbelt. For that reason, you should take precautions to ensure that an LLM is always available to you in some capacity. The more you use it, the more you will get from it. It is much like your dependency on an integrated development environment: without it, you are still a good developer; with it, however, you are much more productive.

9.2 Hosting your own LLM

9.2.1 Baselining with ChatGPT

9.2.2 Asking Llama 2 to spit out an answer

9.2.3 Democratizing answers with GPT-4All

Summary