The Quick Python Book, Fourth Edition

22 Data over the network

 

This chapter covers

  • Fetching files via FTP/SFTP, SSH/SCP, and HTTPS
  • Getting data via APIs
  • Structured data file formats: JSON and XML
  • Scraping data

You’ve seen how to deal with text-based data files. In this chapter, you use Python to move data files over the network. In some cases, those files might be text or spreadsheet files, as discussed in chapter 21, but in other cases, they might be in more structured formats and served from REST or SOAP APIs. Sometimes getting the data may mean scraping it from a website. This chapter discusses all of these situations and shows some common use cases.

22.1 Fetching files

Before you can do anything with data files, you have to get them. Sometimes this process is very easy, such as manually downloading a single zip archive, or maybe the files have been pushed to your machine from somewhere else. Quite often, however, the process is more involved. Maybe a large number of files need to be retrieved from a remote server, files need to be retrieved regularly, or the retrieval process is sufficiently complex to be a pain to do manually. In any of those cases, you might well want to automate fetching the data files with Python.
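As a rough illustration of what automating retrieval might look like, here is a minimal sketch that downloads a list of files into a local directory using only the standard library. The function name fetch_files and its parameters are hypothetical and do not come from the chapter, which may well use different tools; the sketch also assumes every URL ends in a usable filename and does no error handling or retrying.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlsplit


def fetch_files(urls, dest_dir):
    """Download each URL into dest_dir and return the list of saved paths.

    Hypothetical helper for illustration; assumes each URL path ends
    in a filename and that the server needs no authentication.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        # Name the local file after the last component of the URL path.
        filename = Path(urlsplit(url).path).name
        target = dest / filename
        # Stream the response to disk instead of reading it all into memory.
        with urllib.request.urlopen(url) as response, open(target, "wb") as out:
            shutil.copyfileobj(response, out)
        saved.append(target)
    return saved
```

A call such as fetch_files(["https://example.com/data/report.csv"], "downloads") (a placeholder URL) would save report.csv under a downloads directory. Running the same call from a scheduler is one simple way to handle files that need to be retrieved regularly.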

22.1.1 Using Python to fetch files from an FTP server

22.1.2 Fetching files with SFTP

22.1.3 Retrieving files over HTTP/HTTPS

22.2 Fetching data via an API

22.3 Structured data formats

22.3.1 JSON data

22.3.2 XML data

22.4 Scraping web data

22.5 Tracking the weather

22.5.1 Solving the problem with AI-generated code

22.5.2 Solutions and discussion

Summary