A free extract from The Complete Obsolete Guide to Generative AI, an online version of the Manning book.

4 Creating with media resources

 

This chapter covers

  • Generating digital images and video
  • Using AI-assisted video editing and text-to-video tools
  • Generating presentation resources
  • Converting audio to text and text to audio

Text and programming code are natural targets for generative AI. After binary, those are the languages with which your computer has the most experience. So it was no great surprise that generative AI could produce the kinds of resources we discussed in the previous chapter.

But images, audio, and video are a very different story. That’s because visual and audio data

  • Are inherently more complex and high-dimensional than text
  • Lack symbolic representations and have more nuanced meaning, making it challenging to directly apply traditional programming techniques
  • Can be highly subjective and ambiguous, making it difficult to build automated systems that can consistently and accurately interpret such data
  • Lack inherent context, making it harder for computer systems to confidently derive meaning
  • Require significant computational resources for processing
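
To give a rough sense of the "high-dimensional" and "computational resources" points above, here is a minimal back-of-the-envelope sketch. Every figure in it (word count, image resolution, sample rate, frame rate) is an assumed, illustrative value rather than a number from this chapter, and it counts raw, uncompressed values only.

```python
# Rough count of raw values a system must handle for each media type.
# All dimensions are assumed examples chosen only for illustration.

text_chars = 1_000 * 6                      # ~1,000 words at about 6 characters each
image_values = 1024 * 1024 * 3              # one 1024x1024 image, 3 color channels per pixel
audio_samples = 44_100 * 60                 # one minute of mono audio sampled at 44.1 kHz
video_values = 1920 * 1080 * 3 * 30 * 60    # one minute of 1080p RGB video at 30 frames/second

sizes = {
    "1,000 words of text": text_chars,
    "One 1024x1024 image": image_values,
    "One minute of audio": audio_samples,
    "One minute of 1080p video": video_values,
}

for name, count in sizes.items():
    ratio = count / text_chars
    print(f"{name:<28}{count:>16,} values  ({ratio:,.0f}x the text)")
```

Even with these modest assumptions, a single image involves hundreds of times more raw values than a page or two of text, and a minute of video involves millions of times more.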

Nevertheless, tools for generating media resources have been primary drivers of the recent explosion of interest in AI. So the rest of this chapter will be dedicated to exploring the practical use of AI-driven digital media creation services.

Generating images

Providing detailed prompts

Prompting for images

Generating video

AI-assisted video editing

Text-to-video slide shows

Generating presentation resources

Generating voice

Audio transcriptions

Generating music

Try this for yourself

Summary