
Generative AI (genAI) tools are trained on massive data sets (of words, images, videos, and code). They use those data sets to predict, automate, and generate output that mimics human-generated content.
For instance, ChatGPT is an artificial intelligence tool that composes answers to user-submitted questions or prompts. Its free version (GPT-3.5 as of December 2023) was trained on a data set of billions of words (scraped from the internet, prior to 2022) and was further trained by humans to generate sequences of words that mimic human writing and its conventions. It can also revise its output in response to follow-up questions or prompts, adjusting the content, tone, or style of its answers. The term “intelligence,” however, is a bit misleading. The responses ChatGPT generates often include false or unreliable information (commonly called “hallucinations”). Additionally, ChatGPT lacks the ability to analyze (since it is just predicting text), and it cannot provide information about current events (its data set currently ends in 2021).
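To make "just predicting text" concrete, here is a minimal sketch in Python of a toy next-word predictor. It is an illustration only, not how ChatGPT is built: ChatGPT uses a large neural network, while this sketch simply counts which word most often follows another in a tiny sample. The function names and the sample sentence are invented for this example.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word][next_word] += 1
    return followers


def generate(followers, start_word, max_words=8):
    """Repeatedly append the most frequent follower of the last word."""
    output = [start_word]
    for _ in range(max_words - 1):
        candidates = followers.get(output[-1])
        if not candidates:
            break  # the last word never had a follower in the training text
        output.append(candidates.most_common(1)[0][0])
    return " ".join(output)


# Hypothetical, deliberately tiny "training data"
training_text = "the cat sat on the mat and the cat sat on the sofa"
model = train_bigram_model(training_text)

print(generate(model, "the", max_words=6))  # prints "the cat sat on the cat"
```

The output reads like fluent English, yet the program has no idea what a cat or a sofa is; it only repeats the most common word patterns in its sample. That is a small-scale version of why a text predictor can produce confident-sounding statements that are false.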
Image generators such as DALL-E or Midjourney work in similar ways: they are trained on large volumes of images tagged with descriptive terms, which they can then mimic and recombine to match the words provided in a prompt. Video generators also exist, as do code generators such as Codex or GitHub Copilot, which turn plain language into code or make predictive suggestions for improving existing code.
A good place to start building your understanding of how genAI works is with these three articles, written for popular media. Feel free to share them with your students, too.