Understanding Markov Chains

This assignment, by Gabriel Egan, is from the TextGenEd collection in the WAC Clearinghouse Repository.

The abstract from the site explains:

In this undergraduate assignment, students use a manually applied algorithm to generate a Markov Chain from a given short extract of language. Included here are precise instructions with diagrams for two activities where students develop structures to generate text based on probabilities. Through these game-like activities, students discover that Markov Chains efficiently embody the writer’s preference for following one particular word with another, which lays the foundation for discussion of how probabilistic language-generation models work. The assignment gives students a concrete way to explore and visualise the building blocks of various language models and understand their implications for linguistics. Any students able to distinguish the essential parts-of-speech such as verb, noun, article, adjective, and relative pronoun should be able to complete the assignment with proper support. (All students able to speak English will already have learnt the meaning of these terms at some point, but a short refresher might be wanted to bring everyone up to the same speed in identifying examples of them in practice.) The assignment has been used to help Creative Writing students understand how Artificial Intelligence is able to produce writing that sounds like it came from a human. In the “Follow Up” section suggestions are given for how more specialist linguistic teaching can be built on this basis, including an exploration of the competing theories for how humans generate new sentences.

Key Features of This Assignment

Hands-on Algorithm Application
Students manually apply an algorithm to generate a Markov Chain from a given text extract, providing a concrete and interactive way to understand probabilistic language generation.
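The assignment has students build the chain by hand, but the same procedure can be sketched in code: map each word in an extract to the list of words that follow it, then generate new text by repeatedly picking a random follower. The extract and function names below are illustrative, not taken from the assignment itself.

```python
import random

def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = {}
    for current, following in zip(words, words[1:]):
        chain.setdefault(current, []).append(following)
    return chain

def generate(chain, start, length=8, seed=0):
    """Walk the chain from a start word, picking followers at random."""
    rng = random.Random(seed)
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            break  # dead end: this word never appears mid-text
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)

extract = "the cat sat on the mat and the cat ran"
chain = build_chain(extract)
print(chain["the"])  # ['cat', 'mat', 'cat'] -- 'the' is followed by 'cat' twice
print(generate(chain, "the"))
```

Because "cat" appears twice in the follower list for "the", the generator is twice as likely to continue with "cat" as with "mat" — the same preference the manual activity makes visible.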
Visual and Structural Learning
The assignment includes detailed instructions and diagrams for activities in which students build and visualise the structures that generate text from word probabilities, deepening their understanding of how language models work.
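The follower counts that students tabulate can be turned into explicit transition probabilities, which is the structure the diagrams visualise. As a rough sketch (the example sentence is invented for illustration):

```python
from collections import Counter

def transition_table(text):
    """Count how often each word follows each other word,
    then normalise the counts into probabilities."""
    words = text.split()
    counts = {}
    for current, following in zip(words, words[1:]):
        counts.setdefault(current, Counter())[following] += 1
    return {
        word: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for word, c in counts.items()
    }

table = transition_table("the cat sat on the mat and the cat ran")
# 'the' is followed by 'cat' with probability 2/3 and 'mat' with 1/3
print(table["the"])
```

Each row of such a table sums to 1, so it can be read directly as "given this word, how likely is each possible next word" — the core idea behind probabilistic language generation.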
Foundation for Advanced Discussions
By exploring how Markov Chains predict word sequences, students lay the groundwork for more advanced discussions on probabilistic models in linguistics and AI, with suggestions for further linguistic explorations included in the follow-up section.