The idea for this activity was stolen from my colleague Paul Fyfe of Florida State University; I first saw Prof. Fyfe present his assignment at MLA 2012. Prof. Fyfe describes his version of the assignment in “How Not to Read a Victorian Novel,” Journal of Victorian Culture 16, no. 1 (April 2011). Here’s how Paul introduces the assignment for his students:
Franco Moretti was dissatisfied with how literary scholars accept just a handful of possible texts as representative of cultural eras. Even if those texts are diverse and interesting, how can they possibly represent broader trends at scale? Moretti wants to change our sense of literary history by enlarging it, or by increasing our critical distance from it. He coined the phrase “distant reading” as an approach to analyzing lots and lots of texts instead of an unrepresentative few. Distant reading uses other modes of analysis and models of interpretation than the “close reading” we are familiar with. In his own work, Moretti compiles textual information from lots and lots of novels into maps, graphs, and logical trees. Seen this way, texts can reveal new patterns and language trends than we could otherwise discover close up. An array of digital visualization and text analysis tools now make Moretti’s methods more accessible to the casual user. The first paper will be an experiment in using these tools. We will consider “distance” not only as the subject of our course but also as a potential mode of reading and interpretation. What does literary criticism and analysis look like if we accept distance “as a condition of knowledge”?
Distance is a pretty good approach to the Victorian novel, considering that 40,000+ books of prose fiction were published in the last two-thirds of the nineteenth century. No one can read them all. But perhaps we can learn how to not read them. As Moretti and others have demonstrated, digital technology provides lots of interesting ways of doing this. Using some selected tools, you will analyze a big Victorian novel and then write a paper explaining your questions and insights. There’s one catch: it has to be a book you have never read.
English classes more typically emphasize close reading than “not reading.” This exercise will be new to many of you. So will the technology and the interfaces. The paper requires thinking about texts in a very different way than you might be used to. There may be dead ends; on the other hand, there will be no wrong answers. This preludes three important points:
- Play. Experiment. This assignment is as much about testing the methods as it is learning about the text. The goal here is not to reconstruct a missing story, but to “read” the novel in a fundamentally different way, and to think about the implications of doing so.
- Ask for help. Please don’t struggle with the technology, or tear hair in confusion about the assignment. Visit my office hours or email for an appointment if you’d like to go over this, work out a problem, or discuss how to talk about your results.
- Use frustration creatively. This is perhaps the hardest and most essential trick. If you hit a dead end, feel frustrated, or get null results, how can you use that to learn? In other words, what might be the values of that frustration or failure in thinking about your critical approach? Try to take any moment of frustration as instead an opportunity to reflect on the kinds of questions you are asking and how you might change them.
Ready to get started?
Okay, got all that? Here’s how, in today’s proseminar, we’ll be engaging in the kinds of experimentation and play that Prof. Fyfe describes:
Choose a work to not read.
Remember that you must choose something you’ve never read before. I recommend that you choose a text you know is closely related to the key work you’ve been studying throughout the proseminar: e.g. an influencing work; another work by the same author; a work by someone in your author’s literary, political, or social circle; a work that shares themes with your key work; a work that critics of your key work often cite in their analyses. You must either choose a work whose entire text is available online, perhaps on Project Gutenberg, or scan and OCR the work (we’ll discuss this in class) before analyzing it.
Download the “plain text” (.txt) version of the text to your computer. Open that file in a plain text editor (TextEdit on the Mac or WordPad on Windows) and delete all of the text at the front and back of the file that isn’t part of the work itself. You want the file to include only the words of the novel itself, not any of the legal language or the metadata. Save the file as a plain text (.txt) file.
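If you would rather not trim the file by hand, the same cleanup can be sketched in a few lines of Python. This is only a rough sketch: it assumes your file came from Project Gutenberg and uses the usual "*** START OF ..." and "*** END OF ..." marker lines, so check your own file before trusting it. The function name and the sample text are made up for illustration.

```python
import re

# Assumed marker patterns: Project Gutenberg files usually wrap the work in
# lines such as "*** START OF THE PROJECT GUTENBERG EBOOK ... ***" and
# "*** END OF THE PROJECT GUTENBERG EBOOK ... ***". Verify against your file.
START_RE = re.compile(
    r"^\*\*\* ?START OF (THE|THIS) PROJECT GUTENBERG EBOOK.*$", re.MULTILINE
)
END_RE = re.compile(
    r"^\*\*\* ?END OF (THE|THIS) PROJECT GUTENBERG EBOOK.*$", re.MULTILINE
)


def strip_gutenberg_boilerplate(raw_text):
    """Return only the text between the start and end marker lines.

    If a marker is missing, that side of the text is left untouched.
    """
    start = START_RE.search(raw_text)
    end = END_RE.search(raw_text)
    begin_index = start.end() if start else 0
    end_index = end.start() if end else len(raw_text)
    return raw_text[begin_index:end_index].strip()


# Invented sample standing in for a downloaded file.
sample = (
    "Title: An Example\n"
    "*** START OF THE PROJECT GUTENBERG EBOOK AN EXAMPLE ***\n"
    "Chapter 1\n"
    "It was a dark night.\n"
    "*** END OF THE PROJECT GUTENBERG EBOOK AN EXAMPLE ***\n"
    "License text follows here.\n"
)
print(strip_gutenberg_boilerplate(sample))
```

To use it on a real file, read the file into a string, pass it to the function, and write the result back out as a new .txt file. Skim the first and last pages of the result either way, since front matter such as a table of contents or transcriber's notes sits inside the markers.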
Make some predictions
What do you think this work is about? How do you think it relates to your key work? Can you list some predicted themes, characters, plot elements, or stylistic characteristics of the text before reading a word of it? Write your ideas down in a document you can refer back to later.
Make word clouds
When provided with a bunch of text, tag cloud or word cloud engines will return you a graphical representation of the most common words: the more frequently a word appears in the text, the larger it appears relative to other words on the screen. On the ProfHacker blog, Julie Meloni called word clouds a “gateway drug” to textual analysis. Wordle is nice for making word clouds because, once your word cloud gets generated, you can toggle common English words (e.g. and, the, if) on or off, and you can customize or even “randomize” the display, allowing you different visualizations of the data. Using the text of your chosen work, experiment with Wordle until you get comfortable with the interface. Then run a couple of different tests with Wordle, making notes of your observations along the way:
- Generate a cloud for the whole text. How might you “read” this? Come up with a few different observations. What kinds of words are there? Are there patterns or in/consistencies in the words, or in what is relatively more or less frequent?
- Try breaking the book into chapters or sections. Paste individual sections in, generate word clouds, and see what you can regenerate from a “distant” perspective.
- Play with stoplists: in Wordle, toggle on/off the common English words. (You can also create your own custom stoplist, which is a little more advanced.)
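For the curious, what a word cloud engine counts can be approximated in a few lines of Python. This is a minimal sketch, not Wordle's actual method: the tokenizing rule, the tiny stoplist, and the function names are all assumptions made for illustration.

```python
import re
from collections import Counter

# A tiny illustrative stoplist (an assumption); real stoplists are far longer.
STOPLIST = {"the", "and", "if", "a", "an", "of", "to", "in", "it", "was"}


def tokenize(text):
    """Lowercase the text and pull out words, keeping internal apostrophes."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())


def top_words(text, n=10, stoplist=frozenset()):
    """Return the n most frequent words, skipping any word in the stoplist."""
    counts = Counter(word for word in tokenize(text) if word not in stoplist)
    return counts.most_common(n)


# Invented sample sentence; substitute the text of your chosen work.
sample = "The house was dark, and the house was cold; the garden was not."
print(top_words(sample, n=3))                     # stoplist off
print(top_words(sample, n=3, stoplist=STOPLIST))  # stoplist on
```

Running the function with and without the stoplist mirrors toggling common English words in Wordle. To try the chapter-by-chapter experiment, pass one chapter's text at a time.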
Reveal your texts
Word clouds are a first step. Next, we will run (slightly) more sophisticated text analysis software on the file using tools provided by Voyant (Voyant has had server troubles lately; if that link doesn’t work, use this link to the software on another server). Upload the plain text version of your chosen novel and click “reveal.” Initially Voyant’s results will look much like Wordle’s. You’ll see a word cloud in the top left corner of the screen (you can turn on stoplists for the word cloud by clicking the gear icon at the top of the word cloud window), a summary of results below it, and the full text of your chosen work in the center. If you click “more…” in the summary window, however, another window will open below it showing the “words in the entire corpus.” “Corpus” means “a collection of written works,” and Voyant can be used to analyze many texts together; in this case, however, your corpus is one work.
Look at the words by frequency. You might have to scroll through a few pages before you get past common words such as “the,” “and,” and so on. What are the first few less common words that appear most frequently in your novel? Double click one of the words listed, and a new set of tools will open on the right side of the window. You can look at “word trends,” which plots the relative frequency of words at different points in your novel. Below this you can click to open “Keywords in context,” which shows the words that appear around the word you’re analyzing within the text. If you look at the text in the center of the window, you’ll see that there’s now a “heat map” running along its left-hand margin which shows where your chosen word appears most frequently within the text. Jot down some notes about this word, and then compare those results with several other words in the “Words in the Entire Corpus” menu.
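If you want a feel for what “word trends” and “keywords in context” are doing, here is a rough Python sketch of both ideas. It is not Voyant's code: the function names, the equal-sized segments, and the five-word window are assumptions chosen for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and pull out words, keeping internal apostrophes."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())


def word_trend(tokens, word, segments=10):
    """Relative frequency of `word` in each of `segments` slices of the text."""
    size = max(1, len(tokens) // segments)
    trend = []
    for i in range(segments):
        if i == segments - 1:
            chunk = tokens[i * size:]  # the final slice absorbs leftovers
        else:
            chunk = tokens[i * size:(i + 1) * size]
        trend.append(chunk.count(word) / len(chunk) if chunk else 0.0)
    return trend


def keywords_in_context(tokens, word, window=5):
    """Return each occurrence of `word` with `window` words on either side."""
    lines = []
    for i, token in enumerate(tokens):
        if token == word:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


# Invented sample text; substitute the text of your chosen work.
tokens = tokenize("The house was dark and the house was cold. The garden was not.")
print(word_trend(tokens, "house", segments=2))
print(keywords_in_context(tokens, "garden", window=2))
```

The trend list is the raw material for a line graph: one number per slice of the novel, in reading order. The context lines do the same job as Voyant's “Keywords in context” panel, letting you see what company a word keeps without reading the pages around it.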
Some questions to consider as you play with Voyant: does more focused attention to word frequency change your opinions about your book? What about scarce or infrequent words? What still don’t you know? In other words, what additional information might you need to gain insights? What insights, if any, do these tools provide? What keywords or patterns did you pursue and why? What might you suspect are the values and/or limitations of “not reading” this way? Where might it be useful in future research projects or in analyzing other kinds of texts?
Explore the wonderful world of Ngrams
Google’s Ngram Viewer displays the frequency of words over time by drawing on the massive Google Books corpus, which includes the text of more than 15 million books. For more on Ngrams, check out the Culturomics site. Choose several of the words you’ve concentrated on in your previous analyses and enter them into the Ngram Viewer. Look at the frequency of those words through time, paying particular attention to their frequency when your chosen novel was published. Do any of them stand out, either as particularly common words during their time or, perhaps as interestingly, as particularly uncommon words during their time? Try a few more words from the frequency lists you generated in Voyant earlier. Then, try comparing some of the keywords from your chosen work with some keywords from your key work. Do any interesting comparisons emerge? The big question here: can a tool like the Ngram Viewer, which analyzes so many texts, help you understand anything about the historical place of a book you’ve never read?
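The core idea behind a frequency-over-time graph can be sketched in Python: for each year, divide the number of times a word appears by the total number of words published that year. This illustrates the general principle only and makes no claim about how Google computes its figures. The function name and the toy corpus are invented.

```python
from collections import defaultdict


def yearly_relative_frequency(corpus, word):
    """Share of all tokens in each year that are `word`.

    `corpus` is an iterable of (year, list_of_tokens) pairs, one per book.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, tokens in corpus:
        totals[year] += len(tokens)
        hits[year] += tokens.count(word)
    return {
        year: hits[year] / totals[year]
        for year in sorted(totals)
        if totals[year]
    }


# Invented toy data, purely for illustration; not real publication figures.
toy_corpus = [
    (1850, "the railway came to the town".split()),
    (1850, "the town slept".split()),
    (1860, "railway upon railway crossed the county".split()),
]
print(yearly_relative_frequency(toy_corpus, "railway"))
```

Dividing by each year's total matters because far more books survive from some years than others. Raw counts would mostly chart how much was printed, not how a word's popularity changed.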
Rerun your analyses incorporating your key work
Next, return to Voyant with the text of your key work. Run the same experiments with it that you ran with the work you’d never read. What textual trends emerge from your analysis of the key work? Does Voyant reveal anything about your key work that surprises you—a word or phrase you would not have predicted to be as prominent as it is, perhaps? Try running both texts through Voyant together (a corpus of two works). What does analyzing the texts together reveal?
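One simple way to compare two works of different lengths is to express each word's count per 10,000 words. The sketch below shows that normalization in Python. It is an assumption about one reasonable approach, not a description of what Voyant does internally, and the function names and sample strings are invented.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out words, keeping internal apostrophes."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())


def per_10k(text):
    """Map each word to its number of occurrences per 10,000 words."""
    tokens = tokenize(text)
    if not tokens:
        return {}
    counts = Counter(tokens)
    return {word: count * 10000 / len(tokens) for word, count in counts.items()}


def compare(text_a, text_b, words):
    """Return {word: (rate in text_a, rate in text_b)} for the chosen words."""
    rates_a, rates_b = per_10k(text_a), per_10k(text_b)
    return {w: (rates_a.get(w, 0.0), rates_b.get(w, 0.0)) for w in words}


# Invented stand-ins for your unread work and your key work.
unread_work = "love love war peace"
key_work = "war war war love"
print(compare(unread_work, key_work, ["love", "war", "duty"]))
```

A word with a much higher rate in one text than the other is a good candidate for closer attention, including a second look in the “Keywords in context” tool.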
During the upcoming week…
Read the first chapter
Now that you’ve not read the entire work, go back and actually read its first chapter or section. Did the textual analyses you performed prepare you to understand the themes, character, setting, or any other aspects of this first chapter? Are there ideas you expected to encounter based on your textual analysis, but didn’t? Were there ideas in the first chapter that seem entirely unrelated to the analyses you performed beforehand? If you have time, read further, keeping the same questions in mind.
Write a short reflection
Finally, you’ll write a paper about what you did and what you learned. Please keep the emphasis on what you learned: a) about your chosen text, b) about your key work, and c) about this kind of “distant reading.” I’m interested in your speculations, your thoughtful reflections on text analysis. Grades will be based on how thoughtfully you engage with the assignment and how clearly those thoughts are expressed in prose. You do not need a central argument (although it’s fine if you have one). The goal of this assignment is to ruminate on what kinds of knowledge a distant reading can or cannot produce. In other words, it encourages you to think about how textual analysis changes our attention to texts. A good paper can have lots of unanswered questions. Good questions are evidence of thoughtfulness.