On Ignoring Encoding

Lately we’ve seen a spate of articles castigating the digital humanities—perhaps most prominently, Adam Kirsch’s piece in the New Republic, “Technology Is Taking Over English Departments: The False Promise of the Digital Humanities.” I don’t plan in this post to take on the genre or refute the criticisms of these pieces one by one; Ted Underwood and Glen Worthey have already made better global points than I could muster. My biggest complaint about the Kirsch piece—and the larger genre it exemplifies—would echo what many others have said: these pieces purport to critique a wide field in which their authors seem to have done very little reading. Also, as Roopika Risam notes, many of these pieces conflate “digital humanities” with the DH that happens in literary studies, leaving digital history, archeology, classics, art history, religious studies, and the many other fields that contribute to DH out of the narrative. In this way these critiques echo conversations happening within the DH community about its diverse genealogies, such as Tom Scheinfeldt’s The Dividends of Difference, Adeline Koh’s Niceness, Building, and Opening the Genealogy of the Digital Humanities, or Fiona M. Barnett’s “The Brave Side of Digital Humanities.”

Even taken as critiques of only digital literary studies, however, pieces such as Kirsch’s problematically conflate “big data” or “distant reading” with “the digital humanities,” seeing large-scale or corpus-level analysis as the primary activity of the field rather than one activity of the field, and explicitly excluding DH’s traditions of encoding, archive building, and digital publication. I have worked and continue to work in both these DH traditions, and have been struck by how reliably one is recognized—to be denounced—while the other is ignored or disregarded. The formula for denouncing DH seems at this point well established, though the precise order of its elements sometimes shifts from piece to piece:

  1. Juxtapose Aiden and Michel’s “culturomics” claims with the stark limitations of the Ngrams viewer.
  2. Cite Stephen Ramsay’s “Who’s in and Who’s Out,” specifically the line “Do you have to know how to code? I’m a tenured professor of digital humanities and I say ‘yes.’” Bemoan the implications of this statement.
  3. Discuss Franco Moretti on “distant reading.” Admit that Moretti is the most compelling of the DH writers, but remain dissatisfied with the prospects for distant reading.

These critiques are worth airing, though they’re not particularly surprising—if only because the DH community has been debating these ideas in books, blog posts, and journal articles for a long while now. Matt Jockers’ Macroanalysis alone could serve as a useful introduction to the contours of this debate within the field.

More problematically, however, by focusing on Ramsay and Moretti, these pieces ignore the field-constitutive work of scholars such as Julia Flanders, Bethany Nowviskie, and Susan Schreibman. This vision of DH is all Graphs, Maps, Trees and no Women Writers Project. All coding and no encoding.


Mr. Penumbra, Distant Reading, and Cheating at Scholarship

My Technologies of Text course is capping this semester reading Robin Sloan’s novel, Mr. Penumbra’s 24-Hour Bookstore, which Matt Kirschenbaum deemed “the first novel of the digital humanities” last year. Mr. Penumbra is a fine capstone because it thinks through so many of our course themes: the (a)materiality of reading, the book (and database) as physical objects, the relationship between computers and previous generations of information technology, &c. &c. &c. I will try not to spoil much of the book here, but I will of necessity give away some details from the end of the first chapter. So if you’ve not yet read it: go thou and do so.

Rereading the book for class, I was struck by one exchange between the titular Mr. Penumbra—bookstore owner and leader of a group of very close readers—and the narrator, Clay Jannon—a new bookstore employee curious about the odd books the store’s odd club members check out. In an attempt to understand what the club members are up to, Clay scans one of the store’s logbooks, which records the comings and goings of club members, the titles of the books they checked out, and when they borrowed each one. When he visualizes these exchanges over time within a 3d model of the bookstore itself, visual patterns of borrowing emerge, which seem, when compiled, to reveal an image of a man’s face. When Clay shows this visualization to Mr. Penumbra, they have an interesting exchange that ultimately hinges on methodology.

Omeka/Neatline Workshop Agenda and Links

We’ll be working with the NULab’s Omeka Test Site for this workshop. You should have received login instructions before the workshop. If not, let us know so we can add you.

Workshop Agenda

9:00-9:15 Coffee, breakfast, introductions
9:15-9:45 Omeka project considerations

9:45-10:30 The basics of adding items, collections, and exhibits
10:30-10:45 Break!
10:45-11:15 Group practice adding items, collections, and exhibits
11:15-12:00 Questions, concerns
12:00-1:30 LUNCH!
1:30-2:15 Georectifying historical maps with WorldMap Warp
2:15-3:00 The basics of Neatline
3:00-3:15 Break!
3:15-3:45 Group practice creating Neatline exhibits
3:45-4:00 Final questions, concerns
4:00-5:00 Unstructured work time

Sample Item Resources

Historical Map Resources

Omeka Tutorial

Neatline Tutorials

Model Neatline Exhibits

Representing the “Known Unknowns” in Humanities Visualizations

Note: If this topic interests you, you should read Lauren Klein’s recent article in American Literature, “The Image of Absence: Archival Silence, Data Visualization, and James Hemings,” which does far more justice to the topic than I do in my scant paragraphs here.

Pretty much every time I present the Viral Texts Project, the following exchange plays out. During my talk I will have said something like, “Using these methods we have uncovered more than 40,000 reprinted texts from the Library of Congress’ Chronicling America collection, many hundreds of which were widely reprinted—and most of which have not been discussed by scholars.” During the Q&A following the talk, a scholar will inevitably ask, “you realize you’re missing lots of newspapers (and/or lots of the texts that were reprinted), right?”

To which my first instinct is exasperation. Of course we’re missing lots of newspapers. The majority of C19 newspapers aren’t preserved anywhere, and the majority of archived newspapers aren’t digitized. But the ability to identify patterns across large sets of newspapers is, frankly, transformative. The newspapers that have been digitized under the Chronicling America banner are actually the product of many state-level digitization efforts, which means we’re able to study patterns across collections that were housed in many separate physical archives, providing a level of textual address not impossible, but very difficult in the physical archive. So my flip answer—which I never quite give—is “yes, we’re missing a lot. But 40,000 new texts is pretty great.”

But those questions do nag at me. In particular I’ve been thinking about how we might represent the “known unknowns” of our work,1 particularly in visualizations. I really started picking at this problem after discussing the Viral Texts work with a group of librarians. I was showing them this map,

which transposes a network graph of our data onto a map which merges census data from 1840 with the Newberry Library’s Atlas of Historical County Boundaries. One of the librarians was from New Hampshire, and she told me she was initially dismayed that there were no influential newspapers from New Hampshire, until she realized that our data doesn’t include any newspapers from New Hampshire, because that state has not yet contributed to Chronicling America. She suggested our maps would be vastly improved if we somehow indicated such gaps visually, rather than simply talking about them.

In the weeks since then, I’ve been experimenting with how to visualize those absences without overwhelming a map with symbology. The simplest solution, as almost always, appears to be the best.

In this map I’ve visualized the 50 reprintings we have identified of one text, a religious reflection by Nashville editor George D. Prentice, often titled “Eloquent Extract,” between 1836 and 1860. The county boundaries are historical, drawn from the Newberry Atlas, but I’ve overlaid modern state boundaries with shading to indicate whether we have significant, scant, or no open-access historical newspaper data from those states. This is still a blunt instrument. Entire states are shaded, even when our coverage is geographically concentrated. For New York, for instance, we have data from a few NYC newspapers and magazines, but nothing yet from the north or west of the state.
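For readers who want to try something similar, here is a minimal Python sketch of the shading logic described above. It is only an illustration under stated assumptions, not the code behind these maps: the function names, the input format, and especially the cutoff between “scant” and “significant” are hypothetical, since the post does not define one. The point it tries to capture is that states with no data are listed explicitly rather than silently dropped.

```python
from collections import Counter

# Hypothetical cutoff: the post does not say where "scant" ends and
# "significant" begins, so this number is purely illustrative.
SIGNIFICANT_MIN_TITLES = 5


def coverage_level(title_count, significant_min=SIGNIFICANT_MIN_TITLES):
    """Classify a state's digitized coverage as 'none', 'scant', or 'significant'."""
    if title_count <= 0:
        return "none"
    if title_count < significant_min:
        return "scant"
    return "significant"


def shade_states(all_states, digitized_titles):
    """Return a {state: coverage level} mapping for every state on the map.

    all_states: every state to be drawn, including those with no data.
    digitized_titles: iterable of (state, newspaper_title) pairs.
    """
    # Deduplicate so a title listed twice is only counted once per state.
    counts = Counter(state for state, _title in set(digitized_titles))
    # Iterate over all_states, not over counts, so that absences are
    # recorded as "none" instead of simply going missing.
    return {state: coverage_level(counts.get(state, 0)) for state in all_states}


if __name__ == "__main__":
    # Invented sample data, for demonstration only.
    sample = [("New York", "Paper A"), ("New York", "Paper B")]
    for state, level in shade_states(["New Hampshire", "New York"], sample).items():
        print(state, level)
```

The resulting mapping could then be handed to whatever mapping tool draws the state polygons, with one fill style per coverage level.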

Nevertheless, I’m happy with these maps as helping me begin to think through how I can represent the absences of the digital archives from which our project draws. And indeed, I’ve begun thinking about how such maps might help us agitate—in admittedly small ways—for increased digitization and data-level access for humanities projects.

This map, for instance, visualizes the 130 reprints of that same “Eloquent Extract” which we were able to identify searching across Chronicling America and a range of commercial periodicals archives (and huge thanks to project RA Peter Roby for keyword searching many archives in search of such examples). For me this map is both exciting and dispiriting, pointing to what could be possible for large-scale text mining projects while simultaneously emphasizing just how much we are missing when forced to work only with openly available data. If we had access to a larger digitized cultural record we could do so much more. A part of me hopes that if scholars, librarians, and others see such maps they will advocate for increased access to historical materials in open collections. As I said in my talk at the recent C19 conference:

While the dream of archival completeness will always and forever elude us—and please do not mistake the digital for “the complete,” which it never has been and never will be—this map is to my mind nonetheless sad. Whether you consider yourself a “digital humanist” or not, and whether you ever plan to leverage the computational potential of historical databases, I would argue that the contours and content of our online archive should be important to you. Scholars self-consciously working in “digital humanities” and also those working in literature, history, and related fields should make themselves heard in conversations about what will become our digital, scholarly commons. The worst possible thing today would be for us to believe this problem is solved or beyond our influence.

In the meantime, though, we’re starting conversations with commercial archive providers to see if they would be willing to let us use their raw text data. I hope maps like this can help us demonstrate the value of such access, but we shall see how those conversations unfold.

I will continue thinking about how to better represent absence as the geospatial aspects of our project develop in the coming months. Indeed, the same questions arise in our network visualizations. Working with historical data means that we have far more missing nodes than many network scientists working, for instance, with modern social media data. Finding a way to represent missingness—the “known unknowns” of our work—seems like an essential humanities contribution to geospatial and network methodologies.

1. Yes, I’m borrowing a term from Donald Rumsfeld here, which seems like a useful term for thinking about archival gaps, while perhaps not such a useful term for thinking about starting a war. We can blame this on me watching an interview with Errol Morris about The Unknown Known on The Daily Show last night.

Boston DH Consortium Session #3 Breakout Group Notes

For breakout groups in the “Out-of-the-Box” DH Tools session at the Boston-Area DH Consortium Faculty Retreat (Fall 2013):


Oxygen/TEI BP





Creating a Historical Map with GIS

In the next few days I’ll be teaching a few workshops centered largely on teaching participants to georeference historical maps using ArcGIS. I’ll do this first at the Northeastern English Graduate Student Association’s 2013 Conference, /alt, and then at the Boston-Area Days of DH conference we’re hosting at the NULab March 18-19.

We’ll be learning a few things in this workshop:

  1. How to add base maps and other readily-importable data to ArcGIS
  2. How to plot events in ArcGIS using spreadsheet data
  3. How to georeference a historical map in ArcGIS

For that last goal, this step-by-step guide by Kelly Johnston should be your go-to reference. We’ll be following Kelly’s instructions almost to the letter, though we’ll be using different data.
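If you are curious what georeferencing computes behind the scenes, the following Python sketch may help. It is not ArcGIS code and uses no ArcGIS API; in the workshop everything happens through the ArcGIS interface. The sketch only illustrates the general idea of a first-order (affine) transformation, the simplest kind that GIS tools typically offer: given three control points that pair a pixel on the scanned map with a known longitude and latitude, it solves for a function that converts any pixel to map coordinates. All names and coordinates here are made up for illustration.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve_coefficients(pixels, targets):
    """Solve target = p*col + q*row + r for (p, q, r) using Cramer's rule."""
    base = [[col, row, 1.0] for col, row in pixels]
    d = det3(base)
    if abs(d) < 1e-12:
        raise ValueError("control points must not lie on a single line")
    coefficients = []
    for column in range(3):
        # Replace one column of the base matrix with the target values.
        m = [r[:] for r in base]
        for r in range(3):
            m[r][column] = targets[r]
        coefficients.append(det3(m) / d)
    return tuple(coefficients)


def fit_affine(control_points):
    """Build a pixel -> (lon, lat) function from exactly three control points.

    control_points: three ((col, row), (lon, lat)) pairs.
    """
    if len(control_points) != 3:
        raise ValueError("this sketch expects exactly three control points")
    pixels = [pixel for pixel, _world in control_points]
    lon_p, lon_q, lon_r = solve_coefficients(pixels, [w[0] for _p, w in control_points])
    lat_p, lat_q, lat_r = solve_coefficients(pixels, [w[1] for _p, w in control_points])

    def to_world(col, row):
        return (lon_p * col + lon_q * row + lon_r,
                lat_p * col + lat_q * row + lat_r)

    return to_world


if __name__ == "__main__":
    # Invented control points: three corners of a hypothetical scanned map.
    to_world = fit_affine([
        ((0, 0), (-71.2, 42.5)),
        ((1000, 0), (-70.9, 42.5)),
        ((0, 800), (-71.2, 42.3)),
    ])
    print(to_world(500, 400))
```

With more than three control points, GIS software generally fits the transformation by least squares and reports the residual error, which this three-point sketch does not attempt.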

We’ll be using these files for the lab. This tutorial, prepared for my graduate digital humanities class, walks through the same steps we’ll follow, in case you need to review a step here or later:

A few other worthwhile links:

  • The Spatial Humanities site is a useful clearinghouse of both spatial theory and praxis across a range of humanities fields. Kelly Johnston’s step-by-step above is only one of a growing collection of such resources on the Spatial site.
  • The David Rumsey Historical Map Collection. If you want a historical map with which to practice—or, frankly, for your research—this is an excellent first stop. In short, it’s many thousands of historical maps, provided for free. In order to download high-resolution versions of the maps, you must create a (free) account and log in.
  • Neatline is an incredibly robust Omeka plugin that allows you to create spatial exhibits of your collected materials. Check out some of the demos—it’s really phenomenal stuff. We won’t have time to go over Neatline, but one could, for instance, make use of a map georeferenced in ArcGIS as a base map for a Neatline exhibit.
  • Hypercities is another important spatial humanities platform that makes use of Google Earth and allows users to build “deep maps” of spatial data, historical maps, images, video, and text. Check out some of their collections to see what Hypercities can do. The collections around Los Angeles, Berlin, and Rome are particularly robust.

Finally, two spatial non sequiturs:

Mea Culpa: on Conference Tweeting, Politeness, and Community Building

Kathleen Fitzpatrick’s post “If You Can’t Say Anything Nice,” about public shaming on Twitter, came at a timely moment for me. Describing the culture of Twitter commentary, she writes:

You get irritated by something — something someone said or didn’t say, something that doesn’t work the way you want it to — you toss off a quick complaint, and you link to the offender so that they see it. You’re in a hurry, you’ve only got so much space, and (if you’re being honest with yourself) you’re hoping that your followers will agree with your complaint, or find it funny, or that it will otherwise catch their attention enough to be RT’d.

I’ve done this, probably more times than I want to admit, without even thinking about it. But I’ve also been on the receiving end of this kind of public insult a few times, and I’m here to tell you, it sucks.

I read this post while at a conference, and as I read it realized that I’d been guilty of just this kind of ungenerous commentary earlier in the day. I’d disagreed strongly with one of the presenters and written a series of critiques on Twitter, which many in my community found pithy and retweeted. Let me say: I absolutely believed in what I wrote, and I don’t retract the ideas. But in the Twitter exchanges around those posts, some of the conversation got more personal. The presenter—a fellow academic and human being named Elaine Treharne, not some nameless person—read those exchanges after the panel and was deeply hurt by them. She was right. I was wrong. I tweeted an apology, but the entire affair, coupled with Kathleen’s post, kept working on me. I ended up chatting with Elaine for several hours yesterday evening about electronic fora, professionalism, and valid critique through channels such as Twitter. I think we both learned quite a bit; I know I learned quite a bit. We still don’t entirely agree on the substantive points from her presentation, but I hope we’re now friends as well as colleagues. She agreed to let me use her name in this post.

After yesterday’s experiences and conversations, I spent the evening considering my tweets over the past several conferences I’ve attended, including in the much-ballyhooed “Dark Side of DH” panel at MLA in Boston. Kathleen is absolutely right: our field needs to seriously consider both how our current Twitter culture developed, and how it might need to change moving forward. I need to seriously consider how I engage with colleagues on Twitter; I am not blameless and I need to reform. This post is my attempt to start thinking through both how the current Twitter culture came to be and how we might change it. The post owes any of its insights to Elaine’s generous willingness to talk seriously with me about these issues after being flamed by my community on Twitter.

Only a few years ago, DH was still a fringe field, mostly ignored by academia more widely. DHers felt not like “the next big thing,” but like an embattled minority. The community was very small, and the worry at conferences was about how to convince our colleagues that what we did was valuable. How can we get hired; how can we get promoted? How can we persuade the field to pay attention to this work we find remarkable? DHers were overrepresented in online fora such as Twitter, though, which became a place to build support communities for DH scholars who felt isolated on their campuses and within the wider academic community.

Within that framework, the back-chatter on Twitter was a valuable support mechanism. I remember sitting in a conference panel in my disciplinary field—nineteenth-century American literature—a few years ago when an eminent professor described the utter vapidity of modern reading practices (uncharitably: “kids these days with their screens! and their ADD!”) compared to those of 150 years ago. Around the room, heads were nodding vigorously, and in the Q&A many other prominent members of my field rose to concur.

In that room, I felt like the oddball. My intellectual interests were being dismissed out of hand by the very people likely to decide whether my work would be published (and thus, whether I would get a job, get tenure, &c., &c.). I disagreed with them vehemently, but as a junior scholar was hesitant to challenge the rising consensus in the room, for fear that would further isolate me. And so I turned to Twitter to remind myself that I did have a community who would welcome my ideas on these issues. I tweeted my frustrations—I conferred with my dispersed but friendly DH community—and found support and engagement. Perhaps this doesn’t excuse public snarkiness, but that snark was a way of building community—certifying the value of unpopular interests and opinions. None of the eminent panelists from that session I attended read those conversations, nor would have. Nobody got hurt, and I felt less embattled and more prepared to go on with my work.

But that was several years ago, when I had far fewer followers on Twitter, and when DH was not at the center of the academy’s attention. Today many more academics, including those not heavily involved in DH, are on Twitter. And rather than being a nearly ignored, fringe element of the academy, prominent DHers are being looked to as gatekeepers into a much-desired field. Panelists know to investigate how their sessions were tweeted, and they care what was said about them online. What’s more, many of our colleagues now know how to find tweets about them even when those tweets don’t include their names or usernames. We cannot assume that anonymous tweeting will do no harm to the colleagues we discuss. Tweets are not semi-private, whispered conversations in the back of the conference room; our tweets are very public and could unfairly shape public perception of the colleagues we discuss in them.

Within this framework, the same kind of Twitter chatter that helped build DH communities only a few years ago can resonate with newcomers to the field precisely as that vigorous denunciation of “technology” resonated for me as a young nineteenth-century Americanist. In other words, Twitter chatter can easily read not as community building, but as insider dismissal and exclusion. Such exchanges belie claims that DH is an open field, instead alienating scholars attempting to engage with it. We are no longer the upstarts; we are increasingly seen as the establishment. While this perception doesn’t exactly line up with reality, it certainly shapes the way our Twitter conversations—and in turn the wider DH field—are perceived by newcomers to it. In Elaine’s case, she felt she was being dismissed out of hand by scholars whose work she knows and respects; we had convinced her that she didn’t belong in DH. This is a terrible outcome our field should be wary of replicating.

Nevertheless, I remain firmly convinced that Twitter conversations can supplement and enrich academic conferences, providing a record of their proceedings, allowing scholars to engage actively with their presenting colleagues, and providing access to conferences to those scholars who cannot attend. But as a community, we need to think hard about how to retain the value of conference tweeting while mitigating its alienating effects on our colleagues. This does not mean, I think, refraining from any critique on Twitter, but it will mean remembering when crafting those critiques that there are real people on the receiving end.

Principles of Conference Tweeting

Going forward, I’m going to try and tweet conference panels following these principles.

  1. I will post praise generously, sharing what I find interesting about presentations.
  2. Likewise, I will share pertinent links to people and projects, in order to bring attention to my colleagues’ work.
  3. When posting questions or critiques, I will include the panelist’s username (an @ mention) whenever possible.
  4. If the panelist does not have a username—or if I cannot find it—I will do my best to alert them when I post questions or critiques, rather than leaving them to discover those engagements independently.
  5. I will not post questions to Twitter that I would not ask in the panel Q&A.
  6. I will not use a tone on Twitter that I would not use when speaking to the scholar in person.
  7. I will avoid “crosstalk”—joking exchanges only tangentially related to the talk—unless the presenter is explicitly involved in the chatter.
  8. I will refuse to post or engage with posts that comment on the presenter’s person, rather than the presenter’s ideas.

I am not calling for an embargo on conference tweeting, or for engagements exclusively devoted to agreement or confirmation. To turn conference tweeting into a tepid, timid echo chamber would not serve DH or the wider academy. But as the DH field grows and newcomers attempt to engage with it, we must consider the effect our chatter might have on them. I don’t want to make newcomers to DH feel as isolated as I felt in that room of eminent Americanists. Changing my public presentation on Twitter seems a small concession—worth making—if it will prevent that happening.

Thanks to Flickr users digitalART2, exquisitur, and brx0 for the Creative-Commons photos embedded here.