Day of DH 2010: What I’m working on today

Today will be a bit atypical for me.  It’s spring break here at UNL and I’m not teaching classes this semester, so I will spend the day blissfully holed up in my home office.  I plan to spend most of the day working on my book, in particular a chapter on 18th- and 19th-century collected editions (as opposed to later chapters on digital editing), so my work today will seem much less digitally oriented than usual, when I’m in the midst of teaching a digital humanities course and writing about digital texts.  I am, however, leading an independent study with three fantastic undergraduates who are working on text encoding, and I want to comment on some of their XML before we meet next week.  I might work some of that in today.

I have spent the last few weeks combing through digital and print bibliographies trying to determine the earliest instances of collected editions of American authors.  This has raised a few issues about digital scholarship that I find interesting.  First, my generally excellent university library does not subscribe to the digital Evans Early American Imprints, the natural starting place to look for this kind of thing.  Because I work at a university deeply committed to digital research in the humanities, I am lucky enough to only rarely find myself on the other side of the “digital divide.”  I spent a good deal of time last week combing through the print volumes of Evans and Shaw-Shoemaker, frequently reflecting on the fact that a mere password–albeit one that costs somewhere between $20,000 and $100,000–separated me from a database that would allow me to perform this search in minutes.  (See the interesting, though now five-year-old, exchange on this topic at History Matters.)

Researching early American imprints has also made me reflect again on mass digitization projects.  A collected edition is an amorphous genre, and it is quite difficult to nail down a definition.  Collected editions are, quite simply, not something you can easily search for.  I’ll spare you the details, but my WorldCat research (an invaluable addition to Evans and Shaw-Shoemaker) has required looking for several different phrases that *may* indicate that a volume is a collected edition, usually resulting in hundreds of returns per decade (even after limiting results), of which fewer than 1% turn out to be relevant to my study.  These searches–full of false positives and faulty or redundant records–have highlighted for me the importance of old-school, scholar-led bibliography.  On the other hand, I have been thrilled to find that so many of these early editions have already been digitized by Google Books (not to mention Readex), so once I know exactly what I’m looking for, I have had the good fortune of finding many of the volumes free online.  (And so far no metadata trainwrecks like this one.)

So far, even though I’m writing a chapter about print editing, I have been constantly aware of my position at a unique historical moment in which an unprecedented amount of bibliographic information is available to me, yet much of it is out of reach because of paywalls and unreliable or incomplete data.

UPDATE

After I posted this, David Loiterstein, who works for Readex, sent me a very nice email and generously offered me free temporary access to their Early American Imprints Series while I do my research.  (Full disclosure: I wrote a piece for them a couple of years ago, but I don’t think that influenced his offer.)  We had an interesting conversation about the problem of the digital divide.  David explained quite persuasively that Readex has no motive to price their collections beyond what universities can afford–as he put it, “What would be the point?”  He said that Readex really wants to work with institutions that make a subscription to their collections a priority in order to find a way to provide access.  He thinks that if more faculty understood that, they would perhaps press harder to get their libraries to subscribe, and eventually the library and Readex could probably work something out.

I am sympathetic to David’s position.  I wish it were the case that all of our digital resources could be free to all users, but as we all know free resources are not really free and the money has to come from somewhere.  From my perspective Readex’s fees seem high, but I really have no idea what their profit margin is and whether or not these high fees are reasonable, and I do find David’s point about overpricing persuasive.  To me the important question is how to break the impasse.  I am an assistant professor in my first year at my job.  I am not inclined to badger the library to pay more than my annual salary for a resource when I don’t have the whole picture and don’t have much influence anyway.  I am, of course, also sympathetic to my library, which I imagine would counter that these resources simply cost far more than it can budget.  And when we have just received news of the university’s budget being slashed by millions of dollars, resulting in layoffs, I just can’t in good conscience presume that my research needs should be the library’s highest priority.

Of course this is a well-worn problem and I don’t have a good solution for it.  It seems worthwhile to ask around here to get a better sense of the library’s position (and I admit I have not done this yet) before assuming it’s impossible for us to ever subscribe.  I do think it’s worth bearing in mind, though, that it’s not helpful or particularly fair to vilify companies like Readex, and I say this as someone who is thoroughly committed to open-access in my own work on an archive and a digital journal.  Readex has been in business in some form since the 1940s, and as the microfilmers of the invaluable Evans and Shaw-Shoemaker materials–which most of us probably take for granted as a “free” resource at our research libraries–they are largely responsible for these materials being available to the average academic.  This is not a radical position, of course, and the Text Creation Partnership has been premised on collaboration between non-profit and for-profit digitization efforts for a decade now. Anyhow, my intent was not to rehash this issue, though it is still alive and well and quite relevant to researchers like me and to companies such as Readex.

Digital humanities is not an outdoor sport.

Albino Cave Scorpion

On that much we can all agree.

I just tested this hypothesis. It’s sunny and in the 60s here in Nebraska today, the warmest it’s been in at least four months, so I thought I’d try to work on my porch. Thirty minutes later I am blinded by the glare, possibly sunburned, and need to retreat to my darkened lair.

Clusters

Placeholder: write a post later about Wittgenstein’s “family resemblances” and clusters (as opposed to classes) as useful ways of thinking about genre and bibliographic entities.

Credential creep

Okay, shifting gears away from book-writing to an almost overdue abstract for an article I’m co-authoring with Dot Porter for Bethany Nowviskie’s collection of essays on alternative academic careers. The topic is “credential creep.” I will explain more later if I have time.

TEI hazing

I adore my independent study students, but when I assign them this as part of their first foray into TEI they probably don’t know that.

TEI training

Just rounded up some materials I use to teach TEI to send to a colleague who will send them to someone curious about TEI. I’m reflecting again on the improving but still underdeveloped pool of TEI training materials out there. I also realized how much of what I do while teaching is verbal and interactive and not reflected in my written materials.

Editing American authors

Back to piecing together a chronology of the early history of editing American authors for my book (still about 165 years away from the first digital edition of an American author!); waiting for the ball and chain (husband, preschooler) to get home.

Shutting down

I will now go outside and take advantage of the 65-degree weather before temperatures plummet and we get another 2-4 inches of snow tomorrow!