Empires of Information: Text Mining Behind a Pay Wall

15-20 Minute Paper

Timothy Bristow and Jim Clifford
York University

Biographies
Timothy Bristow was appointed Digital Humanities Librarian at York University’s Scott Library in 2011. As his title suggests, Timothy is principally concerned with the development of library systems and practices to facilitate new modes of digital scholarship. Timothy also serves as the Chair of York University Libraries’ Scholarly Communication Group.

Jim Clifford is a postdoctoral fellow working with Dr. Colin Coates on a collaborative research project, Trading Consequences, which is funded by a Digging into Data grant. Early in 2011, Jim completed his PhD in the History Department at York University in Toronto.

Abstract
Text mining offers the promise of a macroscopic approach to historical inquiry. With the support of the Digging into Data Challenge and other programs, a growing number of historians are assembling and mining large textual corpora in an effort to identify novel patterns and explore new hypotheses. Our project, entitled Trading Consequences (http://tradingconsequences.blogs.edina.ac.uk/), uses text mining to investigate the environmental and economic impact of global commodity trading in the British world of the nineteenth century. However, in attempting to apply digital methods to explore imperial commodity circulation, the Trading Consequences team found its efforts restricted by the terms of the academic publishing industry and its control over the circulation of textual data.

Vendors and their proprietary systems present both explicit legal barriers and implicit technological barriers to the effective use of the underlying data for the purposes of text mining and other emerging modes of digital inquiry. If libraries wish to support work in the digital humanities and better articulate their role within the field, these barriers must be removed. After negotiating the widespread shift to digital resources over the past twenty years, librarians must now help to negotiate their effective use for digital scholarship.