Archive for the ‘Uncategorized’ Category

The humanities research process – what could the future look like?

Friday, May 30th, 2008

Looking at the range of interests represented so far on the blog, I also wanted to share an idea that caught my imagination. It was raised by Geoffrey Rockwell, digital humanist and TAPoR director, at last week’s New Horizons in Teaching and Research 2008 Conference at the University of Virginia. The humanities research process has made quantum leaps in widespread access through mass digitization efforts such as Google Books and the Internet Archive, and through the development of citation tools like Zotero and text analysis tools. These enabling tools have made, and are still making, a significant impact on the discovery and selection stages of the humanities research process. These, however, are discrete steps in the whole process. Geoffrey envisioned the day when there would be a comprehensive tool or suite of tools that would carry research data from the very beginning stages of search & discovery, through selection and text analysis, right through to publication. He painted a future where humanities scholars can move and relate research material through the entire research cycle, not just portions of it. What would a tool or suite of tools like that look like?

Research commons for scholars

Friday, May 30th, 2008

I’m intrigued by Chris Blanchard’s Pronetos project. At the International Center for Jefferson Studies (ICJS) at Monticello, we’ve been exploring ways to build an online community of Jefferson scholars, historians and research fellows (past, present & future), where they can identify and link up with other scholars working on similar topics relating to the life, times, and legacy of Thomas Jefferson. We think of it as an extension of the physical space and community we provide at Kenwood for scholarly exchange and discourse, and a means for fellows and scholars from all over the globe to continue conversations beyond their time at ICJS. We’d like to see a research commons emerge that incorporates collaboration, sharing of sources, critiquing of draft papers, joint development of conferences & symposia, reviews & recommendations, a repository of research papers, toolkits for historical analysis, etc.

I can definitely see the potential of creating a social network built around a specialized focus, but linking out to a wider network of Early American historians, and then also to scholars in Pronetos and other scholarly social networks.

I’d be interested to learn from folks about other F/OSS projects like Pronetos out there. What do folks think about adapting Facebook or MediaWiki to do something similar? Is there a good way to manage different discussion threads over time so there’s some coherence? How do we encourage scholars who are less comfortable with technology to participate? How do we incentivize scholars to participate, contribute content, and share research? In other words, how do we enlarge participation beyond the 20% who contribute 80% of the content? And how do we remove real and perceived barriers to participation for the remaining 80% of folks whom we want to draw in?

Development practices

Thursday, May 29th, 2008

This probably falls at least partly under the general heading of sustainability, but I would be interested in a mini-session or discussion about development practices and patterns for small teams in academic settings.

My development team was recently expanded from myself to myself and two undergraduate programmers, and we’re currently in the process of setting up a system with a few basic tools: Mercurial for distributed revision control, Trac for bug tracking, task management, and documentation, and some ad-hoc mod_rewrite sorcery so that we can easily deploy and try out our own revisions and each other’s.

I would be curious to hear about other people’s experiences in similar situations. What worked? What didn’t?

a moveable feast

Thursday, May 29th, 2008

I’m coming late to the blog party, and can’t believe what an amazing group of people we have here! Dan, can we all crash for the week?

My initial proposal to THATcamp was to set up a kind of birds-of-a-feather session on policy and management issues around open source development in higher ed — so I was glad to see that Tom is thinking more broadly, but along similar lines. (And of course all the sustainability talk fits right in here.)

My department at UVA supports and contributes to a number of open source faculty projects, and we also have a few of our own going on right now: Blacklight (which my colleague Bess may present), Fathom (a kind of showcase/social networking portal project being built, at least initially, for the digital humanities community at UVA), and a new, still nameless, web-services framework for delivering GIS data for a variety of scholarly applications. (Can’t link to the latter two yet; developers would squeal, but sneak peeks are possible.)

These are three projects coming out of the same lab, but with radically different institutional / policy-level situations regarding their open-source status. We’re in a situation where patent and IP policies designed for big pharma can squelch digital humanities development without even noticing. It’s a vexing issue at UVA — and I suspect more broadly, too. Would anybody be interested in helping me do a kind of a survey and see if we could share approaches, successes, horror stories, etc?

Some other thoughts: we’re working a lot with geospatial data in collaborations with faculty and also in figuring out how best to manage and deliver library GIS collections at UVA. I’m a geospatial neophyte (suddenly managing GIS projects) and am eager to learn from Sean and others with more experience.

The temporal is the next dimension poised to smack us in all these geo-referenced projects, and we’re keen to explore some of the special problems around representing time in humanities data. Along those lines, maybe as a part of a session on historical visualization, I’d be happy to share some experiences from the late, lamented Temporal Modelling Project I undertook with Johanna Drucker about six or seven years ago. This was an attempt to create a visual “language” for expressing the kind of inflected temporalities you see in literary and historical documents. Can you put impatience on a timeline? What about déjà vu? Foreshadowing? Regret? (Temp Mod is also an example of an abortive DH project. Why are there so many? Foreshadowing? Regret?)

On with the random notes: if somebody can re-energize me about gaming in the humanities, please do! I used to teach (and do) game development at UVA, but I think I got Ivanhoe‘d out.

Finally, count me in on visualization and aggregation — Jeanne and Laura’s conversation about federating archival data and what you do with it once it’s all there. Collex and NINES have been fruitful, but I’m ready to imagine some next steps.

Introducing Encyclopedia Virginia

Thursday, May 29th, 2008

Though I’m way late in doing so, I did want to introduce my project to the folks here before we all converge (if only hours before). I don’t think it’s covering any drastically new ground, but maybe it ties in to a number of the different conversational threads that have been going through these posts. I’m happy to demo what we’ve got this weekend, but equally happy to watch & learn from the audience.

Encyclopedia Virginia is a new project of the Virginia Foundation for the Humanities. Quite a few different state and regional encyclopedia projects have cropped up over the past decade; EV is one of the first to do so with a mandate to create entirely new entry content instead of simply publishing online a preexisting print encyclopedia.

We got charged, as I’m sure most of you have been, with creating a web project that would take advantage of the latest & greatest web technologies while also building itself for long-term sustainability (an in-depth state encyclopedia like ours could be 10 years in content development alone, so we’ve got to have technology that can nimbly adjust to changing web standards, trends, etc.).

To that end, we’re borrowing a few tricks from digital libraries and archives, and encoding our entries in TEI. In some sense it’s overkill: this content is all digitally born, so many of TEI’s capabilities w/r/t annotating archival manuscripts are lost here. Hopefully, what it empowers down the road is some interoperability between EV content and other regional encyclopedia or digital library content, and some small immunity to the changing web trends over the long course of our content development.

We’ve built a custom CMS that ingests TEI and strips out various elements into your standard MySQL database for web delivery. We perform a similar task for our media objects, creating METS records for each object which the CMS ingests and strips apart. While, again, this in some ways constitutes quite a bit of overkill, it makes more sense when we try to think about the project as both an online encyclopedia and a digital library, and we’re hoping that the flexibility and openness offered by XML will reap benefits for us down the road.
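As a rough illustration of the ingest step described above, here is a minimal Python sketch, not EV's actual CMS. It assumes a heavily simplified TEI entry (the sample document, the table names, and the function names are all hypothetical), and it uses Python's built-in SQLite as a stand-in for MySQL so the example runs without a database server.

```python
import sqlite3
import xml.etree.ElementTree as ET

# TEI P5 namespace, needed to match elements in a namespaced document.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical, heavily simplified entry; real TEI files carry far more markup.
SAMPLE_ENTRY = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample Entry</title></titleStmt>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>First paragraph of the entry.</p>
      <p>Second paragraph, mentioning <placeName>Richmond</placeName>.</p>
    </body>
  </text>
</TEI>"""


def make_db():
    """Create an in-memory database with two illustrative tables (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute(
        "CREATE TABLE paragraphs (entry_id INTEGER, position INTEGER, content TEXT)"
    )
    return conn


def ingest_entry(tei_xml, conn):
    """Pull the title and body paragraphs out of one TEI document and store them."""
    root = ET.fromstring(tei_xml)
    title = root.findtext(
        ".//tei:titleStmt/tei:title", default="", namespaces=TEI_NS
    )
    # itertext() flattens inline markup such as <placeName> into plain text.
    paragraphs = [
        "".join(p.itertext()).strip()
        for p in root.findall(".//tei:body/tei:p", TEI_NS)
    ]
    cur = conn.execute("INSERT INTO entries (title) VALUES (?)", (title,))
    entry_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO paragraphs (entry_id, position, content) VALUES (?, ?, ?)",
        [(entry_id, i, text) for i, text in enumerate(paragraphs)],
    )
    conn.commit()
    return entry_id


if __name__ == "__main__":
    db = make_db()
    new_id = ingest_entry(SAMPLE_ENTRY, db)
    query = "SELECT position, content FROM paragraphs WHERE entry_id = ?"
    for row in db.execute(query, (new_id,)):
        print(row)
```

The point of the split is that the XML file stays the archival copy while the database rows exist only for web delivery, so the tables can be rebuilt from the TEI whenever the delivery side changes.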

So, a few different things I’d love to talk about over the course of the weekend (not including all of the great things I’ve already read — my curiosity and interest are piqued!):

  1. What is our responsibility vis-a-vis creating content that is accessible with the technologies of both today and tomorrow? How do we build digital creations that can themselves be lasting archives?
  2. Are archival standards like TEI and METS appropriate for digitally-born content? Obviously, EV is doing this, but I don’t think it’s a given that it’s always the right choice.
  3. (one close to my heart) What is the responsibility of digital archives w/r/t copyright and intellectual property? I manage EV’s media objects, finding things we can use in all kinds of archives, and struggle with this every day — as I try to get as many objects as possible delivered to the public, without getting sued. What role can humanities institutions and projects play in this culture battle? I think particularly there may be some overlap with the interest in Creative Commons that was expressed earlier.

See you on Saturday!

Traveling from the northeast

Thursday, May 29th, 2008

A side-note to travelers: I heard a report on the radio that the Woodrow Wilson Bridge is going to be down to one lane this weekend. If that’s right, anyone coming in on I-95 south might want to look for non-95 routes. Anyone know more about this?

Scholarship and Digital Humanities

Thursday, May 29th, 2008

I mentioned this in my earlier post: many faculty are grappling with how to define and evaluate the quality of applied, public, collaborative, and/or digital scholarship. The digital work takes many forms, including publishing a monograph as an electronic book, developing research tools and models, blogging, building Web resources for education, and producing public projects like Mark’s Euclid Corridor Oral History Project in Cleveland. I’ve written about this a bit on Tellhistory. I would like first to learn, from those more directly involved, about broader digital humanities initiatives on this front and to discuss what more needs to be done. When departments with public history graduate programs do not recognize traditional peer-reviewed print publications about public history as scholarship, it seems like there is a lot that needs to be done to support the greater emphasis on methodology, collaboration, and organization that Tom Scheinfeldt addressed in “Sunset for Ideology, Sunrise for Methodology?”

Web mashups

Thursday, May 29th, 2008

If there is interest, I’d like to have a session on web mashups: what they are, how you can make them, and specifically, how they can be applicable to the humanities. For instance, one specific application I’ve been focused on recently is integrating tools such as Flickr and other applications into the classroom for the teaching of art history. Another application is how to turn Zotero into a mashup platform.

Beyond citation and search

Wednesday, May 28th, 2008

My original THATCamp proposal mentioned some playing around I have done under the inspiration of Bill Turkel’s mapping of libraries that hold copies of William Cronon’s Nature’s Metropolis.

I adapted Bill’s Python scripts to screen-scrape WorldCat holdings records for a number of thematically related publications, and then made a script to create some PDF maps using the GMT software (Generic Mapping Tools) rather than Google Maps. I now have a couple hundred publications in a MySQL database and have set up a small project using the Django development framework to browse the information, although I haven’t incorporated mapping into the framework. This is a personal hobby project at the moment, and I have not yet had as much time as I hoped to develop it further.

Although I could demo GMT and Django at a basic level (I’m no expert) if there is any interest in them, the larger issues I’m interested in discussing concern aggregation and visualization to support inference. Some clear connections here are Laura Mandell’s Archive Aggregator, Jeanne Kramer-Smyth’s Visualizing Aggregated Data, Tom Scheinfeldt’s Challenges to Historical Visualization, and Karin Dalziel’s Search and Digital Projects.

It’s exciting to see how much is going on with visualization and with data linking, and how swiftly the barriers to entry are dropping. We need experiments of many kinds, including free-form play, to figure out what the tools are good for. Without limiting the experiments, I’m interested in thinking about how it is that a visualization or data pattern can come to mean something and support some kind of inference. That’s different from search.

Within the area of bibliographic data of various kinds, what starts out as “metadata” (created with a particular context of search and discovery, description and access in mind) can have an additional role, provisionally, as primary data. When I do a search in a library catalog, or a timeline visualization in Zotero, that can be a means of discovery for particular items, and it may not need to be anything more than that. But it can also be a direct provocation to interpretation, if I see a pattern and if there is some historical hypothesis that can explain that pattern. How can we think more clearly about when such inferences are warranted, and what information researchers might need to better evaluate them? And what would it take to develop the standards and tools to better enable this kind of exploitation of existing data? I would like our library catalog searches not to return twenty results at a time with a “next” button, but to offer the full set of results in a single standard form to be downloaded or piped directly to whatever other tools we might have for further processing.
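To make the last wish concrete, here is a minimal Python sketch of what a "full result set as one stream" could look like from the researcher's side. Everything in it is an assumption for illustration: `fetch_page` is a hypothetical stand-in for a real catalog's paged search, the fake catalog holds 45 made-up records, and JSON Lines is just one possible "single standard form".

```python
import json
import sys

# Hypothetical stand-in for a catalog: 45 records served 20 per page.
FAKE_CATALOG = [{"id": n, "title": f"Record {n}"} for n in range(1, 46)]
PAGE_SIZE = 20


def fetch_page(query, page):
    """Hypothetical catalog call: return one page of results (empty when exhausted).

    The query is ignored here; a real implementation would send it to the catalog.
    """
    start = (page - 1) * PAGE_SIZE
    return FAKE_CATALOG[start:start + PAGE_SIZE]


def all_results(query, fetch=fetch_page):
    """Yield every record for a query by following pages until one comes back empty."""
    page = 1
    while True:
        records = fetch(query, page)
        if not records:
            return
        yield from records
        page += 1


def dump_json_lines(records, out=sys.stdout):
    """Write one JSON object per line so other tools can read the stream."""
    count = 0
    for record in records:
        out.write(json.dumps(record) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Usage (script name is illustrative): python all_results.py > results.jsonl
    dump_json_lines(all_results("nature's metropolis"))
```

Because `all_results` is a generator, the "next" button disappears from the researcher's view while the catalog can still serve pages behind the scenes, and the output can be piped straight into a mapping or counting script.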

In addition to this kind of discussion, I’m very much interested in digital civic engagement, in sustainability and project management, in the demos and tools discussions, and in the RDF-related presentations. I posted a comment to Tom’s “event standards” post that I think bridges between the point made above and my novice curiosity about RDF.

There is such a range of interesting stuff here! Enough to go all week and not run out. I’m looking forward to meeting you all.

Critical Video Editions, Timelines, Maps, and Text Mining

Wednesday, May 28th, 2008

Wow, I am just back from a long vacation and reading up on all the posts. This weekend should prove very interesting! Several posts have resonated with projects I’m designing at Alexander Street Press, a scholarly publisher of online databases in the humanities. I know I’m rather late to suggest a session, so I’ll be happy to play it by ear and demo what we are up to in relevant sessions. Here is some background before Camp:

  • Krissy’s post on oral histories and Vertov was relevant to what we are working on in video. In opera, dance, and history, we are developing Critical Video Editions that allow for a more scholarly analysis of video. The problem with most video currently online is that access is almost always at the full work level or at the small clip level with no context of the overall piece. Just as technology enables text analysis, data mining, and other advanced research with texts, we are aiming to create tools to enable that kind of study with video. In particular, we are working on ways to clip, annotate, and segment video at a more granular level as well as enable searching on the subtitles or transcript of a video. I’ll be happy to share a beta of what we are working on, and I’d love to see other ideas.
  • Several posts mention visualizing time and place. We have implemented the Simile timeline in The Gilded Age and would like to learn more about how others are using timelines. We are in the planning stages of integrating our content into Google Maps, especially with historical letters and diaries and local history images. Tom, Sean, Anna, and others all touch on aspects of this. I’m wondering how others have dealt with a large amount of information on a map (how to represent 500 letters from a single town, for example). We are also playing with the intersection of space and time (letters over time in a city, etc.). I’d love to see how you are thinking through these issues.
  • Text mining is of particular interest, especially as described in Rob’s post. We are beginning similar experiments with our nineteenth-century American documents. In particular we are looking at how the controlled subject vocabulary we’ve developed in our Civil War Letters and Diaries database can be used as training data to mine for dates, events, and people in our Illustrated Civil War Newspapers and Magazines database. We are collaborating on this with ARTFL at the University of Chicago and are just in the beginning stages.
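
On the mapping question in the list above (many letters from a single town), one possible approach is to collapse items that share a location into a single marker carrying a count and a date range. The Python sketch below is only an illustration under assumed data: the `Letter` fields, the place names, the coordinates, and the rounding precision are all hypothetical, and it does not describe how any existing product handles this.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Letter:
    """Hypothetical minimal record for a georeferenced letter."""
    title: str
    place: str
    lat: float
    lon: float
    year: int


def group_for_map(letters, precision=2):
    """Collapse letters that share (rounded) coordinates into one marker each."""
    buckets = defaultdict(list)
    for letter in letters:
        key = (round(letter.lat, precision), round(letter.lon, precision))
        buckets[key].append(letter)

    markers = []
    for (lat, lon), group in buckets.items():
        years = [item.year for item in group]
        markers.append({
            "lat": lat,
            "lon": lon,
            "place": group[0].place,
            "count": len(group),
            "first_year": min(years),
            "last_year": max(years),
        })
    # Largest clusters first, so a map can size or label them accordingly.
    return sorted(markers, key=lambda m: m["count"], reverse=True)


if __name__ == "__main__":
    # Made-up data: 500 letters from one town, 3 from another.
    letters = [
        Letter(f"Letter {i}", "Town A", 38.03, -78.48, 1861 + i % 5)
        for i in range(500)
    ]
    letters += [Letter(f"Note {i}", "Town B", 37.54, -77.44, 1863) for i in range(3)]
    for marker in group_for_map(letters):
        print(marker)
```

The year range on each marker is also a small step toward the space-and-time intersection mentioned above: a map could filter or shade markers by `first_year` and `last_year` instead of drawing every letter separately.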

My background is in instructional design, so I’m personally curious about the pedagogical implications of all our ideas. How can a technology create a way of teaching and/or learning that previously wasn’t possible? How are we advancing scholarship and learning?

I look forward to meeting you all Saturday.

 Andrea