Friday, February 7, 2014

Victorian Women Writers Project

After searching through the list of projects on the TEI website that use the TEI encoding scheme, I came across one that I thought would be a good example to share. In 1995, Indiana University began a project titled the Victorian Women Writers Project. The goal of this project was to expose “lesser-known British women writers of the 19th century, writers whose popularity did not make the transition into the 20th century or inclusion in a literary canon”. At the start the project focused on poetry, but it later grew to include other genres such as novels, autobiographies, and lectures. Over the first few years the collection grew to include up to two hundred texts.

Here is the link: http://webapp1.dlib.indiana.edu/vwwp/welcome.do

The front page of the website states that “the project will devote time and attention to the accuracy and completeness of the texts, as well as to their bibliographical descriptions. New texts, encoded according to the Text Encoding Initiative (TEI) P5 Guidelines, will adopt principles of scholarly encoding, facilitating more sophisticated retrieval and analysis”. After exploring the website, I found it easy to see that this is exactly what they have done with this ongoing project.

The “Project Information” area of the website lists information regarding encoding and copyright. It describes the implementation process, starting with the texts being produced by transcription and originally encoded in SGML. The overview goes on to explain that the texts were encoded using TEI, specifically the P3 TEI Lite DTD (version 1.6). As TEI changed over the years, the project followed closely to keep its encoding up to date with the most current version. They also note the importance of “required manual intervention to address aspects lost in translation”, which I am assuming is an ongoing issue for them as they follow the encoding upgrades. It would have been nice to see a section devoted to the challenges they faced while going through numerous encoding upgrades, but they do not really go into detail regarding those issues. That does make sense, because their main goal is to provide access to rare and overlooked texts, not to explain the sometimes difficult task of encoding them. The end of the encoding overview also states that the project currently relies on a custom W3C schema.

The website is extremely easy to navigate, and it draws your attention to the three view options offered for each text: text mode, entire document, and XML. Users can view the XML version of every text, whatever its type, in its entirety. Although the website does not offer any in-depth information about the project's specific encoding strategies, it was really interesting to look through the different text types to get an idea of how the XML changes with different genres.
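One rough way to see how the XML changes between genres is to count which elements each document uses. The sketch below is only an illustration under stated assumptions: the two fragments are hand-written examples, not taken from the project's texts, and the function name element_counts is my own. It relies on the standard TEI namespace and on the usual TEI convention that verse is marked with line groups and lines (lg, l) while prose is marked with paragraphs (p).

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace used by TEI P5 documents.
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical, hand-written fragments for illustration only;
# they are NOT copied from the Victorian Women Writers Project.
POEM = f"""<TEI xmlns="{TEI_NS}"><text><body>
<lg type="stanza"><l>First line of verse,</l><l>second line of verse.</l></lg>
</body></text></TEI>"""

PROSE = f"""<TEI xmlns="{TEI_NS}"><text><body>
<div type="chapter"><head>Chapter I</head>
<p>A paragraph of prose.</p><p>Another paragraph.</p></div>
</body></text></TEI>"""


def element_counts(xml_text):
    """Count how often each element name occurs in an XML document."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for element in root.iter():
        # ElementTree reports namespaced tags as "{namespace}name";
        # keep only the local name after the closing brace.
        counts[element.tag.split("}", 1)[-1]] += 1
    return counts


poem_counts = element_counts(POEM)
prose_counts = element_counts(PROSE)

# Show the element names used by each fragment, most frequent first.
print("poem: ", poem_counts.most_common())
print("prose:", prose_counts.most_common())
```

Running the same count on XML saved from the site's XML view for a poem and for a novel should make the genre differences visible at a glance, although the project's own encoding choices may well differ from these toy fragments.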



1 comment:

  1. As a bit of an aside, we have a PhD student in our faculty who is expanding her master's thesis on Victorian publishing houses that published (only?) women writers. Listening to her talk about it is very interesting. I'm going to pass on this post to her.
