Though its preliminary successes got a warm reception at the CSECS conference last October on an intimate DH panel (with Alison Muri, coordinator of the Grub Street Project), the Cross-Hopkins Diary Project is still very much a hatchling. I dove into TEI encoding with zero prior knowledge, and have really been learning as I go.

The ambitious goal of the project is a database that lets users both close-read and ‘distant-read’ the records: we want the tags anchored to locations on the image, but also a flexible database of which plays were performed on which day and how much cash each show brought in to the theatre (so that exciting statistical analysis can be performed). Marginalia are a large part of the Diary’s neato-factor, so it would be great to have the complete notes tagged intelligently, too: people, places, organizations, and dates.

To give you an idea of what I’ve been doing so far, here’s a snippet of an image of the manuscript with its accompanying XML:

 

<row xml:id="r49">
  <cell role="production">49</cell>
  <cell role="date"><date when="1747-11-27">Fry 27</date></cell>
  <cell role="show"><title ref="#VEN">Venice preserv'd</title> + <title ref="#LOT">Lottery</title></cell>
  <cell role="take"><measure type="currency" unit="pounds" n="150">150</measure></cell>
</row>
<row xml:id="r50">
  <cell role="production">50</cell>
  <cell role="date"><date when="1747-11-28">Sat 28</date></cell>
  <cell role="show"><title ref="#PRW">P: Wife</title> + <title ref="#LOT">D<hi rend="superscript">o</hi></title></cell>
  <cell role="take"><measure type="currency" unit="pounds" n="170">170</measure></cell>
</row>
<row xml:id="r51">
  <cell role="production">51</cell>
  <cell role="date"><date when="1747-11-30">Mon 30</date></cell>
  <cell role="show"><title ref="#ORP">Orphan</title> + <title ref="#ANA">Anat</title></cell>
  <cell role="take"><measure type="currency" unit="pounds" n="100">100</measure></cell>
</row>
<add place="inline">
  <milestone unit="month"/>Dec<hi rend="superscript">r</hi>.</add>
<row xml:id="r52">
  <cell role="production">52</cell>
  <cell role="date"><date when="1747-12-01">Tus 1<hi rend="superscript">st</hi>:</date></cell>
  <cell role="show"><del><title ref="#ORP">Orphan</title> + <title ref="#ANA">Anatomist</title></del>
    <add place="below"><title ref="#STR">Stratagem</title> + <title ref="#LOT">Lottery</title></add></cell>
  <cell role="take"><measure type="currency" unit="pounds" n="120">120</measure></cell>
</row>
<note place="opposite" type="aud"><name ref="#PRN" type="person" role="royalty">Prince</name> + <name ref="#PRS" type="person" role="royalty">P.</name></note>
<row xml:id="r53">
  <cell role="production">53</cell>
  <cell role="date"><date when="1747-12-02">Wed 2<hi rend="superscript">d</hi>.</date></cell>
  <cell role="show"><del><title ref="#STR">Stratagem</title> + <title ref="#LOT">Lottery</title></del>
    <add place="below"><title ref="#REC">Recr: Officer</title> + <rs type="ent">Dancing</rs></add></cell>
  <cell role="take"><measure type="currency" unit="pounds" n="120">120</measure></cell>
</row>
<pb/>

It’s definitely not perfect, and I’m not even sure it’s totally TEI-adherent! I’ve adapted some tags to suit my own purposes until I find out how to express what I really need (I’ve been using <milestone/>, for example, to identify holidays and months). The structure also doesn’t really give the reader a sense of the layout of the manuscript. The “note” that mentions the Prince and Princess’s attendance is a good example: it sits on the opposite page in the diary but refers to the December 2nd performance (I know XSLT is my friend, but we’re still unacquainted). It’s far from anchored to the facsimile, anyway. Other problems include my dodgy uses of the ‘del’ and ‘add’ elements, and my somewhat blind use of the ‘ref’ attribute.

Anyway, it’s a work in progress. From my code so far I’ve created an OpenOffice spreadsheet of the tabular data, and from that I’ve been able to extract some pretty interesting statistical findings (which my supervisor and I presented at CSECS). All this to say that I’m looking for a hardier option (Scripto or Scribe may fit the bill).
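
For anyone curious how the rows could be turned into spreadsheet-ready data with a script rather than by hand, here is a minimal Python sketch. It is not how the spreadsheet was actually built; it rests on a few assumptions: the rows are wrapped in a `<table>` root so the fragment parses, no TEI namespace is declared, the `n` attribute on `<measure>` holds the night’s take, and titles inside `<del>` were not performed. The function names (`performed_refs`, `rows_to_records`, `to_csv`) are made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# ElementTree expands the built-in xml: prefix to this namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Two rows copied from the snippet above, wrapped in a <table> root
# (an assumption) so that the fragment parses on its own.
SAMPLE = """<table>
<row xml:id="r51">
  <cell role="production">51</cell>
  <cell role="date"><date when="1747-11-30">Mon 30</date></cell>
  <cell role="show"><title ref="#ORP">Orphan</title> + <title ref="#ANA">Anat</title></cell>
  <cell role="take"><measure type="currency" unit="pounds" n="100">100</measure></cell>
</row>
<row xml:id="r52">
  <cell role="production">52</cell>
  <cell role="date"><date when="1747-12-01">Tus 1<hi rend="superscript">st</hi>:</date></cell>
  <cell role="show"><del><title ref="#ORP">Orphan</title> + <title ref="#ANA">Anatomist</title></del>
    <add place="below"><title ref="#STR">Stratagem</title> + <title ref="#LOT">Lottery</title></add></cell>
  <cell role="take"><measure type="currency" unit="pounds" n="120">120</measure></cell>
</row>
</table>"""


def performed_refs(show_cell):
    """Return the title refs in a show cell, skipping titles struck out in <del>.

    Assumption: a deleted title means the play was not performed that night.
    """
    deleted = {id(t) for d in show_cell.iter("del") for t in d.iter("title")}
    return [
        t.get("ref", "").lstrip("#")
        for t in show_cell.iter("title")
        if id(t) not in deleted
    ]


def rows_to_records(xml_text):
    """Turn each <row> into a flat record: id, ISO date, shows, take."""
    root = ET.fromstring(xml_text)
    records = []
    for row in root.iter("row"):
        # Index the cells by their role attribute for easy lookup.
        cells = {c.get("role"): c for c in row.findall("cell")}
        records.append({
            "id": row.get(XML_ID),
            "date": cells["date"].find("date").get("when"),
            "shows": "+".join(performed_refs(cells["show"])),
            "take_pounds": int(cells["take"].find("measure").get("n")),
        })
    return records


def to_csv(records):
    """Serialise the records as CSV text that a spreadsheet can open."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["id", "date", "shows", "take_pounds"])
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


records = rows_to_records(SAMPLE)
print(to_csv(records))
```

Reading the date from `when` and the take from `n`, rather than from the transcribed text, keeps the manuscript’s spellings (“Tus”, “Fry”) out of the numbers. Entries without a `<title>`, such as the `<rs type="ent">Dancing</rs>` afterpiece, are silently skipped here and would need their own handling.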