
Advanced Digital Editing April 2022

Gabriel Bodard edited this page Jun 9, 2023 · 39 revisions

Advanced Digital Editing: a 3-day short course

Instructors: Dr Gabriel Bodard and Dr Christopher Ohge

26, 27 and 29 April 2022, 15:30–17:00 (UK time)

Have you taken an introduction to the Text Encoding Initiative (TEI), but are unsure of what to do next? Have you started a digital edition, but want to learn more about customisation, querying, and transforming your TEI into publishing formats? Taught by two scholars with years of practical editing experience, this advanced digital scholarly editing module can help take your digital editing further. Through a mix of asynchronous materials and hands-on online workshops, it shows how to implement advanced computational methods. After a brief refresher on TEI encoding, students will focus on XPath, XSLT, and publishing tools, the building blocks of querying and transforming TEI data.

Please note: Students in this module must have a basic understanding of TEI-XML (for example, previous TEI and digital editing short courses at SAS, EpiDoc workshops, and other DH workshops).

Software

Files for practice (download and extract to new folder)

  1. Sample XML files
  2. XSLT and XML bundle

Course format

This short course combines asynchronous and synchronous (live) sessions, all hosted virtually. You must watch the relevant video tutorials and practise the exercises before each live session, at which we will review the material, answer questions, and discuss any other issues that arise. The Zoom sessions will be workshops, not lectures.

Schedule

Before the workshop:

If you need to refresh or revise your TEI / EpiDoc XML knowledge you can find tutorials and other training materials at:

The following tutorials on TEI schema and ODD customisation are optional, but we include them because customisation is an important aspect of advanced editing, and it is a good way to get you thinking about TEI data models:

Day 1: Tuesday April 26, 2022

15:30–17:00 (UK time): live Zoom session: introductions, discussion, TEI refresher exercise.

To watch and practice before session 2:

Notes from the first session

Day 2: Wednesday April 27, 2022

15:30–17:00 (UK time): live Zoom session: exercises and questions on XPath

Exercises Part I:

Aims of XPath Exercise

  • Gain familiarity with traversing and searching your XML tree
  • Understand the basic syntax for XPath functions
  • Understand how to generate statistics about your document using XPath

(Use the bad-hamlet.xml file)

  1. Path expressions
  • Write an absolute path that finds all speech elements
  • Write an absolute path that finds all role attribute values in the cast list
  • Write a relative path that finds all speaker elements
  • Write a relative path that finds all who attributes
  2. Axis expressions
  • Write an axis expression that finds all sibling elements of second-level divs
  • Write an axis expression that finds all parent nodes of stage elements
  • Write an axis expression that finds all speeches that come before or after a Hamlet speech
  3. Predicates
  • Find all speech elements for all speakers except Hamlet
  • Find all speech elements by Hamlet and Ophelia
  • Find the last line of each speech by Ophelia
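As a rough illustration of how paths, the parent step, and predicates behave, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`, which supports only a small subset of XPath. The sample document is hypothetical: it is not bad-hamlet.xml, it omits the TEI namespace for brevity, and its element nesting is an assumption. The XPath expressions in the comments are approximate equivalents, not the course's answer key.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free sample; the real bad-hamlet.xml will differ.
SAMPLE = """<TEI>
  <text>
    <body>
      <div type="act" n="1">
        <div type="scene" n="1">
          <sp who="#Hamlet"><speaker>Hamlet</speaker><l>Hamlet line one</l><l>Hamlet line two</l></sp>
          <sp who="#Ophelia"><speaker>Ophelia</speaker><l>Ophelia line one</l><l>Ophelia line two</l></sp>
          <stage>Exit Ophelia</stage>
          <sp who="#Horatio"><speaker>Horatio</speaker><l>Horatio line one</l></sp>
        </div>
      </div>
    </body>
  </text>
</TEI>"""

root = ET.fromstring(SAMPLE)

# Path spelled out step by step from the root element
# (roughly XPath: /TEI/text/body/div/div/sp)
all_speeches = root.findall("./text/body/div/div/sp")

# Descendant search at any depth (roughly XPath: //speaker)
speakers = [s.text for s in root.findall(".//speaker")]

# Parent step (roughly XPath: //stage/parent::*)
stage_parents = root.findall(".//stage/..")

# Attribute predicate (roughly XPath: //sp[@who='#Hamlet'])
hamlet_speeches = root.findall(".//sp[@who='#Hamlet']")

# Negation done in Python here (roughly XPath: //sp[@who!='#Hamlet'])
not_hamlet = [sp for sp in all_speeches if sp.get("who") != "#Hamlet"]

# Positional predicate (roughly XPath: //sp[@who='#Ophelia']/l[last()])
ophelia_last = [l.text for l in root.findall(".//sp[@who='#Ophelia']/l[last()]")]

print(len(all_speeches), speakers, ophelia_last)
```

ElementTree has no sibling, preceding, or following axes, so the axis exercises still need a full XPath processor such as the one in Oxygen.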

Exercises Part II:

  1. Functions
  • Write a function to count speakers
  • Write a function to list only the distinct speakers (that is, each speaker's name once)
  • Write a function to return all lines in speeches that contain the string ‘Hamlet’ (except speaker elements)
  • Find the string length of each of Hamlet’s speeches.
  • Write a function to return all first lines of speeches greater than 100 characters
  • Calculate the average character count of Hamlet’s speeches.
  • List the distinct values of each speaker (i.e. list of characters) in Act 1
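To make the idea of these functions concrete, the following Python sketch mimics what `count()`, `distinct-values()`, `contains()`, `string-length()`, and `avg()` compute. It is an analogy under stated assumptions, not the XPath answers: the sample is hypothetical, namespace-free, and far smaller than bad-hamlet.xml, and the XPath expressions in the comments are approximate.

```python
import xml.etree.ElementTree as ET
from statistics import mean

# Hypothetical sample; each <sp> is on one line so no stray whitespace is counted.
SAMPLE = """<body>
<sp who="#Hamlet"><speaker>Hamlet</speaker><l>I am Hamlet.</l><l>Who is there?</l></sp>
<sp who="#Ophelia"><speaker>Ophelia</speaker><l>Good day, Lord Hamlet.</l></sp>
<sp who="#Hamlet"><speaker>Hamlet</speaker><l>Well.</l></sp>
</body>"""

root = ET.fromstring(SAMPLE)


def string_value(elem):
    """Join all descendant text, like the XPath string value of an element."""
    return "".join(elem.itertext())


# Like count(//speaker)
speaker_count = len(root.findall(".//speaker"))

# Like distinct-values(//speaker); dict.fromkeys keeps first-seen order
distinct_speakers = list(dict.fromkeys(s.text for s in root.findall(".//speaker")))

# Like //sp/l[contains(., 'Hamlet')]; <speaker> elements are never selected
lines_with_hamlet = [l.text for l in root.findall(".//sp/l") if "Hamlet" in l.text]

# Like string-length(.) applied to each of Hamlet's speeches
hamlet_lengths = [len(string_value(sp)) for sp in root.findall(".//sp[@who='#Hamlet']")]

# Like avg(...) over those lengths
hamlet_average = mean(hamlet_lengths)

print(speaker_count, distinct_speakers, lines_with_hamlet, hamlet_lengths, hamlet_average)
```

Note that the string value of a speech includes the `<speaker>` text, so the lengths here count the speaker's name too. Whether that is wanted depends on how the exercise is read.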

Bonus exercises: XPath Builder

  • Use the previous expression to build a list of distinct values of each speaker separated by commas
  • Write an XPath expression to generate an alphabetised list of words spoken by Ophelia that came after ‘I’.

Exercise answers

Find the answers to the XPath exercises here.

To watch and practice before session 3:

Day 3: Friday April 29, 2022

15:30–17:00 (UK time): live Zoom session: exercises and questions on XSLT; feedback

XSLT exercise 1: “Push!”

  • Take the file Dawn-1-1-1.xml from /xml directory
  • Take the stylesheet transformer.xsl from /xslt directory
  • Create a new Oxygen transformation scenario to apply the stylesheet to the xml file
    • What do you see? Why?
    • What would you like to see?
    • What templates do you need to add to get there?
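The "push" style can be pictured as handing each node to whichever rule matches it, and falling back to a default when none does. The Python sketch below imitates that dispatch. It is a conceptual analogy only, not XSLT and not the contents of transformer.xsl; the sample markup, the `rules` dictionary, and the function names are all hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical input; the real Dawn-1-1-1.xml will look different.
SAMPLE = "<div><head>Title</head><p>Some <hi>marked</hi> text.</p></div>"


def apply_templates(elem, rules):
    """Process text and child elements in document order, like xsl:apply-templates."""
    out = [escape(elem.text or "")]
    for child in elem:
        out.append(process(child, rules))
        out.append(escape(child.tail or ""))
    return "".join(out)


def process(elem, rules):
    """Use the rule registered for this element name, or a default fallback."""
    rule = rules.get(elem.tag)
    if rule is None:
        # No matching rule: descend and keep only the text,
        # similar to XSLT's built-in template rules.
        return apply_templates(elem, rules)
    return rule(elem, rules)


# Each rule wraps the processed content of its element in HTML markup.
rules = {
    "head": lambda e, r: f"<h1>{apply_templates(e, r)}</h1>",
    "p": lambda e, r: f"<p>{apply_templates(e, r)}</p>",
}

root = ET.fromstring(SAMPLE)
no_rules = process(root, {})       # every element falls through to the default
with_rules = process(root, rules)  # <head> and <p> are matched, <hi> is not

print(no_rules)
print(with_rules)
```

With no rules registered, only the bare text comes out. Each rule you add changes the output for the elements it matches and leaves the rest to the default.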

XSLT exercise 2: “Pull!”

  • Take the xml file bad-hamlet.xml from the /xml directory
  • Starting from the transformer.xsl and cruncher.xsl stylesheets, which we used on the Dawn file, create a new stylesheet that uses for-each to loop over every <castItem>
  • Create a list of all the lines spoken by each cast member
    • Can you find unique lines?
    • Can you sort lines alphabetically?
    • Can you think of anything else useful to do with them?
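In the "pull" style, the stylesheet decides what to fetch and in what order, rather than reacting to whatever nodes it meets. The Python sketch below shows the same shape of logic: loop over each cast member, pull that member's lines, drop duplicates, and sort. It assumes a hypothetical, namespace-free sample in which each `<role>` carries an `xml:id` that the `who` attributes point to; bad-hamlet.xml may link cast members and speeches differently.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample, not the real bad-hamlet.xml.
SAMPLE = """<TEI>
  <castList>
    <castItem><role xml:id="hamlet">Hamlet</role></castItem>
    <castItem><role xml:id="ophelia">Ophelia</role></castItem>
  </castList>
  <body>
    <sp who="#hamlet"><l>b line</l><l>a line</l></sp>
    <sp who="#ophelia"><l>c line</l></sp>
    <sp who="#hamlet"><l>a line</l></sp>
  </body>
</TEI>"""

root = ET.fromstring(SAMPLE)

lines_by_role = {}
# Loop over cast members, as xsl:for-each select="//castItem" would
for item in root.iter("castItem"):
    role = item.find("role")
    ref = "#" + role.get(XML_ID)
    # Pull every line from speeches whose @who points at this role
    lines = [
        l.text
        for sp in root.iter("sp")
        if sp.get("who") == ref
        for l in sp.findall("l")
    ]
    # set() drops repeated lines; sorted() puts them in alphabetical order
    lines_by_role[role.text] = sorted(set(lines))

print(lines_by_role)
```

In XSLT, the sorting step would typically be an xsl:sort inside the for-each. The deduplication step has several possible approaches, depending on the XSLT version.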

Example solutions for exercises:

Other resources: