Working with Beyond 2022: a plugin to mark up regnal dates in TEI/XML

I’ve been lucky enough to have the opportunity to collaborate with the Beyond 2022 project. I’ve been helping historians and archivists on the project develop a TEI/XML schema that will be used to encode a modern English translation of original Latin documents. These documents include receipt rolls from the Irish Exchequer, something I’m familiar with after working with Prof. Brendan Smith at the University of Bristol. Part of the encoding includes marking up dates in the documents in a format that can be read by computers, and this blog post describes how I wrote a plugin for the Oxygen XML editor in collaboration with Dr Elizabeth Biggs, a researcher at Trinity College, Dublin, and The National Archives (UK), to make this process easier and less error-prone. It is a nice example of how a Research Software Engineer can make a positive contribution to a Digital Humanities project.

The problem: dates are hard!

In official records, such as the Irish Exchequer receipt rolls, dates are given as regnal years, meaning a date based on the current year of the English monarch’s reign. For example, 5 Henry VI means the fifth year of Henry VI’s reign, which runs from 1 September 1426 (Henry’s accession to the throne being on 1 September 1422) until 31 August 1427. Thus, a date of 29 September 5 Henry VI would be 29 September 1426. Of course, it is more complicated than that, since a date would rarely appear as 29 September, but rather be based around a feast day which, in this case, is Michaelmas. Some feast days like Michaelmas are fixed, meaning they always occur on the same date. So, Michaelmas 5 Henry VI and Michaelmas 6 Henry VI fall on 29 September 1426 and 29 September 1427, respectively.

However, Easter, and feast days anchored to Easter, are movable, i.e., they are on different dates each year. The calculation of Easter, called a computus, is complicated but often described as occurring on the first Sunday after the first full moon after the spring equinox (21 March). Thus, Easter 4 Henry VI occurred on 31 March 1426, while Easter 5 Henry VI occurred on 20 April 1427. Some dates of feast days are linked to Easter: Ash Wednesday occurs 46 days before Easter and marks the start of Lent, Quinquagesima occurs on the Sunday before Lent, and Rogation Sunday occurs on the fifth Sunday after Easter.
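For readers curious about the computus itself, here is a minimal sketch of a well-known arithmetic method for finding Easter in the Julian calendar, the calendar in use when these records were written. This is not how the plugin works (it uses lookup tables, described below), and the function name julian_easter is purely illustrative.

```python
def julian_easter(year: int) -> tuple[int, int]:
    """Return (month, day) of Easter Sunday in the Julian calendar."""
    a = year % 4
    b = year % 7
    c = year % 19                      # position in the 19-year lunar cycle
    d = (19 * c + 15) % 30             # days from 21 March to the paschal full moon
    e = (2 * a + 4 * b - d + 34) % 7   # days from that full moon to the next Sunday
    month, day = divmod(d + e + 114, 31)
    return month, day + 1


# The Easter dates quoted in this post
print(julian_easter(1426))  # (3, 31)
print(julian_easter(1427))  # (4, 20)
```

A calculation like this is a handy cross-check on a hand-compiled table of Easter dates.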

To make things even more interesting, the date might be described as the octave or quindene of a feast day, being the eighth and fifteenth day after a feast, respectively, though because the feast day itself is counted as the first day this essentially means a week or a fortnight later in modern terms. In addition, ‘eve’ and ‘morrow’ might be added to indicate the day before or day after.
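The inclusive counting is easy to get wrong, so here is a small sketch of how the modifiers could be turned into day offsets. The names MODIFIER_OFFSETS and apply_modifier are hypothetical, not taken from the plugin. Note also that Python's datetime uses the proleptic Gregorian calendar, so this shortcut is only safe when the shift does not cross the end of February in 1300, 1400 or 1500, which are leap years in the Julian calendar only.

```python
from datetime import date, timedelta
from typing import Optional

# The feast day itself counts as day one, so the octave (eighth day)
# is 7 days later and the quindene (fifteenth day) is 14 days later.
MODIFIER_OFFSETS = {"eve": -1, "morrow": 1, "octave": 7, "quindene": 14}


def apply_modifier(feast: date, modifier: Optional[str]) -> date:
    """Shift a feast date by the offset implied by the modifier, if any."""
    return feast + timedelta(days=MODIFIER_OFFSETS.get(modifier, 0))


easter_1329 = date(1329, 4, 23)
print(apply_modifier(easter_1329, "quindene"))  # 1329-05-07
```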

The classic reference book for working out historical dates is C. R. Cheney’s Handbook of Dates for Students of English History. First published in 1945 and now in its second edition, revised by Michael Jones, it has remained in print ever since. The book provides a list of regnal dates for each monarch and a table for each calendar year that can be cross-referenced for the date of Easter and other major liturgical feasts. There is also a comprehensive list of feast days.

In the TEI/XML we want the date in the YYYY-MM-DD format, with appropriate markup around the date given. For example, for “quindene of Easter 3 Edward III” we want the following markup:

<date type="regnal" when="1329-05-07">quindene of Easter <supplied>1329</supplied> 3 Edward III</date>

So, the modernised date is given in the @when attribute. We keep the original text but add the year (1329) in a <supplied> tag to indicate that it is an editorial addition.
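As a rough illustration of how such an element could be assembled once the date is known, here is a short sketch. The function regnal_date_element and its parameters are hypothetical and only reproduce the markup pattern shown above; the real plugin builds the element inside Oxygen.

```python
from xml.sax.saxutils import escape, quoteattr


def regnal_date_element(prefix: str, regnal_year: str, year: int, iso_date: str) -> str:
    """Build a TEI <date> element with the calendar year supplied editorially."""
    return (
        f'<date type="regnal" when={quoteattr(iso_date)}>'
        f"{escape(prefix)} <supplied>{year}</supplied> {escape(regnal_year)}</date>"
    )


print(regnal_date_element("quindene of Easter", "3 Edward III", 1329, "1329-05-07"))
```

Escaping the text keeps the output well-formed even if the source text happens to contain characters such as an ampersand.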

Looking up each date and adding this markup, on top of translating the original Latin into English, was going to be time-consuming and error-prone. We agreed that it would be worth spending some time developing a plugin for the Oxygen XML editor to see if it could improve the accuracy and speed of adding dates to the TEI/XML documents.

The solution

I developed a Java library called RegnalDate that can parse a string of text to look for a regnal date and return an object that represents that date. I also wrote an Oxygen XML plugin called RegnalDatePlugin that sends text to RegnalDate and then takes the object returned by the library and creates a TEI/XML <date/> element in the document being edited.

The plugin I developed uses several lookup tables internally. For the regnal years, we are only dealing with Henry III to Henry VII, the principal monarchs that the Gold Seam will be dealing with. Since I knew the start date of each monarch’s reign, I wrote a simple Python script that created a lookup table with many rows that look like this:


For example, the row for 1 Henry III says that it runs from 28 October 1216 to 27 October 1217.
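The original script is not reproduced here, but the idea can be sketched in a few lines. This version assumes regnal years simply run from the anniversary of the accession to the day before the next anniversary; the function name regnal_years is hypothetical, and the sketch ignores the fact that a monarch's final regnal year is cut short by their death.

```python
from datetime import date, timedelta


def regnal_years(accession: date, count: int):
    """Yield (regnal_year, start, end) for the first `count` years of a reign."""
    for n in range(1, count + 1):
        start = accession.replace(year=accession.year + n - 1)
        # The regnal year ends the day before the next anniversary
        end = accession.replace(year=accession.year + n) - timedelta(days=1)
        yield n, start, end


# Henry III's regnal years are dated from 28 October 1216
for n, start, end in regnal_years(date(1216, 10, 28), 2):
    print(n, start.isoformat(), end.isoformat())
```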

For the feast days most likely referenced by the Irish Exchequer, Elizabeth Biggs sent me a spreadsheet of fixed feast days as they will appear, e.g., ‘All Saints’, and the date they occur.

Screenshot of an Excel spreadsheet with feast days

Another Python script processed the spreadsheet to create a new lookup table with entries that look like this:

All Saints:11-01

Here I know that All Saints occurs on 1 November each year.
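Because a regnal year straddles two calendar years, a fixed feast still has to be placed in the right one. The following sketch shows one way to read an entry in the name:MM-DD format above and find the calendar date that falls inside a given regnal year. The function names are hypothetical; Michaelmas on 29 September and the 5 Henry VI range are taken from earlier in this post.

```python
from datetime import date


def parse_feast_line(line: str) -> tuple[str, int, int]:
    """Split an entry such as 'All Saints:11-01' into (name, month, day)."""
    name, month_day = line.rsplit(":", 1)
    month, day = month_day.split("-")
    return name.strip(), int(month), int(day)


def fixed_feast_in_regnal_year(month: int, day: int, start: date, end: date) -> date:
    """Return the occurrence of a fixed feast that lies within a regnal year."""
    for year in (start.year, end.year):
        candidate = date(year, month, day)
        if start <= candidate <= end:
            return candidate
    raise ValueError("feast does not fall within the regnal year")


print(parse_feast_line("All Saints:11-01"))  # ('All Saints', 11, 1)
# Michaelmas (29 September) in 5 Henry VI
print(fixed_feast_in_regnal_year(9, 29, date(1426, 9, 1), date(1427, 8, 31)))  # 1426-09-29
```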

Elizabeth also sent me a spreadsheet listing each date between 22 March and 25 April and the years in which Easter fell on that date.

Excel spreadsheet of Easter dates and years

And she also gave a list of movable feasts used by the Exchequer and how many days before or after Easter they occurred.

Excel spreadsheet of movable feasts

These were again processed by a Python script to create a third lookup table that gives, for each year between 1216 and 1509 (the calendar years that cover our regnal year range), the date of Easter and of the movable feasts. For example:


In 1216, Easter occurred on 10 April, while Quinquagesima (Sunday before Lent) was on 21 February, Shrove Tuesday was on 23 February, etc.
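The offsets implied by the 1216 example can be applied with a little calendar arithmetic. The sketch below is an assumption about how such a table could be generated, not the original script, and all names in it are hypothetical. It steps through the Julian calendar by hand rather than using Python's datetime, because 1300, 1400 and 1500 are leap years in the Julian calendar but not in the proleptic Gregorian calendar that datetime uses.

```python
# Days relative to Easter, as implied by the dates quoted in this post
MOVABLE_OFFSETS = {"Quinquagesima": -49, "Shrove Tuesday": -47, "Ash Wednesday": -46}


def days_in_month(year: int, month: int) -> int:
    if month == 2:
        return 29 if year % 4 == 0 else 28  # Julian leap-year rule
    return 30 if month in (4, 6, 9, 11) else 31


def shift(year: int, month: int, day: int, offset: int) -> tuple[int, int, int]:
    """Move a Julian-calendar date by `offset` days, one day at a time."""
    step = 1 if offset > 0 else -1
    for _ in range(abs(offset)):
        day += step
        if day < 1:
            month -= 1
            if month < 1:
                month, year = 12, year - 1
            day = days_in_month(year, month)
        elif day > days_in_month(year, month):
            day, month = 1, month + 1
            if month > 12:
                month, year = 1, year + 1
    return year, month, day


def movable_feasts(year: int, easter_month: int, easter_day: int) -> dict[str, str]:
    """Return each movable feast as an MM-DD string for the given Easter."""
    feasts = {}
    for name, offset in MOVABLE_OFFSETS.items():
        _, month, day = shift(year, easter_month, easter_day, offset)
        feasts[name] = f"{month:02d}-{day:02d}"
    return feasts


print(movable_feasts(1216, 4, 10))
```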

RegnalDate takes a piece of text, e.g., ‘quindene of Easter 3 Edward III’, and uses regular expressions (pattern matching) to break it down into its component parts so that we can use the lookup tables described earlier. The three things we are interested in here are the regnal year (3 Edward III), the feast (Easter) and the modifier (quindene). From these we can work out the date of Easter, 23 April 1329, and then count forward to the fifteenth day for the quindene (Easter itself being day one, so 14 days later), which gives us 7 May 1329.
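To give a flavour of the pattern matching, here is a minimal sketch in Python. RegnalDate itself is written in Java and its actual patterns are not shown in this post, so the expression and the names REGNAL_DATE and parse_regnal_date below are assumptions made for illustration only.

```python
import re

# Optional modifier, then the feast, then the regnal year number and monarch.
REGNAL_DATE = re.compile(
    r"^(?:(?P<modifier>eve|morrow|octave|quindene)\s+of\s+)?"
    r"(?P<feast>.+?)\s+"
    r"(?P<year>\d+)\s+"
    r"(?P<monarch>(?:Henry|Edward|Richard)\s+[IVX]+)$",
    re.IGNORECASE,
)


def parse_regnal_date(text: str):
    """Return the component parts of a regnal date, or None if it does not match."""
    match = REGNAL_DATE.match(text.strip())
    if match is None:
        return None
    parts = match.groupdict()
    parts["year"] = int(parts["year"])
    return parts


print(parse_regnal_date("quindene of Easter 3 Edward III"))
# {'modifier': 'quindene', 'feast': 'Easter', 'year': 3, 'monarch': 'Edward III'}
```

Each part can then be looked up separately: the monarch and year give the regnal range, the feast gives a date within it, and the modifier gives the final offset.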

In the Oxygen XML Editor, the editor of a document just needs to highlight the text, hit the command CTRL+SHIFT+R, and the RegnalDatePlugin will use RegnalDate to parse the text and supply the TEI/XML needed.

Here are some screenshots of text before and after the plugin was used.


Example of a date without TEI/XML markup


Example of a date with TEI/XML markup applied by the plugin

To ensure that both the lookup tables and parsing process are correct, I have a series of tests that check that I’m getting the expected results for different feast days and regnal years. So that I wasn’t marking my own homework, Elizabeth also sent me a number of regnal dates with their corresponding calendar date, and my code correctly parsed these as well.

Some thoughts

This project represents a nice collaboration between a Research Software Engineer, historians and archivists, providing a tool to improve the quality and efficiency of an editing process. After a recent meeting with the Beyond 2022 team, I was pleased to hear that Elizabeth had been successfully using the plugin. The plugin was written relatively quickly, and after re-reviewing the code, it could do with some refactoring to make it easier to read and more efficient, a process that can be done with some confidence thanks to the test suite. It could also be updated to include other English monarchs, although I have reservations about King John, whose regnal years start on Ascension Day … a movable feast!

~ Mike Jones