Ore, Christian Emil: Interlinking source text collections – a Norwegian example

Posted by GV in Friday, Paper, Program, Proposals, Systems

Norway may differ from other countries in that few charters of Norwegian origin are preserved: the complete printed Diplomatarium Norvegicum (DN, 1846 – present) comprises fewer than 24,000 documents. Many of the transcripts are based on copy books in the Vatican and other sources. A newer series, Regesta Norvegica (RN), with updated source information and synopses, has been published and has so far reached the year 1419. DN was retro-digitized 15 years ago. Both DN and RN have been converted into XML and published on the web, and they are currently being converted into TEI P5. Medieval Norwegian Text Corpus (Menotec) is a project that aims to establish a corpus of 1.5 million words of transcriptions compliant with Menota (www.menota.org), including a treebank of 0.5 million words. 20% of the corpus is based on charter texts. Since DN is not at the level of a modern text edition, we use texts from a collection of high-quality transcriptions (CT) made in 1970-1990. All charters in CT are described in the Regesta Norvegica series, which makes it possible to interlink the texts. The current web version of DN and RN is based on ad hoc markup. To give open access to the material, including the text corpus, it is necessary to use well-defined formats such as TEI and Menota. The metadata model, both for the content and for the text versions, is compliant with FRBR(oo) and CIDOC-CRM and is thus easy to convert into RDF. This opens the way for interlinking with other source collections, for example via Linked Open Data.
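
As a rough illustration of the interlinking idea, the following minimal sketch emits N-Triples that connect one regest entry to several text versions of the same charter. It is not the project's actual data model: the base URI, the identifier scheme and the function names are hypothetical, and the choice of the CIDOC-CRM terms E31 Document, E33 Linguistic Object and P67 refers to is only an assumption about how such links might be expressed.

```python
# Hypothetical base URI and identifiers; not taken from the DN/RN web editions.
BASE = "http://example.org/charters/"
CRM = "http://www.cidoc-crm.org/cidoc-crm/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def triple(subject, predicate, obj):
    """Format one N-Triples statement whose three terms are all URIs."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def link_charter(regest_id, text_ids):
    """Return N-Triples linking one regest entry to its text versions.

    The CIDOC-CRM classes and property used here are an illustrative
    assumption, not the mapping used by the project.
    """
    regest = BASE + "rn/" + regest_id
    lines = [triple(regest, RDF_TYPE, CRM + "E31_Document")]
    for text_id in text_ids:
        text = BASE + text_id
        lines.append(triple(text, RDF_TYPE, CRM + "E33_Linguistic_Object"))
        lines.append(triple(regest, CRM + "P67_refers_to", text))
    return lines


if __name__ == "__main__":
    # One regest pointing to a printed-edition text and a newer transcription.
    for line in link_charter("vol1-no42", ["dn/1-42", "ct/0042"]):
        print(line)
```

Statements of this kind could then be loaded into any RDF store and matched against identifiers published by other source collections.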

Christian-Emil Ore,
Associate Professor
University of Oslo
eMail: c.e.s.ore@iln.uio.no
