§ Summary
The primary goal of this research is to devise a "lossless" data representation for medieval chant. The target material is Western European liturgical documents that contain music notation and date from before A.D. 1550; musicologists who study the period estimate that over one million such documents survive. In the Middle Ages, chant melodies were written in a system of symbols called neumes, which differ drastically from modern music symbols in both morphology and semantics. A "lossless" data representation is one that captures the semantic content of the source material in a way that supports all anticipated uses. The representation will be encoded in the Private Use Area of the Unicode Standard (ISO/IEC 10646).
The two main categories of use for this data representation have conflicting requirements:
- that it be possible to reconstruct from the data stream a diplomatic facsimile of a source document in its native notational style for purposes of computer display, editing, and printing; and
- that the data representation be formal and abstract enough to allow comparative analysis of melodies and notational styles across many documents in a data bank.
No data representation exists today that satisfies both of these requirements.
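The tension between the two requirements can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which each neume carries both an abstract melodic contour and a label for the notational style of its source. The class `Neume`, the functions `contour_of` and `facsimile_of`, and the contour letters are all hypothetical and are not part of the proposed representation.

```python
from dataclasses import dataclass

# Hypothetical, simplified model: each neume records both what it means
# (an abstract melodic contour) and how a given source draws it.
@dataclass(frozen=True)
class Neume:
    name: str      # conventional name, e.g. "pes"
    contour: str   # movement between its notes: "U" up, "D" down
    style: str     # notational style of the source (illustrative label)

def contour_of(melody):
    """Abstract view for comparative analysis: ignores notational style."""
    return [n.contour for n in melody]

def facsimile_of(melody):
    """Diplomatic view: keeps the style so native shapes can be redrawn."""
    return [f"{n.style}:{n.name}" for n in melody]

# The same two-neume figure as it might appear in two different sources.
source_a = [Neume("pes", "U", "St. Gall"), Neume("clivis", "D", "St. Gall")]
source_b = [Neume("pes", "U", "Aquitanian"), Neume("clivis", "D", "Aquitanian")]

same_melody = contour_of(source_a) == contour_of(source_b)
same_appearance = facsimile_of(source_a) == facsimile_of(source_b)
```

In this toy model the two sources agree melodically but differ in appearance. A representation that kept only the contour would serve analysis but lose the facsimile; one that kept only the drawn shapes would do the reverse.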
Encoding the data in the Private Use Area of the Unicode Standard has four benefits:
- a code space large enough to allow an individual character code for each of the hundreds of semantically distinct neume forms;
- the ability to mix neume data with chant text in a single file without resorting to multiple (i.e., modal) meanings for characters;
- suitability for Internet transmission and use in World Wide Web applications; and
- the possibility of a standardized representation that is usable by programs written in any programming language for printing, analysis, pedagogy, audio rendition, etc.
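As a rough illustration of the second and third benefits, the sketch below mixes chant text and neume codes in one string and separates them again by code point range alone, with no mode switching. The range U+E000 to U+F8FF is the Private Use Area of the Basic Multilingual Plane; the individual code point assignments in `NEUME_CODES` and the helper names are invented for this example and are not the assignments of the proposed representation.

```python
# Private Use Area of the Basic Multilingual Plane.
PUA_START, PUA_END = 0xE000, 0xF8FF

# Hypothetical code point assignments, for illustration only.
NEUME_CODES = {
    "punctum": 0xE000,
    "virga": 0xE001,
    "pes": 0xE002,
    "clivis": 0xE003,
}
NEUME_NAMES = {code: name for name, code in NEUME_CODES.items()}

def is_neume(ch: str) -> bool:
    """A character is treated as a neume if it lies in the PUA range."""
    return PUA_START <= ord(ch) <= PUA_END

def split_stream(stream: str):
    """Separate ordinary text from neume characters in a mixed stream."""
    text = "".join(ch for ch in stream if not is_neume(ch))
    neumes = [NEUME_NAMES.get(ord(ch), "unknown") for ch in stream if is_neume(ch)]
    return text, neumes

# Each syllable of the chant text is followed by its neume.
stream = (
    "Ky" + chr(NEUME_CODES["pes"])
    + "ri" + chr(NEUME_CODES["punctum"])
    + "e" + chr(NEUME_CODES["clivis"])
)

text, neumes = split_stream(stream)

# The mixed stream serializes as ordinary UTF-8 for storage or transmission.
encoded = stream.encode("utf-8")
round_tripped = encoded.decode("utf-8")
```

Because every neume has its own code point, a Latin letter never needs a second, context-dependent meaning, and any language with Unicode string support can read the same file.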
My research includes application of the data representation in the following areas:
- an editing program to facilitate visual data entry, with data output to this data representation;
- a testbed of encoded documents from various historical periods, geographical regions, and notational styles;
- a Web-accessible database to permanently house the encoded documents and related information, which scholars can access or supplement;
- assistance to my University of Oxford collaborators in implementing their algorithms for musicological analysis of data in this representation; and
- investigation into optical character recognition (OCR) for automatic data entry from photographic images, with output to this data representation.
The lossless data representation, together with the toolkit of application programs, will give scholars a standardized means of sharing data and performing analysis. Standardization will help to reduce fragmentation of digital libraries and duplication of effort. I shall seek international criticism of the specifications for the data representation and work closely on it with a research group at the University of Oxford headed by Dr John Caldwell of the Faculty of Music.
As a byproduct of this research, I hope to formulate a theoretical framework for the digital encoding of other types of archaic script, such as Byzantine ecphonic notation and Hebrew cheironomic notation.