The NEUMES Project    Abstract

Qualities of NEUMES/NeumesXML

This article gives a concise overview of the nature and purpose of the NEUMES data representation and the NeumesXML markup language. These software products are available for non-commercial, educational, and cultural use under a free license.[1] Currently, the data representation and markup language are in the 'beta-test' stage of development: product design is largely complete, and the products are in a functioning state. They are not yet ready for 'production use', however, because testing and refinement of the design are ongoing. The finalization of the design, such that NEUMES/NeumesXML chant transcriptions will be stable for the long term, is still some months away. Users are invited to experiment with and learn about this software now; they can make sample transcriptions, with the understanding that such transcriptions might need to be modified in the future.
See also the Glossary of Terms of the NEUMES Project.

[1] Certain licensing restrictions apply, as stated at the top of the NeumesXML Schema document. Use of the software products in any way that does not comply with the licensing restrictions shall constitute an infringement of copyright.

I.  The NEUMES data representation

NEUMES is a data-coding standard for transcribing neumatic chant notation to computer data, especially the content of medieval Western (Latin) and Eastern (Byzantine, principally Greek) liturgical manuscripts.

In medieval manuscripts containing chant, scribes used various emergent symbol systems for music notation (that is to say, musical notations that were somewhat primitive, experimental, and evolving). Music notation was usually written above the sacred text to be cantillated, chanted, or sung. In subsequent centuries, the many early symbol systems coalesced into two main types of notation. These historically later and stable branches of notation are: (a) square-neume notation on a four-line staff (commonly called 'Gregorian chant' notation; one sees it today in mechanically-printed books of Roman Catholic chant, such as the Liber Usualis); and (b) the so-called 'New Method' of notation for Eastern Orthodox chant and hymns. The principal benefit of NEUMES data representation for digital transcription of medieval neumed sources is that it allows encoding of archaic neume notations without forcing anachronistic concepts or unwarranted assumptions onto the original manuscript notation. Systematic comparison of melodies is possible across all notation types. Information about the glyphs in the native notation is preserved in NEUMES transcription. For purposes of teaching, journal articles, choir sheets, and so on, a "diplomatic facsimile" of a manuscript chant can be generated from NEUMES data as a stylized version of the original notation.

The main design requirements of the NEUMES data representation are as follows.
  • It must be "lossless", by which we mean that all chant-semantic content on the face of neumed artifacts can be expressed in the data.
  • The dozens of early notational styles (as well as modern chant-notation) are represented by a unified (or, universal) abstraction, such that, although notational idiosyncrasies of primary sources are preserved in transcription data, it is feasible and computationally-efficient to search and compare melodic patterns across all notational species (even across Western and Eastern sources).
  • The raw data are inherently compatible with XML (Extensible Markup Language) and are entirely suited for Internet transmission (viz., without need for "serialization" of data).
  • Computer programs that will read NEUMES data for a variety of end-uses (such as Web browser visualization, word processing, typeset printing, musicological analysis, and so on) are well supported by the data format -- this avoids duplication of effort in creating separate transcriptions for different end-uses.
  • It shall allow scalar Certainty Factors to be part of the data, by which transcribers can qualify their reading of a source artifact at specific points where graphical features might be unclear or the scribe's intended usage of symbols is not known with confidence.
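
The last requirement above can be pictured with a minimal sketch. The scale of the Certainty Factor (0.0 to 1.0 here), the class name `Reading`, and the mnemonics are all assumptions made for illustration; they are not taken from the NEUMES specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One transcribed symbol plus the transcriber's confidence in it."""
    mnemonic: str
    certainty: float = 1.0  # assumed scale: 0.0 (pure guess) to 1.0 (certain)

    def __post_init__(self):
        # Reject values outside the assumed scale.
        if not 0.0 <= self.certainty <= 1.0:
            raise ValueError("certainty must lie in [0.0, 1.0]")


def uncertain(readings, threshold=0.8):
    """Return the readings whose certainty falls below `threshold`."""
    return [r for r in readings if r.certainty < threshold]
```

A program that consumes such data could, for instance, highlight the readings returned by `uncertain` so that a second transcriber can review them against the source.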
The NEUMES data representation comprises two main parts, as follows.
  1. The NEUMES Taxonomy assigns a codepoint (i.e., a numerical value) in the Private Use Area of the Unicode™ Standard to each distinct semantic item that may appear on the face of chant manuscripts, and assigns a human-readable name (or, mnemonic) to each codepoint.

  2. The NEUMES Grammar decides the 'well-formedness' of transcription data in order to sustain the design requirements of interoperability and permanence of transcription data (viz., to avoid fragmentation of digital libraries).
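
As a rough illustration of the Taxonomy idea, the sketch below pairs mnemonics with codepoints in the Private Use Area of the Basic Multilingual Plane (U+E000 to U+F8FF). The three entries and their codepoint values are placeholders chosen for the example; the actual NEUMES Taxonomy assignments are not reproduced here, and the function names are hypothetical.

```python
# Bounds of the Unicode Private Use Area in the Basic Multilingual Plane.
PUA_START, PUA_END = 0xE000, 0xF8FF

# Placeholder entries: these codepoints are NOT the real NEUMES assignments.
TAXONOMY = {
    "punctum": 0xE001,
    "virga": 0xE002,
    "clivis": 0xE003,
}


def in_private_use_area(codepoint):
    """True if the codepoint lies in the BMP Private Use Area."""
    return PUA_START <= codepoint <= PUA_END


def encode(mnemonics):
    """Turn a sequence of mnemonics into a string of PUA characters."""
    return "".join(chr(TAXONOMY[name]) for name in mnemonics)


def decode(data):
    """Turn PUA characters back into mnemonics; reject unknown codepoints."""
    reverse = {cp: name for name, cp in TAXONOMY.items()}
    names = []
    for ch in data:
        cp = ord(ch)
        if cp not in reverse:
            raise ValueError(f"U+{cp:04X} is not in the taxonomy")
        names.append(reverse[cp])
    return names
```

Because each semantic item is a single character, a transcription is an ordinary Unicode string, and standard string tools can store, transmit, and compare it.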

II.  The NeumesXML markup language

NeumesXML is an implementation of XML Schema. The latter is the industry-standard extension of XML DTD (Document Type Definition) for implementing constraints on XML document content beyond the constraints on document structure provided by DTD. Because the Schema enforces some uniformity in how transcriptions are marked up, comprehensive searches across many NeumesXML documents for their meta-data content can be done by simple search queries.
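
To suggest why uniform markup makes such searches simple, here is a small sketch that applies one path query to several documents. The element names (`transcription`, `meta`, `feast`, `source`) and the sample documents are hypothetical and do not represent the actual NeumesXML tag set.

```python
import xml.etree.ElementTree as ET

# Stand-in documents; the element names are hypothetical, not NeumesXML tags.
DOCS = {
    "a.xml": "<transcription><meta><feast>Epiphany</feast>"
             "<source>MS 1</source></meta></transcription>",
    "b.xml": "<transcription><meta><feast>Easter</feast>"
             "<source>MS 2</source></meta></transcription>",
    "c.xml": "<transcription><meta><feast>Epiphany</feast>"
             "<source>MS 3</source></meta></transcription>",
}


def find_by_feast(docs, feast):
    """Return the names of documents whose <feast> metadata equals `feast`."""
    hits = []
    for name, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        # Every document uses the same structure, so one path fits all.
        if any(el.text == feast for el in root.findall("./meta/feast")):
            hits.append(name)
    return hits
```

If each transcriber chose different element names or nesting, the single path `./meta/feast` would have to be replaced by per-document logic.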

Note that part of the Phase Two goals of the NEUMES Project is to design a system of parallel markup, by which end-users will be able to mark up a transcription in user-defined ways without affecting the transcription document itself.

Pervasive support already exists for the XML standard for shared documents (for example, in Web browsers, Internet search engines, word processors, utilities that import/export XML documents to/from databases, and so forth). By making NeumesXML the "wrapper" that contains NEUMES transcription data, the project can leverage the capabilities of a great amount of existing software infrastructure.

The NeumesXML Schema is the framework in which a variety of transcription needs are met, including the following:
  • with the NeumesXML tag set, the transcriber describes the structure of the chant text, the music, and the neume notation;
  • the liturgical purpose, feast day, and other generic information about the chant is documented;
  • physical features of the source artifact may be recorded, such as pagination, line breaks, pin holes, erasures, and the like;
  • bibliographical, contextual, and accession information about the artifact, plus hyperlinks to related resources, are accommodated;
  • a chronicle can be maintained about versions of the transcription, transcribers, and editorial procedures that were used;
  • transcribers can annotate NEUMES transcription data at specific locations by inserting editorial comments.
Mapping from the mnemonics of the NEUMES Taxonomy to their corresponding numerical codepoints is handled automatically by any XML pre-processor. XML pre-processors are ubiquitous; for example, they are bundled with newer Web browsers. Furthermore, an XML 'validating parser' automatically enforces the NEUMES Grammar; this helps ensure that all NEUMES transcriptions adhere to a sharable, interoperable data format.
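
One way an XML processor can perform this kind of mapping is through entity declarations, which it expands while parsing. The sketch below assumes that mechanism; the entity names and codepoints are placeholders rather than values from the NEUMES Taxonomy, and the `chant` element is hypothetical. It uses the non-validating parser from the Python standard library, so it shows the mnemonic mapping only and not Grammar enforcement.

```python
import xml.etree.ElementTree as ET

# Assumption: mnemonics are declared as XML entities in the document type.
# The names and codepoints below are placeholders for illustration.
DOCUMENT = """<!DOCTYPE chant [
  <!ENTITY punctum "&#xE001;">
  <!ENTITY virga "&#xE002;">
]>
<chant>&punctum;&virga;&punctum;</chant>"""


def resolve(xml_text):
    """Parse the document and return the codepoints of its text content."""
    root = ET.fromstring(xml_text)  # the parser expands the entities itself
    return [ord(ch) for ch in root.text]
```

The transcriber writes readable names such as `&punctum;`, while any program reading the parsed document sees only the numerical codepoints.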

NEUMES transcription data (since they are XML-compatible) can be manipulated easily by Java™ programs, XML utilities, and other kinds of software, so that a wide variety of processing functions can be implemented. NEUMES data also may be translated directly to different formats (although the converse, obviously, might not be possible). An example of such translation is the XSLT (Extensible Stylesheet Language Transformations) creation of an HTML graphical display of NEUMES transcription data, one possible implementation of which is done by the NEUMES 'beta-test' visualization script.
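
The translation step can be sketched as follows. This example uses plain Python as a stand-in for XSLT, and it is not the NEUMES visualization script. The input structure (`chant` and `syllable` elements with a `text` attribute) is hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical input; not the actual NeumesXML document structure.
SOURCE = """<chant>
  <syllable text="Al">punctum</syllable>
  <syllable text="le">virga clivis</syllable>
</chant>"""


def to_html(xml_text):
    """Render each syllable and its neume names as one HTML table row."""
    root = ET.fromstring(xml_text)
    rows = []
    for syl in root.findall("syllable"):
        text = escape(syl.get("text", ""))
        neumes = escape((syl.text or "").strip())
        rows.append(f"<tr><td>{text}</td><td>{neumes}</td></tr>")
    return "<table>" + "".join(rows) + "</table>"
```

The output keeps only what the display needs, which illustrates why the reverse translation, from HTML back to full transcription data, might not be possible.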

Copyright © 2004-2008, Louis W. G. Barton.