Sourcecode Listing of

http://purl.oclc.org/NEUMES/ref/NEUMES_grammar.pen






Line
0001 <?xml version="1.0" encoding="UTF-8"?>
0002 <!--
0003 *   NEUMES grammar file (Entities of Regular Expressions) generated on 24 May 2006.
0004 *   Filename: http://www.scribeserver.com/NEUMES/xml/NEUMES_grammar.pen
0005 *   Version: 2.3.c, beta-test.
0006 *   Owner: The NEUMES Project
0007 *   (Neumed and Ecphonetic Universal Manuscript Encoding Standard).
0008 *   For details, see http://purl.oclc.org/SCRIBE/NEUMES/
0009 *   Author: Louis W. G. Barton.
0010 *   Type: XML Entity definitions.
0011 -->
0012 
0013 <!--   NEUMES grammar Entity set
0014    This file defines the NEUMES character Entity set for digital transcription
0015    of medieval chant manuscripts written in neume notation.
0016    The 'NEUMES' and 'RUBRICS' Entities are defined here as a set of expressions
0017    implementing the NEUMES regular grammar for the NEUMES character Entity set.
0018    This grammar is used for constraining strings of NEUMES characters, which are
0019    used for digital transcription of medieval chant manuscript sources that were
0020    written in neume notation. These two Entities are used principally in the
0021    NeumesXML Schema, against which NeumesXML instance documents can be schema-
0022    validated. PEN (Parameter Entity) files, such as this one, are used primarily
0023    in XML (Extensible Markup Language) applications. Typical invocation:
0024    <!ENTITY % NEUMES_CFG SYSTEM
0025      "http://www.scribeserver.com/NEUMES/xml/NEUMES_grammar.pen">
0026    %NEUMES_CFG;
0027 
0028    Protected by law under one or more of the following copyrights:
0029    Copyright 2005, The University of Oxford.
0030    Copyright 2003-2005, Louis W. G. Barton.
0031    Copyright 2002-2003, The President and Fellows of Harvard College; contains
0032    software or other intellectual property licensed from Louis W. G. Barton,
0033    copyright 1995-2001 by Louis W. G. Barton.
0034 
0035    The copyright holders grant royalty-free license to transmit, display,
0036    perform and/or distribute without modification the NeumesXML version 2
0037    Schema and its accompanying documentation for non-commercial educational,
0038    cultural, and charitable uses, provided that the above copyright notice
0039    and this paragraph appear in all copies. The copyright holders make no
0040    representation about the suitability of the Schema and its accompanying
0041    documentation for any purpose. It is provided "as is" without expressed
0042    or implied warranty.
0043 
0044    All occurrences of the word "Unicode" herein should be understood as
0045    "Unicode[TM]," a trademark of Unicode, Inc.
0046 -->
0047 
0048 
0049 <!-- Programmer's notes:
0050 * Cf. the Context-Free Grammar of the _Interim Project Report_.
0051 * This file depends on the file "NEUMES_characters.pen" for the Entity literals.
0052 * Terminal Entities must be declared first to avoid forward references, and Non-terminal
0053   Entities must be declared in reverse referential order.
0054 * The Entity literals contain XML Schema regular expression operators.
0055 * Character selectors "[ ]" by default (not followed by "{}") select just one character.
0056 * Do not insert blanks into regular expressions.
0057 * Rule: fully parenthesize expressions.
0058 * Operator Precedence:
0059   1. repetition ('*' or '+')
0060   2. concatenation (simply placing one item after another)
0061   3. alternation ('|')
0062 * General Entities (GE) are used instead of Parameter Entities. (Usually parameter
0063   entities are used to hold markup declarations, content models, or parts of attribute
0064   lists.) Each GE defined here (except "NEUMES" and "RUBRICS") begins with 'cfg_' to
0065   avoid potential naming conflicts with XML files that use this grammar.
0066 * Character ranges are used instead of 'Long strings', as numerical calculation is faster.
0067 * To do: expand composed_neume_form.
0068 * To do: change *text_character* to any Unicode text char.
0069 * To do: implement RUBRICS in NeumesXML.
0070 * Q: Allow CFs on Composition characters?
0071 * Q: Add qualifiers to *rubrical_symbol* (e.g., for italic pitch letters)?
0072 * Q: Insert here a split between 1-, 2-, 3-, and 3+-tone neume forms?
0073   If so, then child branches would also need to be ramified.
0074 ENTITY neumatic_symbol   "&neumatic_symbol_1_tone; |
0075                &neumatic_symbol_2_tone; |
0076                &neumatic_symbol_3_tone; |
0077                &neumatic_symbol_3plus_tone;"
0078 ENTITY neumatic_symbol_1_tone   "&neume_form_1_tone;?&neume_descriptor_list_1_tone;"
0079 * Reference from NeumesXML_main.xsd:
0080    <xsd:simpleType name="neume">
0081       <xsd:restriction base="xsd:string">
0082          <xsd:pattern value="&NEUMES;"/>
0083       </xsd:restriction>
0084    </xsd:simpleType>
0085 
0086 [End, Programmer's Notes] -->
0087 
0088 
0089 <!-- ****** Symbolic Name Substitutions: NEUMES Codepoint Declarations: ****** -->
0090 <!ENTITY % NEUMES SYSTEM "http://www.scribeserver.com/NEUMES/xml/NEUMES_characters.pen">
0091 %NEUMES;
0092 
0093 
0094 <!-- *************** CHARACTER RANGES: *************** -->
0095 
0096 <!--   A *certainty_factor* is any one character in the range of Certainty Factors.
0097 -->
0098    <!ENTITY cfg_certainty_factor
0099    "[&CF_p10;-&CF_n10;]" >
0100 
0101 
0102 <!--   A *tonal_movement* is a single character in the range of Tonal Movements.
0103 -->
0104    <!ENTITY cfg_tonal_movement
0105    "[&no_tone;-&dn_undiff9;]" >
0106 
0107 
0108 <!--   A *pitch* is a single character in the range of Pitches.
0109 -->
0110    <!ENTITY cfg_pitch
0111    "[&ton_ut;-&ton_gg;]" >
0112 
0113 
0114 <!--   A *qualifier_west* is a single character in the range of Western Qualifiers.
0115 -->
0116    <!ENTITY cfg_qualifier_west
0117    "[&liquescent;-&long_stroke;&equaliter;-&perf_t_elongated;]" >
0118 
0119 
0120 <!--   A *qualifier_east* is a single character in the range of Eastern Qualifiers.
0121 -->
0122    <!ENTITY cfg_qualifier_east
0123    "[&haple;-&piasma;]" >
0124 
0125 
0126 <!--   A *local_color* is a single character in the range of Local Ink Colors.
0127 -->
0128    <!ENTITY cfg_local_color
0129    "[&local_no_color;-&local_color_yellow;]" >
0130 
0131 
0132 <!--   An *alignment_specifier* is a single character in the range of Alignment Specifiers.
0133 -->
0134    <!ENTITY cfg_alignment_specifier
0135    "[&position_0;-&position_20;]" >
0136 
0137 
0138 <!--   A *substitute_style* is a single character in the range of Substitute Styles.
0139 -->
0140    <!ENTITY cfg_substitute_style
0141    "[&subst2;-&subst6;]" >
0142 
0143 
0144 <!--   A *simple_neume_form_east* is any one character in the Byzantine Neume Form range.
0145 -->
0146    <!ENTITY cfg_simple_neume_form_east
0147    "[&begin_neume_forms_east;-&end_neume_forms_east;]" >
0148 
0149 
0150 <!--   A *simple_neume_form_west* is any one character in the Latin Neume Form range.
0151 -->
0152    <!ENTITY cfg_simple_neume_form_west
0153    "[&begin_neume_forms_west;-&end_neume_forms_west;]" >
0154 
0155 
0156 <!--   A *rubrical_character_east* is a single character in the range of NEUMES rubrical
0157    symbols for Eastern sources.
0158 -->
0159    <!ENTITY cfg_rubrical_character_east
0160    "[&glyphs_no_color;-&glyphs_color_yellow;&barys;-&martyria_plagios_deuteros;]" >
0161 
0162 
0163 <!--   A *rubrical_character_west* is a single character in the range of NEUMES rubrical
0164    symbols for Western sources.
0165 -->
0166    <!ENTITY cfg_rubrical_character_west
0167    "[&glyphs_no_color;-&glyphs_color_yellow;&antiphon;-&ut_supra;&equaliter;-&perf_t_elongated;]" >
0168 
0169 
0170 <!--   A *clef_character* is a single character in the range of NEUMES clef signs.
0171 -->
0172    <!ENTITY cfg_clef_character
0173    "[&no_clef;-&f_clef;]" >
0174 
0175 
0176 <!--   A *clef_position_character* is a single character in the range of NEUMES clef positions.
0177 -->
0178    <!ENTITY cfg_clef_position_character
0179    "[&line1;-&line6;]" >
0180 
0181 
0182 <!-- *************** GRAMMAR DECOMPOSITION: *************** -->
0183 
0184 <!--   A NEUME DESCRIPTOR LIST is a non-empty sequence, where each item in the sequence
0185    consists of: an optional *pitch* (followed optionally by a *certainty_factor*);
0186    followed by a required *tonal_movement*; followed by an optional *certainty_factor*;
0187    followed by an optional Ligation character.
0188 -->
0189    <!ENTITY cfg_neume_descriptor_list
0190    "((&cfg_pitch;&cfg_certainty_factor;?)?&cfg_tonal_movement;&cfg_certainty_factor;?[&LIG;]?)+" >
0191 
0192 
0193 <!--   A QUALIFIER LIST WEST consists of: an optional *substitute_style*; followed by an
0194    optional *local_color*; followed by an optional subsequence of one or more items,
0195    where each item consists of a *qualifier_west* followed by an optional
0196    *certainty_factor*. Every part is optional, so the expression can match the empty string.
0197 -->
0198    <!ENTITY cfg_qualifier_list_west
0199    "&cfg_substitute_style;?&cfg_local_color;?((&cfg_qualifier_west;&cfg_certainty_factor;?)+)?" >
0200 
0201 
0202 <!--   A QUALIFIER LIST EAST consists of: an optional *substitute_style*; followed by an
0203    optional *local_color*; followed by an optional subsequence of one or more items,
0204    where each item consists of a *qualifier_east* followed by an optional
0205    *certainty_factor*. Every part is optional, so the expression can match the empty string.
0206 -->
0207    <!ENTITY cfg_qualifier_list_east
0208    "&cfg_substitute_style;?&cfg_local_color;?((&cfg_qualifier_east;&cfg_certainty_factor;?)+)?" >
0209 
0210 
0211 <!--   A SIMPLE NEUME FORM is: a *simple_neume_form_east* or a *simple_neume_form_west*;
0212    followed by an optional *certainty_factor*; followed by an optional *qualifier_list_east* or *qualifier_list_west*, respectively.
0213 -->
0214    <!ENTITY cfg_simple_neume_form
0215    "(&cfg_simple_neume_form_east;&cfg_certainty_factor;?(&cfg_qualifier_list_east;)?)|(&cfg_simple_neume_form_west;&cfg_certainty_factor;?(&cfg_qualifier_list_west;)?)" >
0216 
0217 
0218 <!--   A COMPOSITE NEUME FORM consists of: a Start Compose character;
0219    followed by a *simple_neume_form*; followed by one or more pairs of a Subordinate
0220    character and a *simple_neume_form*; followed by an End Compose character.
0221 -->
0222    <!ENTITY cfg_composite_neume_form
0223    "&STA_compose;(&cfg_simple_neume_form;)(&subordinate;(&cfg_simple_neume_form;))+&END_compose;" >
0224 
0225 
0226 <!--   A NEUMATIC SYMBOL consists of: an optional *composite_neume_form* or
0227    *simple_neume_form*; followed by a required *neume_descriptor_list*.
0228 -->
0229    <!ENTITY cfg_neumatic_symbol
0230    "((&cfg_composite_neume_form;)|(&cfg_simple_neume_form;))?(&cfg_neume_descriptor_list;)" >
0231 
0232 
0233 <!--   NEUMATIC SEQUENCE
0234    [blank]
0235 -->
0236 
0237 
0238 <!--   A RUBRICAL SYMBOL WEST is a sequence consisting of: a *rubrical_character_west*;
0239    followed by an optional *substitute_style*; an optional *local_color*; an optional
0240    *pitch*; an optional *tonal_movement*; and an optional *certainty_factor*.
0241 -->
0242    <!ENTITY cfg_rubrical_symbol_west
0243    "&cfg_rubrical_character_west;&cfg_substitute_style;?&cfg_local_color;?&cfg_pitch;?&cfg_tonal_movement;?&cfg_certainty_factor;?" >
0244 
0245 
0246 <!--   A RUBRICAL SYMBOL EAST is a sequence consisting of: a *rubrical_character_east*;
0247    followed by an optional *substitute_style*; followed by an optional *local_color*.
0248 -->
0249    <!ENTITY cfg_rubrical_symbol_east
0250    "&cfg_rubrical_character_east;&cfg_substitute_style;?&cfg_local_color;?" >
0251 
0252 
0253 <!--   A CLEF SYMBOL is a *clef_character*, followed by an optional *substitute_style*, an optional *local_color*, and an optional *clef_position_character*.
0254 -->
0255    <!ENTITY cfg_clef_symbol
0256    "&cfg_clef_character;&cfg_substitute_style;?&cfg_local_color;?&cfg_clef_position_character;?" >
0257 
0258 
0259 <!-- A COMMENT consists of the literal text "<comment content=", a non-empty Basic Latin
0260 string enclosed in matching double or single quotes (and not containing that quote), then "/>". -->
0261    <!ENTITY cfg_comment
0262    "&lt;comment content=((&quot;[\p{IsBasicLatin}-[&quot;]]+&quot;)|(&apos;[\p{IsBasicLatin}-[&apos;]]+&apos;))/>" >
0263 
0264 
0265 <!-- A CLEF consists of an optional comment, followed by the STArt character,
0266 followed by one *cfg_clef_symbol*, followed by the END character.
0267 -->
0268    <!ENTITY cfg_clef
0269    "(&cfg_comment;)?&STA;&cfg_clef_symbol;&END;">
0270 
0271 
0272 <!-- A RUBRIC WEST consists of an optional comment, followed by the STArt character,
0273 followed by one or more *rubrical_symbol_west*s, followed by the END character.
0274 -->
0275    <!ENTITY cfg_rubric_west
0276    "(&cfg_comment;)?&STA;(&cfg_rubrical_symbol_west;)+&END;">
0277 
0278 
0279 <!-- A RUBRIC EAST consists of an optional comment, followed by the STArt character,
0280 followed by one or more *rubrical_symbol_east*s, followed by the END character.
0281 -->
0282    <!ENTITY cfg_rubric_east
0283    "(&cfg_comment;)?&STA;(&cfg_rubrical_symbol_east;)+&END;">
0284 
0285 
0286 <!--   A RUBRICAL TEXT is an optional *comment*, followed by a string of characters in the
0287    Unicode Basic Latin block; this pattern may be repeated.
0288 -->
0289    <!ENTITY cfg_rubrical_text
0290    "((&cfg_comment;)?[\p{IsBasicLatin}]+)+" >
0291 
0292 
0293 <!-- *************** THE RUBRICS ENTITY: *************** -->
0294 <!--   The RUBRICS Entity matches exactly one of: (a) a *rubrical_text* in Basic Latin
0295    characters; (b) one *rubric_east*; (c) one *rubric_west*; or (d) one *clef*. Apart from
0296    rubrical text and comments, the characters are Unicode Private Use Area characters
0297    declared in the NEUMES data representation codespace (cf. NEUMES_characters.pen).
0298 -->
0299    <!ENTITY RUBRICS
0300    "&cfg_rubrical_text;|&cfg_rubric_east;|&cfg_rubric_west;|&cfg_clef;" >
0301 
0302 
0303 <!-- *************** THE NEUMES ENTITY: *************** -->
0304 <!--   The NEUMES Entity is a non-empty sequence of Unicode Private Use Area characters
0305    declared in the NEUMES data representation codespace (cf. NEUMES_characters.pen).
0306    Each item in the sequence consists of: an optional *comment*; followed by the STArt
0307    character; followed by one or more *neumatic_symbol*; followed by the END character.
0308 -->
0309    <!ENTITY NEUMES
0310    "((&cfg_comment;)?&STA;(&cfg_neumatic_symbol;)+&END;)+" >
0311 
0312 <!-- END, NEUMES_grammar.pen -->
= END LISTING =
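
The way the general entities in the listing compose into a single pattern can be tried out with a short Python sketch. Nearly everything concrete in it is an assumption made for illustration: the ASCII stand-in characters replace the Private Use Area code points that NEUMES_characters.pen would supply, the three-letter tonal-movement alphabet is invented, the names ENTITIES, expand, is_valid, and NEUMES_sketch are hypothetical, and the top-level pattern is a simplified form of the NEUMES entity (no comment, no neume form). Only the cfg_neume_descriptor_list literal is copied from the listing. Python's re module stands in for an XML Schema validator, and fullmatch imitates the implicit anchoring of xsd:pattern.

```python
import re

# Hypothetical ASCII stand-ins; the real literals are Private Use Area
# characters declared in NEUMES_characters.pen and are not reproduced here.
ENTITIES = {
    "STA": "<",
    "END": ">",
    "LIG": "~",
    "cfg_pitch": "[a-g]",
    "cfg_certainty_factor": "[0-9]",
    "cfg_tonal_movement": "[UDE]",  # invented labels: up, down, equal
    # Copied verbatim from the listing.
    "cfg_neume_descriptor_list":
        "((&cfg_pitch;&cfg_certainty_factor;?)?"
        "&cfg_tonal_movement;&cfg_certainty_factor;?[&LIG;]?)+",
    # Simplified stand-in for the NEUMES entity (assumption, not the original).
    "NEUMES_sketch": "(&STA;(&cfg_neume_descriptor_list;)+&END;)+",
}

# Matches a general entity reference such as &cfg_pitch;
REF = re.compile(r"&(\w+);")


def expand(name, table=ENTITIES):
    """Recursively replace every &name; reference with its literal text."""
    return REF.sub(lambda m: expand(m.group(1), table), table[name])


PATTERN = re.compile(expand("NEUMES_sketch"))


def is_valid(text):
    """Return True if the whole string matches the expanded pattern."""
    return PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["<aU>", "<a3U2~gD>", "<U><D>", "<a>", "aU"]:
        print(sample, is_valid(sample))
```

The declaration order required by the programmer's notes (terminals first, non-terminals in reverse referential order) does not matter in this sketch, because expansion is done lazily by name. An XML parser offers no such freedom, which is why the listing is ordered the way it is.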