att.referring
att.referring provides attributes for specifying the beginning and end of a linguistic or textual segment, either by character offsets or by identifying the edge elements via their IDs. [1.3.1 Attribute Classes]
Module | tei: The TEI Infrastructure
Members | span
Attributes | Attributes
Schematron |
<sch:rule context="*[local-name() = ('span')][not(@referringMode) and @from and @to]">
  <sch:assert test="@from castable as xsd:anyURI">The default form of @from is a URI</sch:assert>
  <sch:assert test="@to castable as xsd:anyURI">The default form of @to is a URI</sch:assert>
</sch:rule>
Schematron |
<sch:rule context="*[@referringMode eq 'pointer']">
  <sch:assert test="@from castable as xsd:anyURI">When @referringMode is 'pointer', @from must be a URI</sch:assert>
  <sch:assert test="@to castable as xsd:anyURI">When @referringMode is 'pointer', @to must be a URI</sch:assert>
</sch:rule>
<sch:rule context="*[@referringMode eq 'id']">
  <sch:assert test="id(substring(@from,2))">When @referringMode is 'id', @from must point at an existing local target</sch:assert>
  <sch:assert test="id(substring(@to,2))">When @referringMode is 'id', @to must point at an existing local target</sch:assert>
</sch:rule>
<sch:rule context="*[@referringMode = ('character','icp','byte')]">
  <sch:assert test="@from castable as xsd:nonNegativeInteger">When @referringMode is '<sch:value-of select="@referringMode"/>', @from must be a non-negative integer</sch:assert>
  <sch:assert test="@to castable as xsd:nonNegativeInteger">When @referringMode is '<sch:value-of select="@referringMode"/>', @to must be a non-negative integer</sch:assert>
</sch:rule>
Schematron |
<sch:rule context="*[local-name() = ('span')][@to or @from]">
  <sch:report test="contains(normalize-space(@to),' ') or contains(normalize-space(@from),' ')">The attributes @to and @from on <sch:name/> may each contain only a single value</sch:report>
</sch:rule>
Schematron |
<sch:rule context="*[local-name() = ('span')][@to]">
  <sch:report test="@to and not(@from)">If @to is supplied on <sch:name/>, @from must be supplied as well</sch:report>
</sch:rule>
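The constraints above can be paraphrased outside Schematron. The following Python fragment is only a rough sketch of the same checks, not part of the TEI specification: the function name `check_referring`, the `known_ids` parameter, and the wording of the messages are hypothetical, and the URI tests for the 'pointer' mode and for the default case are deliberately left out.

```python
import re

# Modes whose @from/@to values are numeric offsets.
NUMERIC_MODES = {"character", "icp", "byte"}
NON_NEGATIVE_INT = re.compile(r"[0-9]+")


def check_referring(attrs, known_ids=frozenset()):
    """Return a list of problems found in a dict of attribute values.

    attrs maps attribute names ("from", "to", "referringMode") to strings;
    known_ids is the set of xml:id values present in the document.
    """
    problems = []
    if "to" in attrs and "from" not in attrs:
        problems.append("@to requires @from")
    mode = attrs.get("referringMode")
    for name in ("from", "to"):
        value = attrs.get(name)
        if value is None:
            continue
        # Mirrors contains(normalize-space(.), ' '): more than one token.
        if len(value.split()) > 1:
            problems.append(f"@{name} may contain only a single value")
            continue
        if mode in NUMERIC_MODES:
            if not NON_NEGATIVE_INT.fullmatch(value.strip()):
                problems.append(f"@{name} must be a non-negative integer")
        elif mode == "id":
            # Mirrors id(substring(., 2)): drop the leading '#'.
            if value[1:] not in known_ids:
                problems.append(f"@{name} must point at an existing local target")
    return problems


print(check_referring({"from": "0", "to": "3", "referringMode": "icp"}))
```

An empty list means that none of the sketched checks failed for the given attribute values.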
Example | The example below comes from the CoMParS (Collection of Multi-lingual Parallel Sequences) project and presents a fragment of its monolingual German subcorpus. The individual sequences (here, a single sentence) are listed in the text part of the corpus, while the linguistic analysis is stored in the <standOff> part, which includes, among other things, segmentation information. CoMParS adheres to ISO LAF principles and uses inter-character points, with indexing starting at 0.
<text xml:lang="de">
  <body>
    <ab xml:id="deu-ab1" n="1">Ich habe mich im Winter in dich verliebt.</ab>
  </body>
</text>
<!--
'I'c'h' 'h'a'b'e' 'm'i'c'h' 'i'm' 'W'i'n't'e'r' 'i'n' 'd'i'c'h' 'v'e'r'l'i'e'b't'.'
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
-->
<standOff>
  <listAnnotation n="1" corresp="#deu-ab1" type="sequence">
    <listAnnotation type="segmentation">
      <seg from="0" to="3" xml:id="deu-ab1tok1">Ich</seg>
      <seg from="4" to="8" xml:id="deu-ab1tok2">habe</seg>
      <seg from="9" to="13" xml:id="deu-ab1tok3">mich</seg>
      <seg from="14" to="16" xml:id="deu-ab1tok4">im</seg>
      <seg from="17" to="23" xml:id="deu-ab1tok5">Winter</seg>
      <seg from="24" to="26" xml:id="deu-ab1tok6">in</seg>
      <seg from="27" to="31" xml:id="deu-ab1tok7">dich</seg>
      <seg from="32" to="40" xml:id="deu-ab1tok8">verliebt</seg>
      <seg from="40" to="41" xml:id="deu-ab1tok9">.</seg>
    </listAnnotation>
  </listAnnotation>
</standOff>
The segmentation information above is subsequently used by all the other annotation layers. The CoMParS ODD contains the following statements, which (a) add seg to the att.referring class and (b) change the default value of referringMode to icp. The first change may be accepted by the Council together with this ticket or afterwards; the second is needed because there is currently no mechanism that makes attribute classes aware of the elements that belong to them. Such a mechanism is on the Council's to-do list.
<elementSpec ident="seg" module="linking" mode="change">
  <classes mode="change">
    <memberOf key="att.referring"/>
  </classes>
</elementSpec>
<classSpec ident="att.referring" mode="change" type="atts" module="tei">
  <constraintSpec scheme="schematron" ident="default_mode" mode="replace">
    <constraint>
      <sch:rule context="*[local-name() = ('span','seg')][not(@referringMode) and @from and @to]">
        <sch:assert test="@from castable as xsd:nonNegativeInteger">The default form of @from is a non-negative integer</sch:assert>
        <sch:assert test="@to castable as xsd:nonNegativeInteger">The default form of @to is a non-negative integer</sch:assert>
      </sch:rule>
    </constraint>
  </constraintSpec>
  <attList>
    <attDef ident="referringMode" usage="opt" mode="change">
      <defaultVal>icp</defaultVal>
    </attDef>
  </attList>
</classSpec>
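Because the inter-character points in this example are counted from 0, each from/to pair corresponds directly to a half-open slice of the sentence. The Python fragment below is a small consistency check along those lines; it is an illustration only, the function name `mismatches` is hypothetical, and it assumes that the sentence reads "in dich", as the segmentation layer does, and that one Python string position equals one character of the text.

```python
text = "Ich habe mich im Winter in dich verliebt."

# (from, to, token) triples copied from the segmentation layer.
segments = [
    (0, 3, "Ich"), (4, 8, "habe"), (9, 13, "mich"), (14, 16, "im"),
    (17, 23, "Winter"), (24, 26, "in"), (27, 31, "dich"),
    (32, 40, "verliebt"), (40, 41, "."),
]


def mismatches(text, segments):
    """Return the segments whose offsets do not select their token."""
    return [(start, end, token) for start, end, token in segments
            if text[start:end] != token]


print(mismatches(text, segments))  # prints []
```

A non-empty result would indicate that the offsets and the base text have drifted apart, for instance after the sentence was edited without updating the segmentation.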
Note | When referringMode takes a value that implies numeric offsets (character, icp, or byte), the start index may be either 0 or 1: some systems begin indexing at 0, others at 1. This choice is not expressed by referringMode and should be documented in the header, together with other project-specific encoding decisions. Linguistic analysis in the ISO LAF (Linguistic Annotation Framework, ISO 24612:2012) assumes inter-character points (represented here by the value icp) and indices starting at 0.
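The effect of the start index can be seen in a short Python sketch. The function `select` and its `origin` parameter are hypothetical names introduced for illustration; the sketch assumes that offsets are inter-character points and that `origin` is the index given to the point before the first character.

```python
def select(text, start, end, origin=0):
    """Return the characters between two inter-character points.

    origin is the index assigned to the point before the first
    character of the text (0 or 1).
    """
    if origin not in (0, 1):
        raise ValueError("origin must be 0 or 1")
    lo, hi = start - origin, end - origin
    if not 0 <= lo <= hi <= len(text):
        raise ValueError("offsets fall outside the text")
    return text[lo:hi]


text = "Ich habe mich im Winter in dich verliebt."
print(select(text, 17, 23))            # prints Winter
print(select(text, 18, 24, origin=1))  # prints Winter
```

Reading the 0-based pair 17 and 23 with `origin=1` would select a span shifted by one character, which is why the convention needs to be recorded in the header.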