The Leipzig Glossing Rules
pyigt supports the notation for morpheme/gloss structure proposed by the Leipzig Glossing Rules.
According to LGR Rule 1, object language and gloss lines have to be word-aligned. Such aligned
pairs of a word and a corresponding gloss are modeled via the GlossedWord class.
If an IGT conforms to Rule 2, glossed words are lists of aligned
GlossedMorpheme pairs.
The provisions of Rule 4 (and following), i.e. the structure of morpheme glosses, are implemented
as subclasses of GlossElement.
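The alignment required by Rules 1 and 2 can be illustrated without pyigt. The following is a minimal sketch, not pyigt's implementation: the helper `align` is hypothetical, and the separator set (`-`, `=`, `~`) is an assumption made for illustration.

```python
import re

# Assumed LGR morpheme separators: "-" (affix), "=" (clitic), "~" (reduplication).
SEPARATORS = re.compile(r"[-=~]")


def align(object_line: str, gloss_line: str) -> list[list[tuple[str, str]]]:
    """Pair words (Rule 1) and, within each word, morphemes (Rule 2)."""
    words = object_line.split()
    glosses = gloss_line.split()
    if len(words) != len(glosses):
        raise ValueError("Rule 1 violated: word and gloss counts differ")
    aligned = []
    for word, gloss in zip(words, glosses):
        morphemes = SEPARATORS.split(word)
        morpheme_glosses = SEPARATORS.split(gloss)
        if len(morphemes) != len(morpheme_glosses):
            raise ValueError(f"Rule 2 violated in {word!r} / {gloss!r}")
        # One (morpheme, gloss) pair per aligned position.
        aligned.append(list(zip(morphemes, morpheme_glosses)))
    return aligned


pairs = align("a-word b", "a.DU-gloss B")
```

Here `pairs` holds one list per word, each containing (morpheme, gloss) tuples. A mismatch in the number of words or morphemes raises `ValueError` instead of being silently truncated.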
- class pyigt.lgrmorphemes.GlossElement(_)
Rule 4. Gloss elements are separated by “.”.
- Variables:
start – Specifies the separator to use when combining a GlossElement with another.
- class pyigt.lgrmorphemes.GlossElementAfterSemicolon(_)
Rule 4B. Distinct gloss elements can be separated by “;”.
- class pyigt.lgrmorphemes.GlossElementAfterColon(_)
Rule 4C. Gloss elements corresponding to “hidden” object language elements are separated by “:”.
- class pyigt.lgrmorphemes.GlossElementAfterBackslash(_)
Rule 4D. Morphophonological change is marked with a leading “\”.
- class pyigt.lgrmorphemes.PatientlikeArgument(_)
Rule 4E. Patient-like arguments are marked with a leading “>”.
Note: Infer the agent-like argument by looking up the prev property.
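How a morpheme gloss decomposes into elements under Rules 4 to 4E can be sketched with a single regular expression. This is an illustration under stated assumptions, not pyigt's parser: the function `gloss_elements` is hypothetical, it returns plain tuples rather than GlossElement subclasses, and it ignores the bracket notations of Rules 6 and 7.

```python
import re

# An optional leading separator (".", ";", ":", "\" or ">") followed by
# the element text, which runs until the next separator.
ELEMENT = re.compile(r"([.;:\\>]?)([^.;:\\>]+)")


def gloss_elements(gloss: str) -> list[tuple[str, str]]:
    """Return (leading separator, element text) pairs for a morpheme gloss."""
    return [(sep, text) for sep, text in ELEMENT.findall(gloss)]


print(gloss_elements("ABC.DEF:GHI;JKL"))
print(gloss_elements("1SG>3SG"))
```

The first element of a gloss has an empty separator. For the `>` case, the element before the one introduced by `>` is the agent-like argument, which mirrors the idea of looking at the previous element.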
- class pyigt.lgrmorphemes.NonovertElement(_)
Rule 6. Non-overt elements can be enclosed in square brackets.
- class pyigt.lgrmorphemes.InherentCategory(_)
Rule 7. Inherent categories can be enclosed in round brackets.
- class pyigt.lgrmorphemes.Morpheme(_)
Rule 2. Morphemes are separated by “-”.
- property elements: list[pyigt.lgrmorphemes.GlossElement]
>>> m = Morpheme('a<b>c')
>>> m.elements
[<GlossElement "a">, <Infix "b">, <GlossElement "c">]
- property form_and_infixes: tuple[str, list[str]]
>>> m = Morpheme('a<b>c')
>>> m.form_and_infixes
('ac', ['b'])
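The separation of a form from its angle-bracketed infixes can be reproduced in a few lines. This is a minimal sketch that does not use pyigt; the standalone function below is hypothetical and only borrows the property's name for readability.

```python
import re

# An infix is any non-empty run of characters enclosed in angle brackets.
INFIX = re.compile(r"<([^<>]+)>")


def form_and_infixes(morpheme: str) -> tuple[str, list[str]]:
    """Split a morpheme string into its base form and a list of infixes."""
    infixes = INFIX.findall(morpheme)  # contents of each <...> group
    form = INFIX.sub("", morpheme)     # what remains once infixes are removed
    return form, infixes


print(form_and_infixes("a<b>c"))  # ('ac', ['b'])
```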
- pyigt.lgrmorphemes.split_morphemes(s)
Split string into morphemes.
- Parameters:
s (str)
- Return type:
list[str]
- pyigt.lgrmorphemes.remove_morpheme_separators(s)
Remove all characters listed as morpheme separators from string.
- Parameters:
s (str)
- Return type:
str
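The two operations above can be approximated in plain Python. The sketch below is not the pyigt implementation. The separator set `-=~` is an assumption, and so is the choice to keep separators as list items when splitting: the documentation above only states a `list[str]` return type, so the real format may differ. The function names are hypothetical.

```python
import re

# Assumed morpheme separators: affix, clitic and reduplication boundaries.
MORPHEME_SEPARATORS = "-=~"


def split_keeping_separators(s: str) -> list[str]:
    """Split on separators, keeping each separator as its own item."""
    # The capturing group makes re.split return the separators as well.
    return [part for part in re.split(r"([-=~])", s) if part]


def strip_separators(s: str) -> str:
    """Delete every assumed separator character from the string."""
    return s.translate(str.maketrans("", "", MORPHEME_SEPARATORS))


parts = split_keeping_separators("a-b=c")
plain = strip_separators("An-fangs")
```

Keeping the separators means that `"".join(parts)` restores the original string, which is the kind of round trip that word_from_morphemes relies on.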
- class pyigt.lgrmorphemes.GlossedWord(word, gloss, glossed_morphemes=<factory>, strict=False)
A (word, gloss) pair, corresponding to two aligned items from IGT according to LGR.
Provides list-like access to its GlossedMorphemes.
- Parameters:
word (str)
gloss (str)
glossed_morphemes (list[pyigt.lgrmorphemes.GlossedMorpheme])
strict (bool)
- property form: str
Removes sentence-level markup and morpheme separators from .word.
>>> from pyigt.lgrmorphemes import GlossedWord
>>> gw = GlossedWord(word='"An-fangs', gloss="a-b")
>>> gw.form
'Anfangs'
- property word_from_morphemes: str
>>> gw = GlossedWord('a-word', 'a.DU-gloss')
>>> gw.word_from_morphemes
'a-word'
- property gloss_from_morphemes: str
>>> gw = GlossedWord('a-word', 'a.DU-gloss')
>>> gw.gloss_from_morphemes
'a.DU-gloss'
- class pyigt.lgrmorphemes.GlossedMorpheme(morpheme, gloss, sep, prev=None, next=None)
A (morpheme, gloss) pair.
- Variables:
morpheme – The morpheme form.
gloss – The literal gloss.
sep – The morpheme separator preceding this morpheme.
prev – Points to the previous GlossedMorpheme in a word, or None.
next – Points to the next GlossedMorpheme in a word, or None.
- Parameters:
morpheme (pyigt.lgrmorphemes.Morpheme)
gloss (pyigt.lgrmorphemes.Morpheme)
sep (str)
prev (typing.Optional[pyigt.lgrmorphemes.GlossedMorpheme])
next (typing.Optional[pyigt.lgrmorphemes.GlossedMorpheme])
- property form: str
Removes sentence-level markup (e.g. punctuation) from .morpheme.
>>> from pyigt.lgrmorphemes import GlossedMorpheme
>>> gm = GlossedMorpheme(morpheme='"[ab.c', gloss="abc", sep='-')
>>> gm.form
'abc'
- property first: bool
Whether the morpheme is the first in the word.
- property last: bool
Whether the morpheme is the last in the word.
- property grammatical_concepts: list[str]
Grammatical concepts, referenced with category labels according to Rule 3, used in the morpheme gloss.
Note
Gloss element separators according to Rules 4B and 4C are interpreted as signaling a separate concept.
>>> from pyigt.lgrmorphemes import GlossedMorpheme
>>> gm = GlossedMorpheme(morpheme='abc', gloss='ABC.DEF:GHI;JKL', sep='.')
>>> gm.grammatical_concepts
['ABC.DEF', 'GHI', 'JKL']
- property lexical_concepts: list[str]
Gloss elements not recognized as category labels are interpreted as lexical concepts.
>>> from pyigt.lgrmorphemes import GlossedMorpheme
>>> gm = GlossedMorpheme(morpheme='çık', gloss='come_out', sep='.')
>>> gm.lexical_concepts
['come out']
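The distinction between grammatical and lexical concepts can be sketched as follows. This is an approximation for illustration only: the function `classify` is hypothetical, and the "all uppercase" test is a crude stand-in for however pyigt recognizes category labels. The sketch treats each chunk between `;` and `:` as a whole, so a mixed chunk containing both a lexical and a grammatical element is not split further.

```python
import re


def classify(gloss: str) -> tuple[list[str], list[str]]:
    """Sort the chunks of a morpheme gloss into (grammatical, lexical)."""
    grammatical, lexical = [], []
    # Separators of Rules 4B (";") and 4C (":") start a separate concept.
    for chunk in re.split(r"[;:]", gloss):
        if not chunk:
            continue
        # Assumption: a category label is written entirely in capitals.
        if chunk == chunk.upper() and any(ch.isalpha() for ch in chunk):
            grammatical.append(chunk)
        else:
            # Underscores join the words of a multi-word lexical gloss.
            lexical.append(chunk.replace("_", " "))
    return grammatical, lexical


print(classify("ABC.DEF:GHI;JKL"))  # (['ABC.DEF', 'GHI', 'JKL'], [])
print(classify("come_out"))         # ([], ['come out'])
```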