Tutorial

The pyigt package provides an API to access interlinear glossed text from Python code.

Glossed phrases

In the simplest case, interlinear glossed text is provided as phrase-chunked pairs of object-language and gloss lines. Each such pair is represented by an instance of pyigt.IGT:

>>> from pyigt import IGT
>>> igt = IGT(phrase="ni-c-chihui-lia in no-piltzin ce calli", gloss="1SG.SUBJ-3SG.OBJ-mach-APPL DET 1SG.POSS-Sohn ein Haus")
>>> print(igt)
nicchihuilia in nopiltzin ce calli
ni-c-chihui-lia             in    no-piltzin     ce    calli
1SG.SUBJ-3SG.OBJ-mach-APPL  DET   1SG.POSS-Sohn  ein   Haus

Such a chunk consists of aligned, glossed words (conventionally separated by whitespace):

>>> for word in igt:
...     print(word)
...
<GlossedWord word=ni-c-chihui-lia gloss=1SG.SUBJ-3SG.OBJ-mach-APPL>
<GlossedWord word=in gloss=DET>
<GlossedWord word=no-piltzin gloss=1SG.POSS-Sohn>
<GlossedWord word=ce gloss=ein>
<GlossedWord word=calli gloss=Haus>
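
To make the whitespace convention concrete, the word-level alignment can be sketched in plain Python without pyigt. This is only an illustration of the idea, not pyigt's implementation; the helper name align_words is hypothetical.

```python
phrase = "ni-c-chihui-lia in no-piltzin ce calli"
gloss = "1SG.SUBJ-3SG.OBJ-mach-APPL DET 1SG.POSS-Sohn ein Haus"


def align_words(phrase, gloss):
    """Pair each object-language word with its gloss (hypothetical helper).

    Both lines are split on whitespace; the n-th word is aligned with
    the n-th gloss, so both lines must contain the same number of items.
    """
    words, glosses = phrase.split(), gloss.split()
    if len(words) != len(glosses):
        raise ValueError("phrase and gloss must have the same number of words")
    return list(zip(words, glosses))


pairs = align_words(phrase, gloss)
for word, word_gloss in pairs:
    # Left-pad the word column so the glosses line up.
    print(f"{word:<18}{word_gloss}")
```

A mismatch in the number of words and glosses raises a ValueError in this sketch, since a misaligned pair cannot be interpreted reliably.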

Zooming in: Morphemes and gloss elements

The words (and their glosses) are segmented into glossed morphemes (GlossedMorpheme), which can be addressed by a (word index, morpheme index) tuple:

>>> igt[0, 0:]
[<GlossedMorpheme morpheme=ni gloss=1SG.SUBJ>, <GlossedMorpheme morpheme=c gloss=3SG.OBJ>, <GlossedMorpheme morpheme=chihui gloss=mach>, <GlossedMorpheme morpheme=lia gloss=APPL>]
>>> igt[0, 0].grammatical_concepts
['1SG.SUBJ']
>>> igt[2, 1].lexical_concepts
['Sohn']
>>> igt[0, 0].gloss.elements
[<GlossElement "1SG">, <GlossElement "SUBJ">]
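
The segmentation shown above can be approximated in plain Python. The sketch below assumes that morphemes are separated by "-", that gloss elements within one morpheme gloss are separated by ".", and that grammatical category labels are written in upper case while lexical glosses are not. The helper names are hypothetical, and pyigt's own rules may be more refined.

```python
def split_morphemes(word, word_gloss):
    """Align the morphemes of a word with their glosses (hypothetical helper)."""
    morphemes, glosses = word.split("-"), word_gloss.split("-")
    if len(morphemes) != len(glosses):
        raise ValueError("word and gloss must have the same number of morphemes")
    return list(zip(morphemes, glosses))


def gloss_elements(morpheme_gloss):
    """Split a morpheme gloss such as '1SG.SUBJ' into its elements."""
    return morpheme_gloss.split(".")


def is_grammatical(morpheme_gloss):
    """Assumption: a gloss whose letters are all upper case is a category label."""
    return morpheme_gloss.isupper()


morphemes = split_morphemes("ni-c-chihui-lia", "1SG.SUBJ-3SG.OBJ-mach-APPL")
for morpheme, morpheme_gloss in morphemes:
    kind = "grammatical" if is_grammatical(morpheme_gloss) else "lexical"
    print(morpheme, gloss_elements(morpheme_gloss), kind)
```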

Zooming out: Collections of IGT - a corpus

Collections of IGTs form a pyigt.Corpus:

>>> from pyigt import Corpus
>>> c = Corpus([igt, igt])
>>> c[0, 0, 0]
<GlossedMorpheme morpheme=ni gloss=1SG.SUBJ>
>>> [c[ref] for ref in c.grammar['APPL']]
[<GlossedMorpheme morpheme=lia gloss=APPL>, <GlossedMorpheme morpheme=lia gloss=APPL>]
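
The lookup by gloss can be pictured as an inverted index that maps each morpheme gloss to the positions where it occurs. The following is a minimal sketch in plain Python under that assumption; build_index and the (igt, word, morpheme) tuple layout are hypothetical and only mirror the three-part indexing seen in c[0, 0, 0], not pyigt's internal data structures.

```python
from collections import defaultdict

phrase = "ni-c-chihui-lia in no-piltzin ce calli"
gloss = "1SG.SUBJ-3SG.OBJ-mach-APPL DET 1SG.POSS-Sohn ein Haus"


def build_index(corpus):
    """Map each morpheme gloss to (igt, word, morpheme) positions.

    `corpus` is a list of (phrase, gloss) string pairs; this is a
    hypothetical stand-in for a collection of IGT objects.
    """
    index = defaultdict(list)
    for i, (phrase_line, gloss_line) in enumerate(corpus):
        word_pairs = zip(phrase_line.split(), gloss_line.split())
        for j, (word, word_gloss) in enumerate(word_pairs):
            morpheme_pairs = zip(word.split("-"), word_gloss.split("-"))
            for k, (_morpheme, morpheme_gloss) in enumerate(morpheme_pairs):
                index[morpheme_gloss].append((i, j, k))
    return index


corpus = [(phrase, gloss), (phrase, gloss)]
index = build_index(corpus)
print(index["APPL"])  # [(0, 0, 3), (1, 0, 3)]
```

Each stored tuple plays the role of a reference: the first number selects the example, the second the word, and the third the morpheme within that word.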