Collections of IGT
- class pyigt.Corpus(igts, fname=None, clean_lexical_concept=None)[source]
A Corpus is an immutable, ordered list of IGT instances.
It provides access to concordance-like aggregated statistics of its texts.
- Variables:
monolingual – Flag signaling whether the corpus is monolingual or contains IGT from different object languages.
- Parameters:
igts (typing.Iterable[pyigt.igt.IGT])
- property grammar: Dict[str, List[Tuple[int, int, int]]]
Maps grammatical concepts to lists of occurrences.
>>> from pyigt import Corpus, IGT
>>> igt = IGT(phrase="ni-c-chihui-lia in no-piltzin ce calli",
...           gloss="1SG.SUBJ-3SG.OBJ-mach-APPL DET 1SG.POSS-Sohn ein Haus")
>>> c = Corpus([igt])
>>> [[c[ref] for ref in c.grammar[k]] for k in c.grammar if k.startswith('1SG')]
[[<GlossedMorpheme morpheme=ni gloss=1SG.SUBJ>], [<GlossedMorpheme morpheme=no gloss=1SG.POSS>]]
- property lexicon: Dict[str, List[Tuple[int, int, int]]]
Maps lexical concepts to lists of occurrences.
>>> from pyigt import Corpus, IGT
>>> igt = IGT(phrase="ni-c-chihui-lia in no-piltzin ce calli",
...           gloss="1SG.SUBJ-3SG.OBJ-mach-APPL DET 1SG.POSS-Sohn ein Haus")
>>> c = Corpus([igt])
>>> [c[ref] for ref in c.lexicon['Sohn']]
[<GlossedMorpheme morpheme=piltzin gloss=Sohn>]
- property form: Dict[str, List[Tuple[int, int, int]]]
Maps morpheme forms to lists of occurrences.
>>> from pyigt import Corpus, IGT
>>> igt = IGT(phrase="ni-c-chihui-lia in no-piltzin ce calli",
...           gloss="1SG.SUBJ-3SG.OBJ-mach-APPL DET 1SG.POSS-Sohn ein Haus")
>>> c = Corpus([igt])
>>> [k for k in c.form]
['ni', 'c', 'chihui', 'lia', 'in', 'no', 'piltzin', 'ce', 'calli']
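The three properties above share the type Dict[str, List[Tuple[int, int, int]]]. The following is a minimal, standard-library sketch of how such indexes could be built from aligned phrase and gloss strings. It is not pyigt's implementation: the helper name build_indexes is hypothetical, the triple layout (IGT index, word index, morpheme index) is an assumption, and treating every all-caps gloss as grammatical is a simplification.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed layout of a reference: (IGT index, word index, morpheme index).
Ref = Tuple[int, int, int]
Index = Dict[str, List[Ref]]


def build_indexes(examples: Iterable[Tuple[str, str]]) -> Tuple[Index, Index, Index]:
    """Build form, lexicon and grammar indexes from (phrase, gloss) pairs.

    Hypothetical helper for illustration only; not part of pyigt.
    """
    form: Index = defaultdict(list)
    lexicon: Index = defaultdict(list)
    grammar: Index = defaultdict(list)
    for i, (phrase, gloss) in enumerate(examples):
        words, word_glosses = phrase.split(), gloss.split()
        if len(words) != len(word_glosses):
            raise ValueError(f"example {i}: phrase and gloss differ in word count")
        for j, (word, word_gloss) in enumerate(zip(words, word_glosses)):
            morphemes, glosses = word.split("-"), word_gloss.split("-")
            if len(morphemes) != len(glosses):
                raise ValueError(f"example {i}, word {j}: morpheme count mismatch")
            for k, (morpheme, g) in enumerate(zip(morphemes, glosses)):
                ref = (i, j, k)
                form[morpheme].append(ref)
                # Simplifying assumption: a gloss without lower-case letters
                # (e.g. 1SG.SUBJ, DET) is grammatical, anything else is lexical.
                if g == g.upper():
                    grammar[g].append(ref)
                else:
                    lexicon[g].append(ref)
    return dict(form), dict(lexicon), dict(grammar)


form, lexicon, grammar = build_indexes([
    ("ni-c-chihui-lia in no-piltzin ce calli",
     "1SG.SUBJ-3SG.OBJ-mach-APPL DET 1SG.POSS-Sohn ein Haus"),
])
```

Under these assumptions, lexicon['Sohn'] holds a single reference to the second morpheme of the third word, and the keys of form appear in the order in which the morphemes occur in the phrase.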
- classmethod from_cldf(cldf)[source]
Instantiate a corpus of IGT examples from a CLDF dataset.
- Parameters:
cldf (pycldf.dataset.Dataset) – a pycldf.Dataset instance.
spec – a CorpusSpec instance, specifying how to interpret markup in the corpus.
- Return type:
- classmethod from_path(path)[source]
Instantiate a corpus from a file path.
- Parameters:
path (typing.Union[str, pathlib.Path]) – Either a path to a CLDF dataset’s metadata file or to a CLDF Examples component as a CSV file. Note that in the latter case, the file must use the default column names, as defined in the CLDF ontology.
- Return type:
- write_concordance(ctype, filename=None)[source]
- Parameters:
ctype (str) – the concordance type: lexicon, grammar or form.
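As a rough illustration of how a ctype argument can select one of the three indexes, here is a standard-library sketch. It does not reproduce pyigt's output: the function name write_concordance_sketch, the two-column tab-separated layout, the frequency ordering, and the choice to print to standard output when filename is None are all assumptions made for this example.

```python
import csv
import io
import sys

VALID_CTYPES = ("lexicon", "grammar", "form")


def write_concordance_sketch(indexes, ctype, filename=None):
    """Write one row per key of the selected index: the key and its frequency.

    Hypothetical helper for illustration only; not part of pyigt.
    `indexes` maps each ctype to a dict of key -> list of reference triples.
    """
    if ctype not in VALID_CTYPES:
        raise ValueError(f"ctype must be one of {VALID_CTYPES}, got {ctype!r}")
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow([ctype.upper(), "FREQUENCY"])
    # Most frequent keys first; ties are broken alphabetically.
    rows = sorted(indexes[ctype].items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for key, refs in rows:
        writer.writerow([key, len(refs)])
    text = buffer.getvalue()
    if filename is None:
        # Assumed behaviour: without a filename, print to standard output.
        sys.stdout.write(text)
    else:
        with open(filename, "w", encoding="utf-8", newline="") as handle:
            handle.write(text)
    return text


example_indexes = {
    "lexicon": {"Sohn": [(0, 2, 1)], "Haus": [(0, 4, 0), (1, 0, 0)]},
    "grammar": {"DET": [(0, 1, 0)]},
    "form": {"in": [(0, 1, 0)]},
}
report = write_concordance_sketch(example_indexes, "lexicon")
```

Rejecting unknown ctype values up front keeps a typo such as "lexikon" from silently producing an empty report.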