Table 2 Basic operations involving the lifting of data to the CLDF standards with the help of the PyLexibank package.

From: Lexibank, a public repository of standardized wordlists with computed phonological and lexical features

| Procedure | Reference Catalog | Software | Description |
| --- | --- | --- | --- |
| link languages | Glottolog | PyGlottolog | Link the language names to the identifiers provided by the Glottolog reference catalog. Currently, this is mostly done manually. |
| map concepts | Concepticon | PyConcepticon | Map elicitation glosses in the original wordlist data to the concept identifiers provided by the Concepticon reference catalog. Software for semi-automated concept mapping produces a first mapping, which is then refined manually. |
| unify transcriptions | CLTS | PyLexibank, LingPy, Segments, PyCLTS | Unify transcription systems by converting the transcriptions to the standards provided by the CLTS reference catalog. This is by far the most complex procedure: it involves cleaning the lexical forms with dedicated routines in the PyLexibank package, creating a draft profile with the help of the LingPy package, manually refining the profile and applying it with the help of the Segments package, and finally verifying the result with the help of the PyCLTS package. |
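
The profile-based part of the unify transcriptions procedure can be pictured with a minimal, library-free sketch. It is not the PyLexibank or Segments API: the `PROFILE` entries, the `clean` and `segment` helpers, and the example forms are hypothetical and chosen only for illustration. The sketch assumes that a profile maps source graphemes to target segments and that it is applied by greedy longest-match, with uncovered characters flagged for manual refinement.

```python
# Hypothetical profile: source grapheme -> target segment.
# Entries are illustrative only and not taken from any Lexibank dataset.
PROFILE = {
    "sch": "ʃ",
    "ch": "x",
    "aa": "aː",
    "a": "a",
    "f": "f",
    "l": "l",
    "n": "n",
    "t": "t",
}

UNKNOWN = "\ufffd"  # marks characters that the profile does not cover


def clean(form):
    """Toy cleaning step: keep the first comma-separated variant, trim, lowercase."""
    return form.split(",")[0].strip().lower()


def segment(form, profile):
    """Segment a form by greedy longest-match against the profile graphemes."""
    # Try longer graphemes first so that "sch" wins over "ch" and "aa" over "a".
    graphemes = sorted(profile, key=len, reverse=True)
    segments = []
    i = 0
    while i < len(form):
        for grapheme in graphemes:
            if form.startswith(grapheme, i):
                segments.append(profile[grapheme])
                i += len(grapheme)
                break
        else:
            # No grapheme matched: flag the character so the profile can be refined.
            segments.append(UNKNOWN)
            i += 1
    return segments


def uncovered(forms, profile):
    """Return the cleaned forms that contain characters missing from the profile."""
    cleaned = (clean(form) for form in forms)
    return [form for form in cleaned if UNKNOWN in segment(form, profile)]
```

Calling `uncovered` on a list of raw forms returns the cleaned forms that still need profile entries, which mirrors the manual refinement loop described above. In the actual workflow, the resulting segments would additionally be checked against the CLTS catalog, a step this sketch leaves out.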