The INLET corpus system

INLET is our environment for empirical linguistic research. The name is a play on the name of the British linguist John Rupert Firth (a firth is a narrow inlet of the sea), who first discussed the relevance of co-occurrence frequencies of words and can thus be regarded as an intellectual grandparent of corpus linguistics. It could also be read as an abbreviation for INtroducing Linguistics Empirically and Theoretically or INtegrated Linguistics Environment for Teaching.

Currently, INLET is focused exclusively on corpus linguistics. At its heart is our collection of corpora and the software used to access them, the Corpus Workbench (CWB) with the Corpus Query Processor (CQP). We have added some software of our own to make working with the output of CQP more comfortable and to provide some additional functions such as producing collocate lists and tables.
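To give a rough idea of what a collocate list contains, here is a minimal Python sketch. It is not the INLET software and does not use CQP. The function name `collocates`, the `window` parameter, and the sample sentence are assumptions made purely for illustration. The sketch counts which words occur within a fixed number of tokens to the left and right of a node word.

```python
from collections import Counter

def collocates(tokens, node, window=2):
    """Count words found within `window` tokens of each occurrence of `node`.

    Hypothetical helper for illustration only; not part of INLET or CWB.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        # Clamp the window to the boundaries of the token list.
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # skip the node word itself
                counts[tokens[j]] += 1
    return counts

# Toy "corpus": a single whitespace-tokenized sentence.
tokens = "you shall know a word by the company it keeps".split()
table = collocates(tokens, "word", window=2)

# Print a simple tab-separated collocate table.
for word, freq in table.most_common():
    print(f"{word}\t{freq}")
```

A real collocate list would be computed over a full corpus and would usually rank collocates by an association measure rather than by raw frequency. The sketch only shows the underlying counting step.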

INLET also provides access to the TreeTagger, a program that automatically annotates text in a range of languages with part-of-speech tags and lemmas.

The TreeTagger and the Corpus Workbench, complemented by some tools that we have created, can be used to build your own corpora in a form that you can then query with CQP.

To use INLET, you first need to set up your ZEDAT account for it. To create and use your own corpora, additional steps are necessary to get access to the TreeTagger and the additional programs we provide.