cqp:list-of-coprora
Differences
This shows you the differences between two versions of the page.
cqp:list-of-coprora [2024/10/29 17:10] – [PENN-HELSINKI PARSED CORPUS OF EARLY MODERN ENGLISH] aamoakuh | cqp:list-of-coprora [2025/02/05 16:51] (current) – [Corpus of English Dialogues 1560-1760] add different documentation link aamoakuh | ||
---|---|---|---|
Line 6: | Line 6: | ||
If you don't have access to CQP just yet, check out the [[inlet: | If you don't have access to CQP just yet, check out the [[inlet: | ||
- | For more information on the [[inlet: | + | For more information on the INLET system, visit [[inlet: |
For more detailed information on each of these corpora, select the corpus on CQP, type '' | For more detailed information on each of these corpora, select the corpus on CQP, type '' | ||
Line 21: | Line 21: | ||
**Text publication dates:** 1960-1993 (split up into 3 periods) | **Text publication dates:** 1960-1993 (split up into 3 periods) | ||
- | **Tagset:** CLAWS-5 | + | **Tagset: |
**Cite as:** | **Cite as:** | ||
Line 36: | Line 36: | ||
**Size:** 4,644,834 tokens | **Size:** 4,644,834 tokens | ||
- | **Tagset:** CLAWS-5 | + | **Tagset: |
**Corpus documentation: | **Corpus documentation: | ||
+ | |||
+ | ===== BNC2014-S ===== | ||
+ | |||
+ | ==== Spoken British National Corpus 2014 ==== | ||
+ | |||
+ | **Size:** 11,422,615 tokens | ||
+ | |||
+ | **Text publication dates**: 2012-2016 | ||
+ | |||
+ | **Tagset:** [[https:// | ||
+ | |||
+ | **Corpus documentation: | ||
+ | |||
+ | **Cite as**: Love, Robbie, Claire Dembry, Andrew Hardie, Vaclav Brezina & Tony McEnery. 2017. The Spoken BNC2014: Designing and building a spoken corpus of everyday conversations. International Journal of Corpus Linguistics 22(3). 319–344. https:// | ||
+ | |||
+ | |||
+ | ===== CLMET ===== | ||
+ | |||
+ | ==== CORPUS OF LATE MODERN ENGLISH TEXTS ==== | ||
+ | |||
+ | **Size**: 40,340,760 tokens | ||
+ | |||
+ | **Text publication dates**: 1710-1920 (split up into 3 periods) | ||
+ | |||
+ | **Tagset**: [[corpora: | ||
+ | |||
+ | **Corpus documentation**: | ||
+ | |||
+ | **Cite as**: De Smet, Hendrik, Susanne Flach, Jukka Tyrkkö & Hans-Jürgen Diller. 2015. The Corpus of Late Modern English (CLMET), version 3.1: Improved tokenization and linguistic annotation. KU Leuven, FU Berlin, U Tampere, RU Bochum. | ||
+ | |||
+ | |||
+ | ===== BROWN-LEGACY ===== | ||
+ | |||
+ | ==== The Standard Corpus of Present-Day Edited American English ==== | ||
+ | |||
+ | **Size**: 1,137,466 tokens (approx. 1m words) | ||
+ | |||
+ | **Text publication dates**: 1961 | ||
+ | |||
+ | **Corpus documentation**: | ||
+ | |||
+ | **Cite as**: A Standard Corpus of Present-Day Edited American English, for use with Digital Computers (Brown). 1964, 1971, 1979. Compiled by W. N. Francis and H. Kučera. Brown University. Providence, Rhode Island. | ||
+ | |||
+ | |||
+ | ===== FROWN-LEGACY ===== | ||
+ | |||
+ | ==== The Freiburg-Brown corpus of American English ==== | ||
+ | |||
+ | **Size**: 1,180,152 tokens (approx. 1m words) | ||
+ | |||
+ | **Text publication dates**: 1992 | ||
+ | |||
+ | **Corpus documentation**: | ||
+ | |||
+ | **Cite as**: The Freiburg-Brown Corpus (‘Frown’) (POS-tagged version) compiled by Christian Mair, Albert-Ludwigs-Universität Freiburg, and Geoffrey Leech, University of Lancaster | ||
+ | |||
+ | |||
+ | ===== LOB-LEGACY ===== | ||
+ | |||
+ | ==== The Lancaster-Oslo/ | ||
+ | |||
+ | **Size**: 1,157,496 tokens (approx. 1m words) | ||
+ | |||
+ | **Text publication dates**: 1961 | ||
+ | |||
+ | **Corpus documentation**: | ||
+ | |||
+ | **Cite as**: The LOB Corpus, POS-tagged version (1981–1986), | ||
+ | |||
+ | |||
+ | ===== FLOB-LEGACY ===== | ||
+ | |||
+ | ==== The Freiburg–LOB Corpus of British English ==== | ||
+ | |||
+ | **Size**: 1,165,747 tokens (approx. 1m words) | ||
+ | |||
+ | **Text publication dates**: 1991 | ||
+ | |||
+ | **Corpus documentation**: | ||
+ | |||
+ | **Cite as**: The Freiburg-LOB Corpus (‘F-LOB’) (POS-tagged version) compiled by Christian Mair, Albert-Ludwigs-Universität Freiburg, and Geoffrey Leech, University of Lancaster | ||
Line 55: | Line 136: | ||
**Cite as:** | **Cite as:** | ||
Granger, Sylviane, Estelle Dagneaux & Fanny Meunier. 2002. // | Granger, Sylviane, Estelle Dagneaux & Fanny Meunier. 2002. // | ||
+ | |||
===== COCA-S | ===== COCA-S | ||
Line 64: | Line 146: | ||
**Text publication dates:** 1990-2012 | **Text publication dates:** 1990-2012 | ||
- | **Tagset:** CLAWS-7 | + | **Tagset: |
**Corpus documentation: | **Corpus documentation: | ||
Line 78: | Line 160: | ||
**Size:** 471,427,380 tokens (400m words) | **Size:** 471,427,380 tokens (400m words) | ||
- | **Tagset:** CLAWS-7 | + | **Tagset: |
**Corpus documentation: | **Corpus documentation: | ||
Line 96: | Line 178: | ||
**Text publication dates:** 1150-1500 (split up into 9 periods) | **Text publication dates:** 1150-1500 (split up into 9 periods) | ||
- | **Tagset: | + | **Tagset: |
- | **Corpus documentation: | + | **Corpus documentation: |
**Cite as:** | **Cite as:** | ||
Line 112: | Line 194: | ||
**Text publication dates:** 1500-1710 | **Text publication dates:** 1500-1710 | ||
- | **Tagset: | + | **Tagset: |
- | **Corpus documentation: | + | **Corpus documentation: |
**Cite as:** | **Cite as:** | ||
Line 129: | Line 211: | ||
**Text publication dates:** 1700-1914 | **Text publication dates:** 1700-1914 | ||
- | **Tagset: | + | **Tagset: |
- | **Corpus documentation: | + | **Corpus documentation: |
**Cite as:** | **Cite as:** | ||
Line 147: | Line 229: | ||
**Text publication dates:** 1350-1710 (split up into 5 periods) | **Text publication dates:** 1350-1710 (split up into 5 periods) | ||
- | **Tagset: | + | **Tagset: |
**Corpus documentation: | **Corpus documentation: | ||
Line 153: | Line 235: | ||
**Cite as:** | **Cite as:** | ||
Parsed Corpus of Early English Correspondence, | Parsed Corpus of Early English Correspondence, | ||
+ | |||
+ | |||
+ | ===== CED ===== | ||
+ | |||
+ | ==== Corpus of English Dialogues 1560-1760 ==== | ||
+ | |||
+ | **Size:** 1,458,700 tokens | ||
+ | |||
+ | **Text publication dates:** 1560-1760 (split up into 5 periods) | ||
+ | |||
+ | **Tagset**: untagged | ||
+ | |||
+ | **Corpus documentation: | ||
+ | |||
+ | **Cite as:** A Corpus of English Dialogues 1560–1760. 2006. Compiled under the supervision of Merja Kytö (Uppsala University) and Jonathan Culpeper (Lancaster University). | ||
+ | |||
+ | |||
+ | ===== COOEE ===== | ||
+ | |||
+ | ==== Corpus of Oz Early English ==== | ||
+ | |||
+ | **Size:** 2,243,235 tokens | ||
+ | |||
+ | **Text publication dates:** 1788-1900 | ||
+ | |||
+ | **Tagset:** [[corpora: | ||
+ | |||
+ | **Corpus documentation: | ||
+ | |||
+ | **Cite as:** Fritz, Clemens W. A. 2012. From English in Australia to Australian English: 1788-1900. Frankfurt am Main: Peter Lang. | ||
+ | |||
**[ Introduction to CQP: [[cqp: | **[ Introduction to CQP: [[cqp: | ||
cqp/list-of-coprora.1730218226.txt.gz · Last modified: 2024/10/29 17:10 by aamoakuh