cqp:list-of-coprora

Differences

This shows you the differences between two versions of the page.

cqp:list-of-coprora [2024/10/29 17:23] – add more corpora aamoakuh
cqp:list-of-coprora [2025/02/05 16:51] (current) – [Corpus of English Dialogues 1560-1760] add different documentation link aamoakuh
Line 6: Line 6:
 If you don't have access to CQP just yet, check out the [[inlet:setup|INLET site]] to install the system on your account.  If you don't have access to CQP just yet, check out the [[inlet:setup|INLET site]] to install the system on your account. 
  
-For more information on the [[inlet:overview|INLET system]], visit this site.+For more information on the INLET system, visit [[inlet:overview|this site]].
  
 For more detailed information on each of these corpora, select the corpus on CQP, type ''info'' and press ''ENTER'' For more detailed information on each of these corpora, select the corpus on CQP, type ''info'' and press ''ENTER''
Line 21: Line 21:
 **Text publication dates:** 1960-1993 (split up into 3 periods) **Text publication dates:** 1960-1993 (split up into 3 periods)
  
-**Tagset:** CLAWS-5+**Tagset:** [[corpora:tagset-claws5|CLAWS-5]]
  
 **Cite as:**  **Cite as:** 
Line 36: Line 36:
 **Size:** 4,644,834 tokens **Size:** 4,644,834 tokens
  
-**Tagset:** CLAWS-5+**Tagset:** [[corpora:tagset-claws5|CLAWS-5]]
  
 **Corpus documentation:** http://www.natcorp.ox.ac.uk/corpus/baby/manual.pdf **Corpus documentation:** http://www.natcorp.ox.ac.uk/corpus/baby/manual.pdf
 +
 +
 +===== BNC2014-S =====
 +
 +==== Spoken British National Corpus 2014 ====
 +
 +**Size:** 11,422,615 tokens
 +
 +**Text publication dates**: 2012-2016
 +
 +**Tagset:** [[https://ucrel.lancs.ac.uk/claws6tags.html|CLAWS-6]]
 +
 +**Corpus documentation:** http://corpora.lancs.ac.uk/bnc2014/documentation.php
 +
 +**Cite as**:  Love, Robbie, Claire Dembry, Andrew Hardie, Vaclav Brezina & Tony McEnery. 2017. The Spoken BNC2014: Designing and building a spoken corpus of everyday conversations. International Journal of Corpus Linguistics 22(3). 319–344. https://doi.org/10.1075/ijcl.22.3.02lov.
  
  
Line 49: Line 64:
 **Text publication dates**: 1710-1920 (split up into 3 periods) **Text publication dates**: 1710-1920 (split up into 3 periods)
  
-**Tagset**: PENN Corpora+**Tagset**: [[corpora:tagset-penn|PENN Corpora]]
  
 **Corpus documentation**: https://perswww.kuleuven.be/~u0044428/clmet3_0.htm **Corpus documentation**: https://perswww.kuleuven.be/~u0044428/clmet3_0.htm
Line 131: Line 146:
 **Text publication dates:** 1990-2012 **Text publication dates:** 1990-2012
  
-**Tagset:** CLAWS-7 +**Tagset:** [[corpora:tagset-claws7-coxa|CLAWS-7]]
  
 **Corpus documentation:** http://corpus.byu.edu/coca **Corpus documentation:** http://corpus.byu.edu/coca
Line 145: Line 160:
 **Size:** 471,427,380 tokens (400m words) **Size:** 471,427,380 tokens (400m words)
  
-**Tagset:** CLAWS-7 +**Tagset:** [[corpora:tagset-claws7-coxa|CLAWS-7]]
  
 **Corpus documentation:** http://corpus.byu.edu/coha/ **Corpus documentation:** http://corpus.byu.edu/coha/
Line 163: Line 178:
 **Text publication dates:** 1150-1500 (split up into 9 periods)  **Text publication dates:** 1150-1500 (split up into 9 periods) 
  
-**Tagset:** Penn Corpora +**Tagset:** [[corpora:tagsets|PENN Corpora]]
  
-**Corpus documentation:** http://www.ling.upenn.edu/hist-corpora/PPCME2-RELEASE-3/+**Corpus documentation:** https://www.ling.upenn.edu/hist-corpora/PPCME2-RELEASE-4/index.html , https://github.com/beatrice57/ppche-2024/tree/main/PPCME2-RELEASE-5/docs
  
 **Cite as:** **Cite as:**
Line 179: Line 194:
 **Text publication dates:** 1500-1710  **Text publication dates:** 1500-1710 
  
-**Tagset:** Penn Corpora +**Tagset:** [[corpora:tagsets|PENN Corpora]] 
  
-**Corpus documentation:** http://www.ling.upenn.edu/hist-corpora/PPCEME-RELEASE-2/+**Corpus documentation:** https://github.com/beatrice57/ppche-2024/tree/main/PPCEME-RELEASE-4/docs , https://www.ling.upenn.edu/hist-corpora/PPCEME-RELEASE-3/index.html
  
 **Cite as:** **Cite as:**
Line 196: Line 211:
 **Text publication dates:** 1700-1914  **Text publication dates:** 1700-1914 
  
-**Tagset:** Penn Corpora+**Tagset:** [[corpora:tagsets|PENN Corpora]]
  
-**Corpus documentation:** http://www.ling.upenn.edu/hist-corpora/PPCMBE-RELEASE-1/+**Corpus documentation:** https://github.com/beatrice57/ppche-2024/tree/main/PPCMBE2-RELEASE-2/docs , https://www.ling.upenn.edu/hist-corpora/PPCMBE2-RELEASE-1/index.html
  
 **Cite as:** **Cite as:**
Line 214: Line 229:
 **Text publication dates:** 1350-1710 (split up into 5 periods)  **Text publication dates:** 1350-1710 (split up into 5 periods) 
  
-**Tagset:** Penn Corpora+**Tagset:** [[corpora:tagsets|PENN Corpora]]
  
 **Corpus documentation:** http://www-users.york.ac.uk/~lang22/PCEEC-manual/corpus_description/index.htm **Corpus documentation:** http://www-users.york.ac.uk/~lang22/PCEEC-manual/corpus_description/index.htm
Line 220: Line 235:
 **Cite as:** **Cite as:**
 Parsed Corpus of Early English Correspondence, tagged version. 2006. Annotated by Arja Nurmi, Ann Taylor, Anthony Warner, Susan Pintzuk, and Terttu Nevalainen. Compiled by the CEEC Project Team. York: University of York and Helsinki: University of Helsinki. Distributed through the Oxford Text Archive. Parsed Corpus of Early English Correspondence, tagged version. 2006. Annotated by Arja Nurmi, Ann Taylor, Anthony Warner, Susan Pintzuk, and Terttu Nevalainen. Compiled by the CEEC Project Team. York: University of York and Helsinki: University of Helsinki. Distributed through the Oxford Text Archive.
 +
 +
 +===== CED =====
 +
 +==== Corpus of English Dialogues 1560-1760 ====
 +
 +**Size:** 1,458,700 tokens
 +
 +**Text publication dates:** 1560-1760 (split up into 5 periods) 
 +
 +**Tagset**: untagged
 +
 +**Corpus documentation:** https://data.ldaca.edu.au/collection?id=arcp%3A%2F%2Fname%2Chdl10.26180~23961609&_crateId=arcp%3A%2F%2Fname%2Chdl10.26180~23961609
 +
 +**Cite as:** A Corpus of English Dialogues 1560–1760. 2006. Compiled under the supervision of Merja Kytö (Uppsala University) and Jonathan Culpeper (Lancaster University).
 +
 +
 +===== COOEE =====
 +
 +==== Corpus of Oz Early English ====
 +
 +**Size:** 2,243,235 tokens
 +
 +**Text publication dates:** 1788-1900
 +
 +**Tagset:** [[corpora:tagset-treetagger|TreeTagger]]
 +
 +**Corpus documentation:** https://varieng.helsinki.fi/CoRD/corpora/COOEE/index.html
 +
 +**Cite as:** Fritz, Clemens W. A. 2012. From English in Australia to Australian English: 1788-1900. Frankfurt am Main: Peter Lang.
 +
  
  
 **[ Introduction to CQP: [[cqp:corpus-structure|Section 1]] -- [[cqp:simple-queries|Section 2]] -- [[cqp:advanced-querying|Section 3]] -- [[cqp:beyond-queries|Section 4]] -- [[cqp:expert-tricks|Section 5]] -- [[cqp:exercises|Section 6]]  -- [[cqp:list-of-coprora|Section 7]] ]** **[ Introduction to CQP: [[cqp:corpus-structure|Section 1]] -- [[cqp:simple-queries|Section 2]] -- [[cqp:advanced-querying|Section 3]] -- [[cqp:beyond-queries|Section 4]] -- [[cqp:expert-tricks|Section 5]] -- [[cqp:exercises|Section 6]]  -- [[cqp:list-of-coprora|Section 7]] ]**
  
cqp/list-of-coprora.1730219031.txt.gz · Last modified: 2024/10/29 17:23 by aamoakuh
