<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="FeedCreator 1.8" -->
<?xml-stylesheet href="https://userpage.fu-berlin.de/~structeng/wiki/lib/exe/css.php?s=feed" type="text/css"?>
<rdf:RDF
    xmlns="http://purl.org/rss/1.0/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
    <channel rdf:about="https://userpage.fu-berlin.de/~structeng/wiki/feed.php">
        <title>Structure of English: Linguistic Resources - corpora</title>
        <description></description>
        <link>https://userpage.fu-berlin.de/~structeng/wiki/</link>
        <image rdf:resource="https://userpage.fu-berlin.de/~structeng/wiki/lib/exe/fetch.php?media=wiki:logo.png" />
        <dc:date>2026-05-06T06:58:58+00:00</dc:date>
        <items>
            <rdf:Seq>
                <rdf:li rdf:resource="https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:historical&amp;rev=1718884381&amp;do=diff"/>
                <rdf:li rdf:resource="https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-claws5&amp;rev=1747221739&amp;do=diff"/>
                <rdf:li rdf:resource="https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-claws7-coxa&amp;rev=1718884382&amp;do=diff"/>
                <rdf:li rdf:resource="https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-claws7&amp;rev=1718884382&amp;do=diff"/>
                <rdf:li rdf:resource="https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-penn-historical&amp;rev=1718884382&amp;do=diff"/>
                <rdf:li rdf:resource="https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-penn&amp;rev=1718884382&amp;do=diff"/>
                <rdf:li rdf:resource="https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-stts-original&amp;rev=1718884382&amp;do=diff"/>
                <rdf:li rdf:resource="https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-treetagger&amp;rev=1718884382&amp;do=diff"/>
                <rdf:li rdf:resource="https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagsets&amp;rev=1718884382&amp;do=diff"/>
            </rdf:Seq>
        </items>
    </channel>
    <image rdf:about="https://userpage.fu-berlin.de/~structeng/wiki/lib/exe/fetch.php?media=wiki:logo.png">
        <title>Structure of English: Linguistic Resources</title>
        <link>https://userpage.fu-berlin.de/~structeng/wiki/</link>
        <url>https://userpage.fu-berlin.de/~structeng/wiki/lib/exe/fetch.php?media=wiki:logo.png</url>
    </image>
    <item rdf:about="https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:historical&amp;rev=1718884381&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2024-06-20T11:53:01+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>historical</title>
        <link>https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:historical&amp;rev=1718884381&amp;do=diff</link>
        <description>Using Corpora in Historical Linguistics

Available corpora

The Penn Corpora

Resources

	*   (created by Alhadji Jallow, Jan Reimer and Georg Hartisch in 2017, used with permission)

About

The Penn corpora are

	*  The PPCME2 (Kroch, Anthony &amp; Ann Taylor. 2000.</description>
    </item>
    <item rdf:about="https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-claws5&amp;rev=1747221739&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2025-05-14T11:22:19+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>tagset-claws5</title>
        <link>https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-claws5&amp;rev=1747221739&amp;do=diff</link>
        <description>Tagsets: CLAWS 5

The CLAWS 5 tagset is known mainly for its use in the 1994 British National Corpus (BNC).
Tag  Description
AJ0  Adjective, general or positive (e.g. good, old, beautiful)
AJC  Comparative adjective (e.g. better, older)
AJS  Superlative adjective (e.g.</description>
    </item>
    <item rdf:about="https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-claws7-coxa&amp;rev=1718884382&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2024-06-20T11:53:02+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>tagset-claws7-coxa</title>
        <link>https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-claws7-coxa&amp;rev=1718884382&amp;do=diff</link>
        <description>Tagsets: CLAWS 7 (COCA/COHA)

The CLAWS 7 tagset as used in the Corpus of Historical American English (COHA) and the Corpus of Contemporary American English (COCA) differs from the standard CLAWS 7 tagset by two additions and several systematic errors. In addition, all tags except for one are in lowercase! For the standard CLAWS 7 tagset, see the page</description>
    </item>
    <item rdf:about="https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-claws7&amp;rev=1718884382&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2024-06-20T11:53:02+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>tagset-claws7</title>
        <link>https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-claws7&amp;rev=1718884382&amp;do=diff</link>
        <description>Tagsets: CLAWS 7

The CLAWS 7 tagset is used, for example, in the 2014 version of the British National Corpus (BNC 2014), and in the Corpus of Historical American English (COHA) and the Corpus of Contemporary American English (COCA). However, the latter two corpora contain some modifications to and errors in the tagset, so please consult the page</description>
    </item>
    <item rdf:about="https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-penn-historical&amp;rev=1718884382&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2024-06-20T11:53:02+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>tagset-penn-historical</title>
        <link>https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-penn-historical&amp;rev=1718884382&amp;do=diff</link>
        <description>Tagsets: Historical English Penn TreeBank tagset

The Historical English Penn TreeBank tagset is an adaptation of the Penn tagset for historical corpora of English.
Tag  Description
.  sentence-final punctuation
,  sentence-internal punctuation
&#039;</description>
    </item>
    <item rdf:about="https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-penn&amp;rev=1718884382&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2024-06-20T11:53:02+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>tagset-penn</title>
        <link>https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-penn&amp;rev=1718884382&amp;do=diff</link>
        <description>Tagsets: Penn

The Penn tagset is widely used, although it is often modified to some extent (for example, in the Tree Tagger tagset). There is a document that describes in detail how the tagset should be applied (Santorini 1991). There are two versions of the tagset</description>
    </item>
    <item rdf:about="https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-stts-original&amp;rev=1718884382&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2024-06-20T11:53:02+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>tagset-stts-original</title>
        <link>https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-stts-original&amp;rev=1718884382&amp;do=diff</link>
        <description>Tagsets: The Stuttgart-Tübingen-Tagset (STTS)

Most German corpora use the STTS, either in the basic version presented below, or with additions.
Tag  Description  Example
ADJA  attributive adjective  der rote Ball; das verschneite Dorf
ADJD  adverbial or predicative adjective</description>
    </item>
    <item rdf:about="https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-treetagger&amp;rev=1718884382&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2024-06-20T11:53:02+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>tagset-treetagger</title>
        <link>https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagset-treetagger&amp;rev=1718884382&amp;do=diff</link>
        <description>Tagsets: Tree Tagger

The Tree Tagger tagset is a modified version of the Penn tagset. It is the default tagset used by the English version of the Tree Tagger software. Many of the specialized English corpora created in our workgroup use this tagset or some variant of it.</description>
    </item>
    <item rdf:about="https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagsets&amp;rev=1718884382&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2024-06-20T11:53:02+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>tagsets</title>
        <link>https://userpage.fu-berlin.de/~structeng/wiki/doku.php?id=corpora:tagsets&amp;rev=1718884382&amp;do=diff</link>
        <description>Tagsets

Many corpora are annotated for word class: every word form in the corpus carries a “pos” tag that identifies its part of speech (see Corpus Structure for more information).

There is no generally agreed-upon set of word classes and no generally agreed-upon way of referring to word classes</description>
    </item>
</rdf:RDF>
