====== Tagsets: Penn ======

The Penn tagset is widely used, although it is often modified to some extent (for example, in the [[corpora:tagset-treetagger|Tree Tagger]] tagset). There is a document that describes in detail how the tagset should be applied (Santorini 1991).

There are two versions of the tagset: the original, and a version for the Penn Treebank (a corpus that includes information about grammatical structure). In the latter version, the tags NP (proper name) and PP (personal pronoun), along with the related tags NPS and PP$, were modified to make them different from the grammatical labels NP (noun phrase) and PP (prepositional phrase).

In the table below, an empty Treebank cell means that the tag is the same in both versions.

^ Original ^ Treebank ^ Description ^
| CC | | Coordinating conjunction |
| CD | | Cardinal number |
| DT | | Determiner |
| EX | | Existential there |
| FW | | Foreign word |
| IN | | Preposition or subordinating conjunction |
| JJ | | Adjective |
| JJR | | Adjective, comparative |
| JJS | | Adjective, superlative |
| LS | | List item marker |
| MD | | Modal |
| NN | | Noun, singular or mass |
| NNS | | Noun, plural |
| NP | NNP | Proper noun, singular |
| NPS | NNPS | Proper noun, plural |
| PDT | | Predeterminer |
| POS | | Possessive ending |
| PP | PRP | Personal pronoun |
| PP$ | PRP$ | Possessive pronoun |
| RB | | Adverb |
| RBR | | Adverb, comparative |
| RBS | | Adverb, superlative |
| RP | | Particle |
| SYM | | Symbol |
| TO | | to |
| UH | | Interjection |
| VB | | Verb, base form |
| VBD | | Verb, past tense |
| VBG | | Verb, gerund or present participle |
| VBN | | Verb, past participle |
| VBP | | Verb, non-3rd person singular present |
| VBZ | | Verb, 3rd person singular present |
| WDT | | Wh-determiner |
| WP | | Wh-pronoun |
| WP$ | | Possessive wh-pronoun |
| WRB | | Wh-adverb |

=== Reference ===

Beatrice Santorini. 1991. [[https://repository.upenn.edu/cis_reports/570/|Part-of-speech tagging guidelines for the Penn Treebank Project (3rd Revision)]]. Technical Report No. MS-CIS-90-47. University of Pennsylvania Department of Computer and Information Science.
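
=== Example: converting between the two versions ===

Because only four tags differ between the original tagset and the Treebank version, converting tagged text from one to the other can be done with a small lookup table. The following is a minimal sketch in Python, not part of any particular tagger: the names ''ORIGINAL_TO_TREEBANK'', ''to_treebank'' and ''to_original'' are made up for illustration, the sample sentence is invented, and the mapping is taken from the table on this page.

```python
# Tags that differ between the original Penn tagset and the Treebank version.
# All other tags are identical in both versions.
ORIGINAL_TO_TREEBANK = {
    "NP": "NNP",    # proper noun, singular
    "NPS": "NNPS",  # proper noun, plural
    "PP": "PRP",    # personal pronoun
    "PP$": "PRP$",  # possessive pronoun
}

# Reverse lookup, built from the table above.
TREEBANK_TO_ORIGINAL = {new: old for old, new in ORIGINAL_TO_TREEBANK.items()}


def to_treebank(tag):
    """Return the Treebank form of an original-tagset tag (unchanged if shared)."""
    return ORIGINAL_TO_TREEBANK.get(tag, tag)


def to_original(tag):
    """Return the original form of a Treebank tag (unchanged if shared)."""
    return TREEBANK_TO_ORIGINAL.get(tag, tag)


# Hypothetical tagged sentence using the original tagset.
tagged = [("Mary", "NP"), ("saw", "VBD"), ("her", "PP$"), ("dogs", "NNS")]
converted = [(word, to_treebank(tag)) for word, tag in tagged]
print(converted)
```

This sketch assumes the input contains part-of-speech tags only. In a parsed corpus, NP and PP may also appear as phrase labels, which is exactly the ambiguity the Treebank version was introduced to avoid, so phrase labels should not be passed through ''to_treebank''.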