
Tagsets: CLAWS 7

The CLAWS 7 tagset is used, for example, in the 2014 version of the British National Corpus (BNC 2014), and in the Corpus of Historical American English (COHA) and the Corpus of Contemporary American English (COCA). However, the latter two corpora use a version of the tagset that contains some modifications and errors, so for those corpora please consult the page Tagsets: CLAWS 7 (COCA/COHA version).

Tag Description
APPGE possessive pronoun, pre-nominal (my, your, his, her, its, our, their)
AT article (e.g. the, no)
AT1 singular article (e.g. a, an, every)
BCL before-clause marker (e.g. in order (that), in order (to)); see the section on "ditto tags" at the end of the list
CC coordinating conjunction (e.g. and, or)
CCB adversative coordinating conjunction (but)
CS subordinating conjunction (e.g. if, because, unless, so, for)
CSA as (when used as a conjunction)
CSN than (when used as a conjunction)
CST that (when used as a conjunction)
CSW whether (when used as a conjunction)
DA after-determiner or post-determiner capable of pronominal function (e.g. such, former, same)
DA1 singular after-determiner (e.g. little, much)
DA2 plural after-determiner (e.g. few, several, many)
DAR comparative after-determiner (e.g. more, less, fewer)
DAT superlative after-determiner (e.g. most, least, fewest)
DB before-determiner or pre-determiner capable of pronominal function (all, half)
DB2 plural before-determiner (both)
DD determiner (capable of pronominal function) (e.g. any, some)
DD1 singular determiner (e.g. this, that, another)
DD2 plural determiner (these, those)
DDQ wh-determiner (which, what)
DDQGE wh-determiner, genitive (whose)
DDQV wh-ever determiner (whichever, whatever)
EX existential there
FO formula
FU unclassified word
FW foreign word (e.g. de, la, aqua, chakra)
GE Germanic genitive marker (' or 's)
IF for (when used as a preposition)
II general preposition (all prepositions except for, of, with, without)
IO of (when used as a preposition)
IW with, without (when used as prepositions)
JJ general adjective (e.g. good, nice, lovely, different)
JJR general comparative adjective (e.g. better, nicer)
JJT general superlative adjective (e.g. best, nicest)
JK catenative adjective (able in be able to, willing in be willing to)
MC cardinal number, neutral for number (two, three, sixteen, …)
MC1 singular cardinal number (one)
MC2 plural cardinal number (e.g. sixes, sevens, twenties)
MCGE genitive cardinal number, neutral for number (two's, 100's)
MCMC hyphenated number (5-10, 1914-1918)
MD ordinal number (e.g. first, second, next, last)
MF fraction, neutral for number (e.g. quarters, two-thirds)
ND1 singular noun of direction (e.g. north, southwest)
NN common noun, neutral for number (e.g. people, staff, tuna, aircraft, series, ethics)
NN1 singular common noun (e.g. horse, girl, love, democracy)
NN2 plural common noun (e.g. horses, girls, democracies)
NNA following noun of title (e.g. M.A.)
NNB preceding noun of title (e.g. Mrs., Prof.)
NNL1 singular locative noun (e.g. Lake, Street, Hill)
NNL2 plural locative noun (e.g. Lakes, Streets, Hills)
NNO numeral noun, neutral for number (e.g. dozen, hundred)
NNO2 numeral noun, plural (e.g. hundreds, thousands)
NNT1 temporal noun, singular (e.g. day, week, year)
NNT2 temporal noun, plural (e.g. days, weeks, years)
NNU unit of measurement, neutral for number (e.g. mm, sec)
NNU1 singular unit of measurement (e.g. millimetre, second)
NNU2 plural unit of measurement (e.g. ins., feet)
NP proper noun, neutral for number (e.g. Philippines, Mercedes)
NP1 singular proper noun (e.g. Europe, BBC, Sarah)
NP2 plural proper noun (e.g. Himalayas, Beatles, Tudors)
NPD1 singular weekday noun (e.g. Friday)
NPD2 plural weekday noun (e.g. Fridays)
NPM1 singular month noun (e.g. September)
NPM2 plural month noun (e.g. Septembers)
PN indefinite pronoun, neutral for number (none)
PN1 indefinite pronoun, singular (e.g. anyone, everything, nobody, one)
PNQO objective wh-pronoun (whom)
PNQS subjective wh-pronoun (who)
PNQV wh-ever pronoun (whoever)
PNX1 reflexive indefinite pronoun (oneself)
PPGE nominal possessive personal pronoun (e.g. mine, yours)
PPH1 3rd person sing. neuter personal pronoun (it)
PPHO1 3rd person sing. objective personal pronoun (him, her)
PPHO2 3rd person plural objective personal pronoun (them)
PPHS1 3rd person sing. subjective personal pronoun (he, she)
PPHS2 3rd person plural subjective personal pronoun (they)
PPIO1 1st person sing. objective personal pronoun (me)
PPIO2 1st person plural objective personal pronoun (us)
PPIS1 1st person sing. subjective personal pronoun (I)
PPIS2 1st person plural subjective personal pronoun (we)
PPX1 singular reflexive personal pronoun (e.g. yourself, itself)
PPX2 plural reflexive personal pronoun (e.g. yourselves, themselves)
PPY 2nd person personal pronoun (you)
RA adverb, after nominal head (e.g. ago, am, pm)
REX adverb introducing appositional constructions (namely, i.e.)
RG degree adverb (very, so, too)
RGQ wh- degree adverb (how)
RGQV wh-ever degree adverb (however)
RGR comparative degree adverb (more, less)
RGT superlative degree adverb (most, least)
RL locative adverb (e.g. somewhere, forward, upstairs)
RP prep. adverb, particle (e.g. up, out, back)
RPK prep. adv., catenative (e.g. about in be about to)
RR general adverb (e.g. just, actually, always)
RRQ wh- general adverb (where, when, why, how)
RRQV wh-ever general adverb (wherever, whenever)
RRR comparative general adverb (e.g. more, better, earlier)
RRT superlative general adverb (e.g. most, best, earliest)
RT quasi-nominal adverb of time (e.g. now, tomorrow)
TO infinitive marker (to)
UH interjection (e.g. oh, yes, um)
VB0 be, base form (finite, i.e. imperative, subjunctive)
VBDR were
VBDZ was
VBG being
VBI be, infinitive (e.g. in I'll be wrapped around your finger, to be honest)
VBM am
VBN been
VBR are
VBZ is
VD0 do, base form (finite)
VDD did
VDG doing
VDI do, infinitive (e.g. in I could do…, To do…)
VDN done
VDZ does
VH0 have, base form (finite)
VHD had (past tense)
VHG having
VHI have, infinitive
VHN had (past participle)
VHZ has
VM modal auxiliary (can, will, would, etc.)
VMK modal catenative (ought, used)
VV0 base form of lexical verb (e.g. say, love)
VVD past tense of lexical verb (e.g. said, loved)
VVG -ing participle of lexical verb (e.g. saying, loving)
VVGK -ing participle catenative (going in be going to)
VVI infinitive (e.g. to say…, I will always love you…)
VVN past participle of lexical verb (e.g. given, worked)
VVNK past participle catenative (e.g. bound in be bound to)
VVZ -s form of lexical verb (e.g. says, loves)
XX not, n't
ZZ1 singular letter of the alphabet (e.g. A, b)
ZZ2 plural letter of the alphabet (e.g. A's, b's)
! punctuation tag - exclamation mark
" punctuation tag - quotation marks
( punctuation tag - left bracket
) punctuation tag - right bracket
, punctuation tag - comma
- punctuation tag - dash
. punctuation tag - full-stop
... punctuation tag - ellipsis
: punctuation tag - colon
; punctuation tag - semi-colon
? punctuation tag - question-mark

Ditto Tags

The CLAWS 7 tagset uses so-called “ditto” tags for certain sequences of tokens that are analyzed as belonging to a single lexical unit. For example, the sequence "in terms of" is analyzed as a single preposition (in CLAWS 5, by comparison, it is analyzed as a sequence of a preposition, a noun and another preposition).

In such cases, all words are given the same tag (in the case of "in terms of", the tag II for general preposition) followed by two digits: the first specifies the length of the sequence, the second specifies the position of the element in the sequence. For example:

in_II31 terms_II32 of_II33
at_RR21 length_RR22
a_DD21 lot_DD22

This is unfortunate, as it forces analytical decisions on us that are not at all uncontroversial, but we have to live with it!
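
If you want to treat ditto-tagged sequences as single units in your own scripts, the two digits can be split off and the words rejoined. The following is a minimal Python sketch, not part of CLAWS itself. It assumes that tokens come in word_TAG form separated by whitespace, that ordinary tags end in at most one digit (as in the list above), and that sequences are at most nine words long. The names split_ditto and merge_ditto are made up for this illustration.

```python
import re
from typing import List, Optional, Tuple

# Assumption: a ditto tag is a regular tag (letters plus at most one digit,
# e.g. II or NN1) followed by exactly two digits: sequence length, position.
DITTO_RE = re.compile(r"^(?P<base>[A-Z]+\d?)(?P<length>[2-9])(?P<pos>[1-9])$")


def split_ditto(tag: str) -> Optional[Tuple[str, int, int]]:
    """Return (base tag, length, position) for a ditto tag, else None."""
    m = DITTO_RE.match(tag)
    if m is None:
        return None
    length, pos = int(m.group("length")), int(m.group("pos"))
    if pos > length:  # e.g. "23" cannot be position 3 of a 2-word sequence
        return None
    return m.group("base"), length, pos


def merge_ditto(tagged: str) -> List[Tuple[str, str]]:
    """Collapse ditto-tagged sequences into single (words, base tag) units."""
    units: List[Tuple[str, str]] = []
    pending: List[str] = []  # words of the ditto sequence currently being read
    for token in tagged.split():
        # Split on the last underscore, so words containing "_" survive.
        word, _, tag = token.rpartition("_")
        ditto = split_ditto(tag)
        if ditto is None:
            units.append((word, tag))
            continue
        base, length, pos = ditto
        pending.append(word)
        if pos == length:  # last element: emit the whole sequence as one unit
            units.append((" ".join(pending), base))
            pending = []
    return units


print(merge_ditto("in_II31 terms_II32 of_II33 size_NN1"))
# [('in terms of', 'II'), ('size', 'NN1')]
```

Note that the sketch simply trusts the position digits. It does not check that the elements of a sequence are adjacent or complete, so malformed input would need extra handling.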

corpora/tagset-claws7.txt · Last modified: 2024/01/16 21:18 by astefanowitsch