Formulating Complex Queries – Solutions
Exercise 1
[class="ADJ"][hw="snow|rain" & class="SUBST"]
(82 matches)
[pos="AJ."][hw="snow|rain" & pos="NN."]
(63 matches)
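Both queries rely on regular expressions over attribute values, and CQP compares the expression against the whole value rather than a substring. The following is a minimal Python sketch of that behaviour, using `re.fullmatch` as a stand-in for CQP's matching. The helper name `cqp_like_match` and the sample tag and lemma lists are made up for illustration; the tags are a small hand-picked sample of CLAWS-5 tags, not the full tagset.

```python
import re

def cqp_like_match(pattern, value):
    """Approximate CQP attribute matching: the regex must cover the whole value."""
    return re.fullmatch(pattern, value) is not None

# Hand-picked sample of CLAWS-5 tags (illustrative, not exhaustive).
sample_tags = ["AJ0", "AJC", "AJS", "NN0", "NN1", "NN2", "NP0"]
# Hypothetical lemma values, chosen to show that substrings do not match.
sample_lemmas = ["snow", "rain", "rainbow", "snowy"]

adjective_tags = [t for t in sample_tags if cqp_like_match("AJ.", t)]
noun_tags = [t for t in sample_tags if cqp_like_match("NN.", t)]
lemma_hits = [w for w in sample_lemmas if cqp_like_match("snow|rain", w)]

print(adjective_tags)  # ['AJ0', 'AJC', 'AJS']
print(noun_tags)       # ['NN0', 'NN1', 'NN2']
print(lemma_hits)      # ['snow', 'rain']
```

Note that `NN.` does not cover the proper-noun tag `NP0`, and `snow|rain` does not cover longer lemmas such as "rainbow".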
Exercise 2
[word="going" %c][hw="to"][class="VERB"]
(1583 matches)
Exercise 3
[pos="XX0"][word="about"][word="to"]
(6 matches)
Exercise 4
The exact name of the value of the structural attribute is W:ac:medicine.
This query should return 61 matches in the BNC-BABY:
[hw="heart"] :: match.text_genre="W:ac:medicine"
Which common constructions can be seen in the matches?
From a cursory look at the concordance, one can see that “heart failure” is common in medical texts, whereas constructions such as “her heart sank”, “her heart leaped”, “her heart dropped” seem more common in prose.
Exercise 5
There are two ways to match past participles via pos-tags in the BNC-BABY, because the CLAWS-5 tagset distinguishes the past participles of the verbs to have (VHN), to be (VBN) and to do (VDN) from the past participle of “lexical verbs” (VVN), i.e. all other verbs.
Depending on whether or not you included the past participles of non-lexical verbs, you might end up with either of the following queries.
[hw="have"][pos="VVN"]
(22699 matches)
[hw="have"][pos="VVN|VHN|VBN|VDN"]
(33527 matches)
To generate a frequency list of the word forms in the participle slot (ignoring case): count Last by word %c on match[1]
First ten entries of the frequency list of the first query:
4541 got
808 gone
590 seen
477 come
476 made
441 taken
391 said
304 become
229 given
225 told
First ten entries of the frequency list of the second query:
8822 been
4541 got
1155 had
850 done
808 gone
590 seen
477 come
476 made
441 taken
391 said
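The effect of adding VHN, VBN and VDN can be read off by comparing the two top-ten lists. The following Python sketch does this with the counts quoted above; the variable names are made up for illustration, and only the ten entries shown for each query are used, so it says nothing about entries further down the lists.

```python
# Top-ten entries as quoted above for [hw="have"][pos="VVN"].
first_query = {
    "got": 4541, "gone": 808, "seen": 590, "come": 477, "made": 476,
    "taken": 441, "said": 391, "become": 304, "given": 229, "told": 225,
}
# Top-ten entries as quoted above for [hw="have"][pos="VVN|VHN|VBN|VDN"].
second_query = {
    "been": 8822, "got": 4541, "had": 1155, "done": 850, "gone": 808,
    "seen": 590, "come": 477, "made": 476, "taken": 441, "said": 391,
}

# Forms that only show up once the non-lexical participles are included,
# ordered by descending frequency.
only_in_second = sorted(
    set(second_query) - set(first_query), key=second_query.get, reverse=True
)
# Forms present in both lists should keep exactly the same counts.
shared = [w for w in second_query if w in first_query]
same_counts = all(first_query[w] == second_query[w] for w in shared)

print(only_in_second)  # ['been', 'had', 'done']
print(same_counts)     # True
```

The three new entries are the participles of to be, to have and to do, while the lexical participles keep their counts and simply move down the list.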
Exercise 6
This query may have to be adjusted gradually. These are possible ways to start out:
[hw="drive"][pos="PNP|NP0"]
(92 matches)
[hw="drive"][class="SUBST"][class="ADJ"]
(53 matches)
The previous queries match too many sentences that aren’t instantiations of the construction. This is a possible way to narrow it down to people in the noun slot. Note that PNP is the pos-tag for personal pronouns and NP0 is the pos-tag for proper nouns, like Michael or NHS:
[hw="drive"][pos="PNP|NP0"][class="ADJ"]
(13 matches)
Which of the construction’s slots remain the same, which are variable?
The lemma drive stays the same, and the host-class is adjective. Note, however, that this query would not match instantiations like “driving me bananas”.
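Why the last query misses “driving me bananas” can be illustrated with a toy matcher. This is a simplified Python sketch, not how CQP is implemented: each query position is modelled as a dictionary of attribute constraints, and the token annotations are hand-made assumptions about how the two phrases might be tagged, not corpus output.

```python
import re

def matches_at(tokens, start, pattern):
    """Check whether every pattern position matches the tokens from index start."""
    if start + len(pattern) > len(tokens):
        return False
    for token, constraints in zip(tokens[start:], pattern):
        for attribute, regex in constraints.items():
            # As in CQP, the regex must cover the whole attribute value.
            if re.fullmatch(regex, token[attribute]) is None:
                return False
    return True

def find_matches(tokens, pattern):
    """Return the start indices of all matches of pattern in tokens."""
    return [i for i in range(len(tokens)) if matches_at(tokens, i, pattern)]

# Toy version of [hw="drive"][pos="PNP|NP0"][class="ADJ"].
pattern = [{"hw": "drive"}, {"pos": "PNP|NP0"}, {"class": "ADJ"}]

# Hand-annotated tokens (assumed annotation, for illustration only).
drove_me_mad = [
    {"word": "drove", "hw": "drive", "pos": "VVD", "class": "VERB"},
    {"word": "me", "hw": "i", "pos": "PNP", "class": "PRON"},
    {"word": "mad", "hw": "mad", "pos": "AJ0", "class": "ADJ"},
]
driving_me_bananas = [
    {"word": "driving", "hw": "drive", "pos": "VVG", "class": "VERB"},
    {"word": "me", "hw": "i", "pos": "PNP", "class": "PRON"},
    {"word": "bananas", "hw": "banana", "pos": "NN2", "class": "SUBST"},
]

print(find_matches(drove_me_mad, pattern))        # [0]
print(find_matches(driving_me_bananas, pattern))  # []
```

Under this assumed annotation the third slot fails for “bananas” because it is tagged as a noun, so the adjective constraint excludes it.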
What is the meaning of the construction?
To cause someone to “lose their mind”, as it were, i.e. to make them angry or upset.
Exercise 7
Query for venomous: [hw="venomous"][class="SUBST"]
(91 matches)
First ten entries of the frequency list for venomous:
count Last by hw on match[1]
18 snake
7 animal
5 attack
4 bite
3 look
3 shot
3 spine
2 creature
2 dart
2 glance
Query and first ten entries of the frequency list for toxic (956 matches):
[hw="toxic"][class="SUBST"]
286 waste
80 chemical
46 substance
45 shock
43 effect
28 gas
26 fume
16 material
16 metal
15 emission
Query and first twelve entries of the frequency list for poisonous (290 matches):
[hw="poisonous"][class="SUBST"]
31 gas
15 snake
13 plant
13 substance
10 chemical
9 waste
8 animal
7 creature
6 fume
5 atmosphere
4 bite
4 secretion
Compare lists. What differences do you see?
- venomous tends to occur with animals or animal-related words
- toxic tends to occur with substances and inanimate objects
- poisonous tends to occur with both substances and animals
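These observations can be checked roughly by intersecting the three collocate lists. The following Python sketch uses only the lemmas quoted in the frequency lists above; the helper name `shared` is made up for illustration, and frequencies are ignored, so it is a rough check on the top of each list rather than a full collocation analysis.

```python
# Noun lemmas from the frequency lists quoted above.
venomous = ["snake", "animal", "attack", "bite", "look",
            "shot", "spine", "creature", "dart", "glance"]
toxic = ["waste", "chemical", "substance", "shock", "effect",
         "gas", "fume", "material", "metal", "emission"]
poisonous = ["gas", "snake", "plant", "substance", "chemical", "waste",
             "animal", "creature", "fume", "atmosphere", "bite", "secretion"]

def shared(list_a, list_b):
    """Return the lemmas that occur in both lists, sorted alphabetically."""
    return sorted(set(list_a) & set(list_b))

print(shared(venomous, poisonous))  # ['animal', 'bite', 'creature', 'snake']
print(shared(toxic, poisonous))     # ['chemical', 'fume', 'gas', 'substance', 'waste']
print(shared(venomous, toxic))      # []
```

In these lists, poisonous shares nouns with both of the other adjectives, while venomous and toxic have no nouns in common.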
More exercises
Continue here: 6b. Queries with Repetition Operators (RegEx).