Italian in a contrastive perspective

Word list options

Search attribute:
Value of n: from to
Filter options:
Filter word list by: Regular expression:
Minimum frequency:
Maximum frequency: (0 = no maximum frequency)
Blacklist: format
Word list whitelists and blacklists must be plain text (.txt) files, encoded in UTF-8, with one item per line. The items must correspond to the selected attribute: for example, if 'lemma' is selected from the attribute menu, the file should contain a list of lemmas. File input is matched exactly, not as regular expressions.
Output options:
Frequency figures:
Output type:
Reference (sub)corpus
Prefer: rare words / common words

You can select one or more output attributes. Please note that this option can be time-consuming.
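
The filter options above (regular expression, minimum and maximum frequency, blacklist file) can be illustrated with a minimal Python sketch. This is not the tool's own implementation: the function names `load_item_list` and `filter_word_list` are hypothetical, the frequency list is assumed to be an in-memory dictionary, and the regular expression is assumed to match the whole item. Only the file format (UTF-8, one item per line), the exact matching of file items, and the "0 = no maximum frequency" convention come from the description above.

```python
import re


def load_item_list(path):
    """Read a whitelist/blacklist file: UTF-8 plain text, one item per line."""
    with open(path, encoding="utf-8") as handle:
        # Blank lines are skipped; surrounding whitespace is dropped.
        return {line.strip() for line in handle if line.strip()}


def filter_word_list(freqs, pattern=None, min_freq=0, max_freq=0,
                     blacklist=frozenset()):
    """Return the items of `freqs` (item -> frequency) that pass all filters.

    max_freq == 0 means "no maximum frequency".
    Blacklist items are compared by exact string equality, not as regexes.
    """
    # Assumption: the regular expression must match the whole item.
    regex = re.compile(pattern) if pattern else None
    kept = {}
    for item, count in freqs.items():
        if regex is not None and not regex.fullmatch(item):
            continue
        if count < min_freq:
            continue
        if max_freq and count > max_freq:
            continue
        if item in blacklist:
            continue
        kept[item] = count
    return kept
```

A blacklist loaded with `load_item_list("stoplist.txt")` (a hypothetical file name) can be passed directly as the `blacklist` argument. A whitelist would work the same way with the membership test inverted.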

Part-of-speech Tagset

ADV             adverb (excluding -mente forms)
ADV:mente       adverb ending in -mente
ARTPRE          preposition + article
AUX:fin         finite form of auxiliary
AUX:fin:cli     finite form of auxiliary with clitic
AUX:geru        gerundive form of auxiliary
AUX:geru:cli    gerundive form of auxiliary with clitic
AUX:infi        infinitival form of auxiliary
AUX:infi:cli    infinitival form of auxiliary with clitic
AUX:ppast       past participle of auxiliary
AUX:ppre        present participle of auxiliary
DET:demo        demonstrative determiner
DET:indef       indefinite determiner
DET:num         numeral determiner
DET:poss        possessive determiner
DET:wh          wh determiner
NOCAT           non-linguistic element
NPR             proper noun
PRO:demo        demonstrative pronoun
PRO:indef       indefinite pronoun
PRO:num         numeral pronoun
PRO:pers        personal pronoun
PRO:poss        possessive pronoun
PUN             non-sentence-final punctuation mark
SENT            sentence-final punctuation mark
VER2:fin        finite form of modal/causal verb
VER2:fin:cli    finite form of modal/causal verb with clitic
VER2:geru       gerundive form of modal/causal verb
VER2:geru:cli   gerundive form of modal/causal verb with clitic
VER2:infi       infinitival form of modal/causal verb
VER2:infi:cli   infinitival form of modal/causal verb with clitic
VER2:ppast      past participle of modal/causal verb
VER2:ppre       present participle of modal/causal verb
VER:fin         finite form of verb
VER:fin:cli     finite form of verb with clitic
VER:geru        gerundive form of verb
VER:geru:cli    gerundive form of verb with clitic
VER:infi        infinitival form of verb
VER:infi:cli    infinitival form of verb with clitic
VER:ppast       past participle of verb
VER:ppast:cli   past participle of verb with clitic
VER:ppre        present participle of verb
WH              wh word
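
The tags listed above follow a visible pattern: a main category, an optional subtype after a colon, and an optional final `cli` marking an attached clitic. As a minimal sketch, assuming that pattern holds for every tag, a tag can be split into its parts as follows. The names `ParsedTag`, `parse_tag` and `is_verbal` are hypothetical helpers for illustration and are not part of the corpus interface.

```python
from typing import NamedTuple, Optional


class ParsedTag(NamedTuple):
    category: str            # e.g. "VER", "VER2", "AUX", "DET"
    subtype: Optional[str]   # e.g. "fin", "infi", "demo"; None if absent
    has_clitic: bool         # True when the tag ends in ":cli"


def parse_tag(tag: str) -> ParsedTag:
    """Split a tag such as 'VER2:infi:cli' into category, subtype and clitic flag."""
    parts = tag.split(":")
    # A trailing "cli" component marks a form with an attached clitic.
    has_clitic = len(parts) > 1 and parts[-1] == "cli"
    if has_clitic:
        parts = parts[:-1]
    subtype = parts[1] if len(parts) > 1 else None
    return ParsedTag(parts[0], subtype, has_clitic)


def is_verbal(tag: str) -> bool:
    """True for main verbs, modal/causal verbs and auxiliaries."""
    return parse_tag(tag).category in {"VER", "VER2", "AUX"}
```

For instance, `parse_tag("VER2:infi:cli")` yields the category `VER2`, the subtype `infi` and a clitic flag set to true, while `parse_tag("NPR")` has no subtype.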

Document name format

Each document in CONTRAST-IT corpora is a newspaper article.
Document names are unique 18-character strings made up of 5 fields separated by underscores, in the following format:
[Collection name]_[Corpus language]_[Newspaper]_[Section]_[ID]

For example, document cnt_it_rep_spo_005 belongs to the CONTRAST-IT Italian corpus (cnt_it), comes from the newspaper La Repubblica (rep), section Sport (spo), and has the ID 005.
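
The naming scheme can be checked and unpacked with a few lines of Python. This is a sketch under the assumptions stated in the description (18 characters, 5 underscore-separated fields). The names `DocumentName` and `parse_document_name` are hypothetical, and the code does not validate the field values themselves, since the full list of newspaper and section codes is not given here.

```python
from typing import NamedTuple


class DocumentName(NamedTuple):
    collection: str   # e.g. "cnt"
    language: str     # e.g. "it"
    newspaper: str    # e.g. "rep" (La Repubblica)
    section: str      # e.g. "spo" (Sport)
    doc_id: str       # e.g. "005", kept as a string to preserve leading zeros


def parse_document_name(name: str) -> DocumentName:
    """Split a document name into its 5 fields, rejecting malformed names."""
    fields = name.split("_")
    if len(name) != 18 or len(fields) != 5:
        raise ValueError(f"not a valid document name: {name!r}")
    return DocumentName(*fields)


doc = parse_document_name("cnt_it_rep_spo_005")
print(doc.newspaper, doc.section, doc.doc_id)
```

Keeping the ID as a string rather than converting it to an integer means the original name can be rebuilt with `"_".join(doc)`.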