The French TreeTagger part-of-speech tagset is available in French corpora annotated by TreeTagger, a tool developed by Helmut Schmid in the Textual Corpora project at the Institute for Computational Linguistics of the University of Stuttgart.
A part-of-speech (POS) tagset for the French language defines the categories of words that can be used in sentences and how they should be labelled when processed by linguistic tools such as part-of-speech taggers. A French tagset can be quite detailed, and it varies with the linguistic framework used (e.g., Universal Dependencies, Treebank POS tags, etc.). Here’s an overview of the main categories found in a typical French POS tagset:
1. Nouns (N)
- NN: Common noun (e.g., chat, maison).
- NNS: Plural noun (e.g., chats, maisons).
- NC: Countable noun.
- NOMPROP: Proper noun (e.g., Paris, Marie).
2. Pronouns (PRON)
- PRP: Personal pronoun (e.g., je, tu, il).
- PRP$: Possessive pronoun (e.g., mon, ton, son).
- REFL: Reflexive pronoun (e.g., se, me).
3. Verbs (V)
- VB: Base form of a verb (infinitive, e.g., manger).
- VBD: Past tense verb (e.g., mangeait).
- VBG: Gerund/participle form (e.g., mangeant).
- VBN: Past participle (e.g., mangé).
- VBP: Present tense verb, non-third-person singular (e.g., je mange).
- VBZ: Third-person singular present (e.g., il mange).
4. Adjectives (ADJ)
- JJ: Adjective (e.g., grand, beau).
- JJR: Comparative adjective (e.g., plus grand).
- JJS: Superlative adjective (e.g., le plus grand).
5. Adverbs (ADV)
- RB: Adverb (e.g., rapidement, très).
- RBR: Comparative adverb (e.g., plus rapidement).
- RBS: Superlative adverb (e.g., le plus rapidement).
6. Determiners (DET)
- DT: Determiner (e.g., le, une).
- PDT: Predeterminer (e.g., quelques).
- WDT: Wh-determiner (e.g., quel).
7. Prepositions (PREP)
- IN: Preposition (e.g., dans, avec).
8. Conjunctions (CONJ)
- CC: Coordinating conjunction (e.g., et, mais).
- IN: Subordinating conjunction (e.g., que, si).
9. Interjections (INTJ)
- UH: Interjection (e.g., oh, aïe).
10. Auxiliary Verbs (AUX)
- AUX: Auxiliary verb (e.g., être, avoir).
- AUXP: Auxiliary past participle (e.g., été).
11. Numbers (NUM)
- CD: Cardinal number (e.g., un, deux).
- OD: Ordinal number (e.g., premier, deuxième).
12. Symbols and Punctuation (SYM, PUNCT)
- SYM: Symbol (e.g., &, %, $).
- PUNCT: Punctuation (e.g., ., !, ?).
13. Other
- X: Other, not categorized (often used for words or expressions that do not fit the standard categories).
- FW: Foreign word (e.g., pizza, memento).
Tagset Variations
Different linguistic resources may use slightly different POS tagsets. For example:
- Universal Dependencies (UD): A standardized tagset for many languages, including French, which simplifies POS labels (e.g., NOUN, VERB, ADJ, ADV).
- French Treebank: A detailed POS tagset that includes specific distinctions between types of nouns, verbs, and modifiers.
- CLAWS (Constituent Likelihood Automatic Word-tagging System): Used for more nuanced tagging, often in research and language technology development.
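To make the difference between a fine-grained and a coarse tagset concrete, here is a minimal Python sketch that collapses French TreeTagger tags (listed in the table further down this page) into UD-style coarse labels. The mapping table and the function name `to_ud` are illustrative assumptions, not an official conversion; in particular, KON and PRP:det have no exact one-to-one UD equivalent.

```python
# Hypothetical, approximate mapping from the main part of a French TreeTagger
# tag to a UD-style coarse label. This is an assumption for illustration only.
TREETAGGER_TO_UD = {
    "ABR": "X",
    "ADJ": "ADJ",
    "ADV": "ADV",
    "DET": "DET",
    "INT": "INTJ",
    "KON": "CCONJ",   # KON does not separate coordinating from subordinating
    "NAM": "PROPN",
    "NOM": "NOUN",
    "NUM": "NUM",
    "PRO": "PRON",
    "PRP": "ADP",     # PRP:det (au, du) is a preposition fused with an article
    "PUN": "PUNCT",
    "SENT": "PUNCT",
    "SYM": "SYM",
    "VER": "VERB",    # the TreeTagger tagset has no separate auxiliary tag
}


def to_ud(tag: str) -> str:
    """Map a TreeTagger tag such as 'VER:cond' to a coarse UD-style label."""
    main = tag.split(":", 1)[0]          # drop the subtype after the colon
    return TREETAGGER_TO_UD.get(main, "X")  # unknown tags fall back to X


for tag in ("VER:cond", "DET:ART", "PRP:det", "NAM"):
    print(tag, "->", to_ud(tag))
```

The sketch keeps only the part of the tag before the colon, which shows how much information (tense, mood, pronoun type) a coarse tagset discards.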
When working with French language processing, it’s essential to choose a tagset that matches your linguistic analysis needs or the specific tool you’re using. An example of a tag in the CQL concordance search box: [tag="VER:cond"]
finds all verbs in the conditional, e.g. serait, pourrait (note: make sure that you use straight double quotation marks).
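Queries like the one above can also be assembled programmatically. The following Python sketch builds CQL token expressions on the tag attribute and strips typographic quotation marks, which are a common cause of failed queries. The helper names `tag_query` and `sequence` are hypothetical and not part of any Sketch Engine API; the sketch only produces query strings to paste into the search box.

```python
def tag_query(tag_pattern: str) -> str:
    """Build a CQL token query on the tag attribute with straight double quotes."""
    cleaned = tag_pattern
    # Remove typographic quotes that word processors may have introduced.
    for curly in ("\u201c", "\u201d", "\u201e", "\u00ab", "\u00bb"):
        cleaned = cleaned.replace(curly, "")
    # Remove straight quotes too, so the value is quoted exactly once below.
    cleaned = cleaned.replace('"', "").strip()
    return f'[tag="{cleaned}"]'


def sequence(*tag_patterns: str) -> str:
    """Join several token queries into a query for consecutive tokens."""
    return " ".join(tag_query(p) for p in tag_patterns)


print(tag_query("VER:cond"))        # verbs in the conditional
print(tag_query("VER.*"))           # tag values are regular expressions
print(sequence("DET:ART", "NOM"))   # an article followed by a noun
```

Because tag values are matched as regular expressions, a pattern such as VER.* covers every verb tag in the table below.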
French TreeTagger part-of-speech tagset
French Language Tagset
Tag | Description |
ABR | abbreviation |
ADJ | adjective |
ADV | adverb |
DET:ART | article |
DET:POS | possessive pronoun (ma, ta, …) |
INT | interjection |
KON | conjunction |
NAM | proper name |
NOM | noun |
NUM | numeral |
PRO | pronoun |
PRO:DEM | demonstrative pronoun |
PRO:IND | indefinite pronoun |
PRO:PER | personal pronoun |
PRO:POS | possessive pronoun (mien, tien, …) |
PRO:REL | relative pronoun |
PRP | preposition |
PRP:det | preposition plus article (au,du,aux,des) |
PUN | punctuation |
PUN:cit | punctuation quotation |
SENT | sentence tag |
SYM | symbol |
VER:cond | verb conditional |
VER:futu | verb future |
VER:impe | verb imperative |
VER:impf | verb imperfect |
VER:infi | verb infinitive |
VER:pper | verb past participle |
VER:ppre | verb present participle |
VER:pres | verb present |
VER:simp | verb simple past |
VER:subi | verb subjunctive imperfect |
VER:subp | verb subjunctive present |
Source: https://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/french-tagset.html
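The tags in the table follow a simple pattern: a main category, optionally followed by a colon and a subtype (VER:cond, PRO:PER). The Python sketch below reads TreeTagger-style output, which has one token per line with word, tag, and lemma separated by tabs, and uses that pattern to filter and count tokens. The sample text is hand-written for illustration and is not real tool output, and the names `Token` and `parse_tagged` are assumptions of this sketch.

```python
from collections import Counter
from typing import NamedTuple


class Token(NamedTuple):
    word: str
    tag: str
    lemma: str


def parse_tagged(text: str) -> list[Token]:
    """Parse tab-separated word/tag/lemma lines into Token records."""
    tokens = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        tokens.append(Token(*parts))
    return tokens


# Hand-written sample in the three-column format (not real tagger output).
SAMPLE = (
    "Il\tPRO:PER\til\n"
    "serait\tVER:cond\t\u00eatre\n"
    "au\tPRP:det\tau\n"
    "jardin\tNOM\tjardin\n"
    ".\tSENT\t.\n"
)

tokens = parse_tagged(SAMPLE)

# All verb forms, regardless of tense or mood.
verb_forms = [t.word for t in tokens if t.tag.startswith("VER:")]

# Frequency of main categories (the part of the tag before the colon).
counts = Counter(t.tag.split(":", 1)[0] for t in tokens)

print(verb_forms)
print(counts)
```

Splitting on the first colon is enough for this tagset because no tag in the table contains more than one colon.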