The French TreeTagger part-of-speech tagset is available in French corpora annotated by the tool TreeTagger, which was developed by Helmut Schmid within the Textual Corpora project at the Institute for Computational Linguistics of the University of Stuttgart.
A part-of-speech (POS) tagset for the French language defines the different categories or types of words that can be used in sentences and how they should be labelled when processed by linguistic tools such as part-of-speech taggers. A French tagset can be quite detailed, with variations depending on the linguistic framework used (e.g., Universal Dependencies, Treebank POS tags, etc.). Here’s an overview of the main categories found in a typical French POS tagset:
1. Nouns (N)
- NN: Common noun (e.g., chat, maison).
- NNS: Plural noun (e.g., chats, maisons).
- NC: Countable noun.
- NOMPROP: Proper noun (e.g., Paris, Marie).
2. Pronouns (PRON)
- PRP: Personal pronoun (e.g., je, tu, il).
- PRP$: Possessive pronoun (e.g., mon, ton, son).
- REFL: Reflexive pronoun (e.g., se, me).
3. Verbs (V)
- VB: Base form of a verb (infinitive, e.g., manger).
- VBD: Past tense verb (e.g., mangeait).
- VBG: Gerund/participle form (e.g., mangeant).
- VBN: Past participle (e.g., mangé).
- VBP: Present tense verb, non-third person singular (e.g., je mange).
- VBZ: Third person singular present (e.g., il mange).
4. Adjectives (ADJ)
- JJ: Adjective (e.g., grand, beau).
- JJR: Comparative adjective (e.g., plus grand).
- JJS: Superlative adjective (e.g., le plus grand).
5. Adverbs (ADV)
- RB: Adverb (e.g., rapidement, très).
- RBR: Comparative adverb (e.g., plus rapidement).
- RBS: Superlative adverb (e.g., le plus rapidement).
6. Determiners (DET)
- DT: Determiner (e.g., le, une).
- PDT: Predeterminer (e.g., quelques).
- WDT: Wh-determiner (e.g., quel).
7. Prepositions (PREP)
- IN: Preposition (e.g., dans, avec).
8. Conjunctions (CONJ)
- CC: Coordinating conjunction (e.g., et, mais).
- IN: Subordinating conjunction (e.g., que, si).
9. Interjections (INTJ)
- UH: Interjection (e.g., oh, aïe).
10. Auxiliary Verbs (AUX)
- AUX: Auxiliary verb (e.g., être, avoir).
- AUXP: Auxiliary verb in a past form (e.g., était).
11. Numbers (NUM)
- CD: Cardinal number (e.g., un, deux).
- OD: Ordinal number (e.g., premier, deuxième).
12. Symbols and Punctuation (SYM, PUNCT)
- SYM: Symbol (e.g., &, %, $).
- PUNCT: Punctuation (e.g., ., !, ?).
13. Other
- X: Other, not classified (often used for words or expressions that do not fit the standard categories).
- FW: Foreign word (e.g., pizza, memento).
Tagset Variations
Different linguistic resources may use slightly different POS tagsets. For example:
- Universal Dependencies (UD): A standardized tagset for many languages, including French, which simplifies POS labels (e.g., NOUN, VERB, ADJ, ADV).
- French Treebank: A detailed POS tagset that includes specific distinctions between types of nouns, verbs, and modifiers.
- CLAWS (Constituent Likelihood Automatic Word-tagging System): Originally developed for English and used for more nuanced tagging, often in research and language technology development.
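To make the difference between a fine-grained and a coarse tagset concrete, here is a minimal sketch that collapses French TreeTagger tags (listed in the table further down) into UD-style labels. The pairings in `TREETAGGER_TO_UPOS` and the helper `to_upos` are assumptions made for illustration, not an official conversion table.

```python
# Illustrative mapping from French TreeTagger tags to Universal Dependencies
# UPOS labels. The pairings are assumptions for this sketch only.
TREETAGGER_TO_UPOS = {
    "ABR": "X",
    "ADJ": "ADJ",
    "ADV": "ADV",
    "DET:ART": "DET",
    "DET:POS": "DET",
    "INT": "INTJ",
    "KON": "CCONJ",   # lossy: KON does not separate coordinating/subordinating
    "NAM": "PROPN",
    "NOM": "NOUN",
    "NUM": "NUM",
    "PRP": "ADP",
    "PRP:det": "ADP",  # lossy: contracted preposition plus article
    "PUN": "PUNCT",
    "PUN:cit": "PUNCT",
    "SENT": "PUNCT",
    "SYM": "SYM",
}


def to_upos(tag: str) -> str:
    """Return a coarse UD-style label for a TreeTagger tag (hypothetical helper)."""
    if tag in TREETAGGER_TO_UPOS:
        return TREETAGGER_TO_UPOS[tag]
    # Fall back on the part before the colon, e.g. "VER" in "VER:cond".
    prefix = tag.split(":", 1)[0]
    if prefix == "VER":
        return "VERB"  # the tag alone cannot tell auxiliaries from main verbs
    if prefix == "PRO":
        return "PRON"
    return "X"         # unknown tags are left uncategorised
```

The mapping is one-way: moving from the detailed tagset to the coarse one discards information such as verb tense and mood, which is why the choice of tagset depends on the analysis you need.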
When working with French language processing, it’s essential to choose a tagset that matches your linguistic analysis needs or the specific tool you’re using.
An example of a tag in the CQL concordance search box: [tag="VER:cond"] searches all verb conditionals, e.g. serait, pourrait (note: please make sure that you use straight double quotation marks).
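If such queries are generated or pasted programmatically, a small helper can guard against typographic quotes. The following is a minimal sketch; the function names `tag_query` and `straighten_quotes` are hypothetical and not part of any corpus tool.

```python
def tag_query(tag: str) -> str:
    """Build a CQL token query for one tag, e.g. [tag="VER:cond"]."""
    return f'[tag="{tag}"]'


def straighten_quotes(query: str) -> str:
    """Replace curly double quotes with straight ones in a pasted query."""
    return query.replace("\u201c", '"').replace("\u201d", '"')


print(tag_query("VER:cond"))
print(straighten_quotes("[tag=\u201cVER:impf\u201d]"))
```

Word processors and some web pages silently convert straight quotes to curly ones, so running pasted queries through a normalizer like this avoids a common source of failed searches.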
French TreeTagger part-of-speech tagset
French Language Tagset
Tag | Description |
ABR | abbreviation |
ADJ | adjective |
ADV | adverb |
DET:ART | article |
DET:POS | possessive pronoun (ma, ta, …) |
INT | interjection |
KON | conjunction |
NAM | proper name |
NOM | noun |
NUM | numeral |
PRO | pronoun |
PRO:DEM | demonstrative pronoun |
PRO:IND | indefinite pronoun |
PRO:PER | personal pronoun |
PRO:POS | possessive pronoun (mien, tien, …) |
PRO:REL | relative pronoun |
PRP | preposition |
PRP:det | preposition plus article (au,du,aux,des) |
PUN | punctuation |
PUN:cit | punctuation citation |
SENT | sentence tag |
SYM | symbol |
VER:cond | verb conditional |
VER:futu | verb future |
VER:impe | verb imperative |
VER:impf | verb imperfect |
VER:infi | verb infinitive |
VER:pper | verb past participle |
VER:ppre | verb present participle |
VER:pres | verb present |
VER:simp | verb simple past |
VER:subi | verb subjunctive imperfect |
VER:subp | verb subjunctive present |
Source: https://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/french-tagset.html
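Tags in the table above have the shape CATEGORY or CATEGORY:subtype, which makes them easy to split in code. The sketch below assumes tagger output with one token per line and three tab-separated columns (word, tag, lemma); the sample lines are hand-written for illustration and are not real tagger output, and `TaggedToken` and `parse_line` are hypothetical names.

```python
from collections import Counter
from typing import NamedTuple, Optional


class TaggedToken(NamedTuple):
    word: str
    category: str            # part before the colon, e.g. "VER"
    subtype: Optional[str]   # part after the colon, e.g. "cond", or None
    lemma: str


def parse_line(line: str) -> TaggedToken:
    """Parse one 'word<TAB>tag<TAB>lemma' line (assumed column layout)."""
    word, tag, lemma = line.rstrip("\n").split("\t")
    category, _, subtype = tag.partition(":")
    return TaggedToken(word, category, subtype or None, lemma)


# Hand-written sample for illustration only.
SAMPLE_LINES = [
    "Il\tPRO:PER\til",
    "serait\tVER:cond\têtre",
    "content\tADJ\tcontent",
    ".\tSENT\t.",
]

tokens = [parse_line(line) for line in SAMPLE_LINES]

# Count tokens per main category and collect conditional verb forms.
category_counts = Counter(token.category for token in tokens)
conditionals = [t.word for t in tokens if (t.category, t.subtype) == ("VER", "cond")]
```

Splitting on the first colon keeps the main category available for coarse statistics while preserving the subtype for finer queries, such as the conditional forms that [tag="VER:cond"] retrieves in a concordance.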