A Part-of-Speech Tagset for the French Language

The French TreeTagger part-of-speech tagset is available in French corpora annotated with TreeTagger, a tool developed by Helmut Schmid within the Textual Corpora project at the Institute for Computational Linguistics of the University of Stuttgart.

A part-of-speech (POS) tagset for the French language defines the different categories or types of words that can be used in sentences and how they should be labelled when processed by linguistic tools such as part-of-speech taggers. A French tagset can be quite detailed, with variations depending on the linguistic framework used (e.g., Universal Dependencies, Treebank POS tags, etc.). Here’s an overview of the main categories found in a typical French POS tagset:

1. Nouns (N)

  • NN: Common noun (e.g., chat, maison).
  • NNS: Plural noun (e.g., chats, maisons).
  • NC: Countable noun.
  • NOMPROP: Proper noun (e.g., Paris, Marie).

2. Pronouns (PRON)

  • PRP: Personal pronoun (e.g., je, tu, il).
  • PRP$: Possessive pronoun (e.g., mon, ton, son).
  • REFL: Reflexive pronoun (e.g., se, me).

3. Verbs (V)

  • VB: Base form of a verb (infinitive, e.g., manger).
  • VBD: Past tense verb (e.g., mangeait).
  • VBG: Gerund/participle form (e.g., mangeant).
  • VBN: Past participle (e.g., mangé).
  • VBP: Present tense verb, non-third person singular (e.g., je mange).
  • VBZ: Third person singular present (e.g., il mange).

4. Adjectives (ADJ)

  • JJ: Adjective (e.g., grand, beau).
  • JJR: Comparative adjective (e.g., plus grand).
  • JJS: Superlative adjective (e.g., le plus grand).

5. Adverbs (ADV)

  • RB: Adverb (e.g., rapidement, très).
  • RBR: Comparative adverb (e.g., plus rapidement).
  • RBS: Superlative adverb (e.g., le plus rapidement).

6. Determiners (DET)

  • DT: Determiner (e.g., le, une).
  • PDT: Predeterminer (e.g., tout, tous).
  • WDT: Wh-determiner (e.g., quel).

7. Prepositions (PREP)

  • IN: Preposition (e.g., dans, avec).

8. Conjunctions (CONJ)

  • CC: Coordinating conjunction (e.g., et, mais).
  • IN: Subordinating conjunction (e.g., que, si).

9. Interjections (INTJ)

  • UH: Interjection (e.g., oh, aïe).

10. Auxiliary Verbs (AUX)

  • AUX: Auxiliary verb (e.g., être, avoir).
  • AUXP: Auxiliary in the past participle (e.g., été).

11. Numbers (NUM)

  • CD: Cardinal number (e.g., un, deux).
  • OD: Ordinal number (e.g., premier, deuxième).

12. Symbols and Punctuation (SYM, PUNCT)

  • SYM: Symbol (e.g., &, %, $).
  • PUNCT: Punctuation (e.g., ., !, ?).

13. Other

  • X: Other, uncategorized (often used for words or expressions that do not fit the standard categories).
  • FW: Foreign word (e.g., pizza, memento).

Tagset Variations

Different linguistic resources may use slightly different POS tagsets. For example:

  • Universal Dependencies (UD): A standardized tagset for many languages, including French, which simplifies POS labels (e.g., NOUN, VERB, ADJ, ADV).
  • French Treebank: A detailed POS tagset that includes specific distinctions between types of nouns, verbs, and modifiers.
  • CLAWS (Constituent Likelihood Automatic Word-tagging System): A fine-grained tagging system, originally developed for English, used for more nuanced tagging, often in research and language technology development.

When working with French language processing, it’s essential to choose a tagset that matches your linguistic analysis needs or the specific tool you’re using.

An example of a tag in the CQL concordance search box: [tag="VER:cond"] finds all verbs in the conditional, e.g. serait, pourrait (note: make sure that you use straight double quotation marks).
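To make the effect of a tag query more concrete, here is a minimal Python sketch that imitates it on a small list of (word, tag) pairs. It is not how a corpus tool implements CQL; it only assumes that the tag value is treated as a regular expression that must match the whole tag. The sample sentence and its tags are written by hand for illustration, and the helper name match_tag is hypothetical.

```python
import re

# Hand-tagged sample sentence (an assumption for illustration,
# not real TreeTagger output).
TOKENS = [
    ("Il", "PRO:PER"),
    ("serait", "VER:cond"),
    ("content", "ADJ"),
    ("si", "KON"),
    ("elle", "PRO:PER"),
    ("pouvait", "VER:impf"),
    ("venir", "VER:infi"),
    (".", "SENT"),
]


def match_tag(tokens, pattern):
    """Return the words whose tag matches the regular expression in full."""
    compiled = re.compile(pattern)
    return [word for word, tag in tokens if compiled.fullmatch(tag)]


# Rough equivalent of [tag="VER:cond"]: conditional verbs only.
conditionals = match_tag(TOKENS, r"VER:cond")

# Rough equivalent of [tag="VER:.*"]: any verb form.
all_verbs = match_tag(TOKENS, r"VER:.*")

print(conditionals)
print(all_verbs)
```

Because the pattern has to match the entire tag, "VER:cond" selects only conditional forms, while "VER:.*" selects every verb subtype in the sample.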

French TreeTagger part-of-speech tagset

French Language Tagset

Tag        Description
ABR        abbreviation
ADJ        adjective
ADV        adverb
DET:ART    article
DET:POS    possessive pronoun (ma, ta, …)
INT        interjection
KON        conjunction
NAM        proper name
NOM        noun
NUM        numeral
PRO        pronoun
PRO:DEM    demonstrative pronoun
PRO:IND    indefinite pronoun
PRO:PER    personal pronoun
PRO:POS    possessive pronoun (mien, tien, …)
PRO:REL    relative pronoun
PRP        preposition
PRP:det    preposition plus article (au, du, aux, des)
PUN        punctuation
PUN:cit    punctuation citation
SENT       sentence tag
SYM        symbol
VER:cond   verb conditional
VER:futu   verb future
VER:impe   verb imperative
VER:impf   verb imperfect
VER:infi   verb infinitive
VER:pper   verb past participle
VER:ppre   verb present participle
VER:pres   verb present
VER:simp   verb simple past
VER:subi   verb subjunctive imperfect
VER:subp   verb subjunctive present
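Many of the tags in this table have two parts separated by a colon: a main category (e.g. VER) and a subtype (e.g. cond). The following Python sketch shows one way to split tags along that boundary and count main categories. It assumes tagger output in a three-column, tab-separated word/tag/lemma layout; the sample text, including its lemmas, is written by hand, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical sample in a tab-separated word/tag/lemma layout.
SAMPLE = (
    "Il\tPRO:PER\til\n"
    "serait\tVER:cond\têtre\n"
    "au\tPRP:det\tau\n"
    "cinéma\tNOM\tcinéma\n"
    ".\tSENT\t.\n"
)


def parse_line(line):
    """Split one output line and break the tag into main category and subtype."""
    word, tag, lemma = line.split("\t")
    main, _, sub = tag.partition(":")
    return {
        "word": word,
        "tag": tag,
        "main": main,
        "sub": sub or None,  # None when the tag has no subtype, e.g. NOM
        "lemma": lemma,
    }


def parse_output(text):
    """Parse every non-empty line of the tagged text."""
    return [parse_line(line) for line in text.splitlines() if line.strip()]


rows = parse_output(SAMPLE)
counts = Counter(row["main"] for row in rows)

for row in rows:
    print(row["word"], row["main"], row["sub"], row["lemma"])
print(counts)
```

Grouping on the main category is handy for coarse statistics (all verbs, all pronouns), while the subtype stays available when the finer distinction matters.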

Source: https://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/french-tagset.html