Word Parsing

16 - Morphological Tagger

Assigning morphological information to a word clarifies its grammatical meaning, in addition to what the syntax conveys.

The task of the Morphological Tagger is to assign additional morphological information to each token, such as gender, case, person, or tense. UniMorph and Universal Dependencies are projects that aim to describe all morphological features in a universal schema.
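To make the input and output of the task concrete, here is a minimal sketch that assigns a lemma and a feature set to each token. It is only an illustration: real taggers are trained models, not lookup tables. The lexicon, its two entries, and the function names `parse_feats` and `tag` are hypothetical. The feature strings follow the `Name=Value` pairs separated by `|` that Universal Dependencies uses in its CoNLL-U files.

```python
# Hypothetical toy lexicon: surface form -> (lemma, UD-style feature string).
# A real tagger would predict these values from context with a trained model.
TOY_LEXICON = {
    "hablaste": ("hablar", "Mood=Ind|Number=Sing|Person=2|Tense=Past|VerbForm=Fin"),
    "casas": ("casa", "Gender=Fem|Number=Plur"),
}


def parse_feats(feats):
    """Split a 'Name=Value|Name=Value' string into a dict ('_' means no features)."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def tag(tokens):
    """Return (token, lemma, features) for each token; unknown tokens get no features."""
    tagged = []
    for token in tokens:
        lemma, feats = TOY_LEXICON.get(token.lower(), (token, "_"))
        tagged.append((token, lemma, parse_feats(feats)))
    return tagged


for token, lemma, feats in tag(["Hablaste", "casas", "xyz"]):
    print(token, lemma, feats)
```

Storing the features as a dictionary makes it easy to query a single category, for example the person or the tense of a token.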

FIN indicates a finite verb. IND indicates the indicative mood. PFV indicates the perfective aspect. PST indicates the past tense. 2 indicates the second person. SG indicates the singular number. INFM indicates the informal register. (source)

The Spanish word ‘hablaste’ can be represented as the lemma ‘hablar’ plus the bundle [FIN;IND;PFV;PST;2;SG;INFM]. This bundle describes all grammatical features of the word. The Part-of-Speech is given separately as [V], which stands for verb.
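The bundle notation is easy to process programmatically. The following sketch splits the semicolon-separated bundle into its tags and looks up each tag's meaning. The legend contains only the seven tags described in this article, and the names `TAG_MEANINGS`, `parse_bundle`, and `describe` are chosen for illustration and do not come from any UniMorph tooling.

```python
# Legend limited to the tags explained in this article.
TAG_MEANINGS = {
    "FIN": "finite verb",
    "IND": "indicative mood",
    "PFV": "perfective aspect",
    "PST": "past tense",
    "2": "second person",
    "SG": "singular number",
    "INFM": "informal register",
}


def parse_bundle(bundle):
    """Turn '[FIN;IND;...]' into a list of tags, ignoring the brackets."""
    return [tag for tag in bundle.strip("[]").split(";") if tag]


def describe(lemma, pos, bundle):
    """Combine lemma, Part-of-Speech, and the decoded feature bundle."""
    features = {tag: TAG_MEANINGS.get(tag, "unknown tag") for tag in parse_bundle(bundle)}
    return {"lemma": lemma, "pos": pos, "features": features}


analysis = describe("hablar", "V", "[FIN;IND;PFV;PST;2;SG;INFM]")
for tag, meaning in analysis["features"].items():
    print(f"{tag}: {meaning}")
```

Tags that are missing from the legend are reported as "unknown tag" rather than raising an error, which keeps the sketch usable on bundles that contain other schema tags.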

Some languages (like German and Arabic) mark a lot of information through morphology; they are said to have a rich morphology. Other languages (like Mandarin) have almost no morphology, so whatever needs to be encoded is encoded through syntax, that is, by adding more words and by changing word order.

This article is part of the Periodic Table of NLP Tasks, a project to systematize NLP tasks.