Design of a Rule Based Hindi Lemmatizer

Authors

Snigdha Paul, Mini Tandon, Nisheeth Joshi and Iti Mathur, Banasthali University, India

Abstract

Stemming is the process of clipping the affixes off an input word to obtain its root word, but stemming does not always yield a genuine, meaningful root. To overcome this problem we propose a lemmatizer. Lemmatization is the process by which we carve out the lemma from a given word; additional rules can be added to turn the clipped word into a proper stem. In this paper we present an inflectional lemmatizer that generates rules for extracting suffixes, together with added rules for producing a proper, meaningful root word.
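
The two-step idea described above (strip a suffix, then apply an extra rule so the result is a real word) can be illustrated with a minimal sketch. The rule table and the function name `lemmatize` are hypothetical and chosen only for illustration; they are not the rule set or implementation from the paper.

```python
# Hypothetical rule table: (suffix to strip, replacement to append).
# These few rules are illustrative assumptions, not the paper's rules.
RULES = [
    ("ियाँ", "ी"),  # e.g. लड़कियाँ -> लड़की (plain stripping would leave लड़क)
    ("ियों", "ी"),  # e.g. लड़कियों -> लड़की
    ("ें", ""),     # e.g. किताबें -> किताब (stripping alone is enough)
]


def lemmatize(word, rules=RULES):
    """Strip the longest matching suffix, then append its replacement."""
    # Try longer suffixes first so a short suffix cannot pre-empt a long one.
    for suffix, replacement in sorted(rules, key=lambda r: len(r[0]), reverse=True):
        # Require a non-empty stem so very short words are left alone.
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)] + replacement
    # No rule applies: return the word unchanged.
    return word
```

In this sketch the replacement string plays the role of the added rule: a stemmer would stop after removing the suffix, whereas the lemmatizer appends the ending needed to form a dictionary word. A word that matches no rule, such as घर, is returned as is.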

Keywords

Stemming, Lemmatization, Lemma, Hindi, Over-stemming, Under-stemming

Full Text  Volume 3, Number 4