Stemming
Also known as: Word Stemming, Suffix Stripping
Stemming is a natural-language-processing technique that reduces inflected or derived words to a common stem, typically by stripping suffixes with heuristic rules: 'running' and 'runs' both map to the stem 'run'. Because the rules operate on surface spelling, a stem need not be a dictionary word, and irregular forms such as 'ran' are generally left unchanged. The Porter stemmer (1980) is the canonical example for English. Stemming helps information-retrieval and text-analysis systems treat related word forms as equivalent, which is useful in accessibility applications such as search-within-page features, semantic text-entry prediction, and simplification pipelines for cognitive accessibility. A more linguistically informed alternative is lemmatization, which uses a vocabulary and morphological analysis to return a dictionary form (the lemma) rather than a truncated stem, so it can map 'ran' to 'run'.
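To make the suffix-stripping idea concrete, here is a minimal sketch in Python. It is a toy and not the Porter algorithm: the rule list `SUFFIXES`, the function names `toy_stem` and `matches`, and the three-character minimum are all assumptions chosen for illustration. Real stemmers apply many more rules and conditions.

```python
# Toy suffix stripper: illustrative only, NOT the Porter algorithm.
# SUFFIXES is a hypothetical rule list; rules are tried in order.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # Undo consonant doubling, e.g. "runn" -> "run".
            # Doubled l and s are kept (e.g. "pass"), as are doubled vowels.
            if len(stem) >= 2 and stem[-1] == stem[-2] and stem[-1] not in "aeiouls":
                stem = stem[:-1]
            return stem
    return word  # no rule applied


def matches(query: str, text: str) -> list[str]:
    """Return the words in `text` whose stem equals the stem of `query`."""
    target = toy_stem(query)
    return [w for w in text.split() if toy_stem(w) == target]


print(toy_stem("running"), toy_stem("runs"), toy_stem("ran"))
print(matches("run", "She runs daily and ran a race after running laps"))
```

Under these rules, 'running' and 'runs' both reduce to 'run', so a search for 'run' finds both, while an irregular form such as 'ran' passes through unchanged because no suffix rule applies. The toy also yields non-word stems (for example, 'studies' becomes 'studi'), which illustrates why a stem is not necessarily a dictionary form.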
Category: Natural Language Processing · Research Methods
Related: Natural Language Processing · Part-of-Speech Tagging