public final class PorterStemFilter extends TokenFilter
To use this filter with other analyzers, you'll want to write an
Analyzer class that sets up the TokenStream chain as you want it.
To use this with LowerCaseTokenizer, for example, you'd write an
analyzer like this:
    class MyAnalyzer extends Analyzer {
      @Override
      protected TokenStreamComponents createComponents(String fieldName) {
        // Assumes a Lucene release in which LowerCaseTokenizer has a no-argument constructor.
        Tokenizer source = new LowerCaseTokenizer();
        return new TokenStreamComponents(source, new PorterStemFilter(source));
      }
    }
Note: This filter is aware of the KeywordAttribute. To prevent certain terms
from being passed to the stemmer, KeywordAttribute.isKeyword() should be set
to true in a previous TokenStream.
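The keyword behavior described in the note above can be sketched without the library. The following is a simplified, library-free model, not Lucene code: `Token`, `markKeywords`, `toyStem`, and `stemUnlessKeyword` are hypothetical names introduced for illustration, and `toyStem` is a trivial suffix stripper standing in for the Porter algorithm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class KeywordAwareStemSketch {

    /** Hypothetical token: a term plus a keyword flag (stands in for KeywordAttribute). */
    static final class Token {
        final String term;
        final boolean keyword;

        Token(String term, boolean keyword) {
            this.term = term;
            this.keyword = keyword;
        }
    }

    /** Upstream stage: flags every term found in protectedTerms as a keyword. */
    static List<Token> markKeywords(List<String> terms, Set<String> protectedTerms) {
        List<Token> out = new ArrayList<>();
        for (String term : terms) {
            out.add(new Token(term, protectedTerms.contains(term)));
        }
        return out;
    }

    /** Toy suffix stripper used only for illustration; this is NOT the Porter algorithm. */
    static String toyStem(String term) {
        if (term.length() > 5 && term.endsWith("ing")) {
            return term.substring(0, term.length() - 3);
        }
        if (term.length() > 3 && term.endsWith("s")) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    /** Downstream stage: stems a token only when its keyword flag is false. */
    static List<String> stemUnlessKeyword(List<Token> tokens) {
        List<String> out = new ArrayList<>();
        for (Token token : tokens) {
            out.add(token.keyword ? token.term : toyStem(token.term));
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> protectedTerms = new HashSet<>(Arrays.asList("nothing"));
        List<Token> marked = markKeywords(Arrays.asList("walking", "kings", "nothing"), protectedTerms);
        System.out.println(stemUnlessKeyword(marked)); // prints [walk, king, nothing]
    }
}
```

In this model the flag must be set by a stage that runs before the stemming stage, which mirrors the "previous TokenStream" requirement. In Lucene itself, a filter such as SetKeywordMarkerFilter placed ahead of PorterStemFilter typically plays the role of `markKeywords`.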
Note: To include the original term as well as the stemmed version, see
KeywordRepeatFilterFactory.
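The idea of keeping both the original and the stemmed term can be modeled roughly as follows. This is a library-free sketch under stated assumptions, not the Lucene implementation: `originalAndStemmed` and `toyStem` are hypothetical names, and `toyStem` is a trivial suffix stripper rather than the Porter algorithm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class KeywordRepeatSketch {

    /** Toy suffix stripper used only for illustration; this is NOT the Porter algorithm. */
    static String toyStem(String term) {
        if (term.length() > 5 && term.endsWith("ing")) {
            return term.substring(0, term.length() - 3);
        }
        if (term.length() > 3 && term.endsWith("s")) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    /**
     * Emits each term twice in spirit: once as a protected copy that bypasses
     * the stemmer, and once stemmed. The stemmed copy is dropped when it is
     * identical to the original, so no duplicate is emitted.
     */
    static List<String> originalAndStemmed(List<String> terms) {
        List<String> out = new ArrayList<>();
        for (String term : terms) {
            out.add(term);                  // protected copy, passed through unchanged
            String stem = toyStem(term);    // second copy goes through the stemmer
            if (!stem.equals(term)) {
                out.add(stem);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(originalAndStemmed(Arrays.asList("walking", "king")));
        // prints [walking, walk, king]
    }
}
```

In Lucene, the corresponding chain would place a keyword-repeat filter before PorterStemFilter, and RemoveDuplicatesTokenFilter can then drop the second copy when stemming leaves the term unchanged.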
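Nested classes/interfaces inherited from class AttributeSource: AttributeSource.State
Fields inherited from class TokenFilter: input
Fields inherited from class TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY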
Constructor and Description |
---|
PorterStemFilter(TokenStream in) |
Modifier and Type | Method and Description |
---|---|
boolean | incrementToken() |
Methods inherited from class TokenFilter: close, end, reset
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
public PorterStemFilter(TokenStream in)
public final boolean incrementToken() throws IOException
Specified by: incrementToken in class TokenStream
Throws: IOException
Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.