public class MorfologikFilter extends TokenFilter
TokenFilter using the Morfologik library to transform input tokens into lemma and morphosyntactic (POS) tokens. Applies to Polish only.
MorfologikFilter contains a MorphosyntacticTagsAttribute, which provides morphosyntactic annotations for produced lemmas. See the Morfologik documentation for details.
Nested classes inherited from class AttributeSource: AttributeSource.State
Fields inherited from class TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY
Constructor and Description |
---|
MorfologikFilter(TokenStream in): Creates a filter with the default (Polish) dictionary. |
MorfologikFilter(TokenStream in, morfologik.stemming.Dictionary dict): Creates a filter with a given dictionary. |
Modifier and Type | Method and Description |
---|---|
boolean | incrementToken(): Retrieves the next token (possibly from the list of lemmas). |
void | reset(): Resets stems accumulator and hands over to superclass. |
Methods inherited from class TokenFilter: close, end
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
public MorfologikFilter(TokenStream in)
public MorfologikFilter(TokenStream in, morfologik.stemming.Dictionary dict)
Parameters:
in - input token stream.
dict - Dictionary to use for stemming.
public final boolean incrementToken() throws IOException
Specified by: incrementToken in class TokenStream
Throws: IOException
public void reset() throws IOException
Overrides: reset in class TokenFilter
Throws: IOException
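The method summaries say that incrementToken() may return the next token "from the list of lemmas" and that reset() clears a stems accumulator before delegating upward. The following is a minimal stand-alone sketch of that pattern, not the Lucene or Morfologik implementation. The class LemmaFilterSketch, its term() and drain() helpers, the map-based dictionary, and the sample entries are all hypothetical. The pass-through of unknown words is an assumption of the sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Toy model of a lemma-emitting filter. Hypothetical; not Lucene's or Morfologik's code. */
final class LemmaFilterSketch {
    private final List<String> input;                    // stands in for the wrapped TokenStream
    private final Map<String, List<String>> dictionary;  // stands in for a stemming dictionary
    private final Deque<String> pendingLemmas = new ArrayDeque<>(); // the "stems accumulator"
    private Iterator<String> source;
    private String term;                                 // stands in for the term attribute

    LemmaFilterSketch(List<String> input, Map<String, List<String>> dictionary) {
        this.input = input;
        this.dictionary = dictionary;
        this.source = input.iterator();
    }

    /** Moves to the next output token; returns false once the input is exhausted. */
    boolean incrementToken() {
        while (pendingLemmas.isEmpty()) {
            if (!source.hasNext()) {
                return false;
            }
            String token = source.next();
            // Assumption of this sketch: words missing from the dictionary pass through unchanged.
            pendingLemmas.addAll(dictionary.getOrDefault(token, Collections.singletonList(token)));
        }
        // One input token may yield several lemmas; emit them one per call.
        term = pendingLemmas.poll();
        return true;
    }

    /** Clears buffered lemmas, then rewinds the input (the "hand over to superclass" step). */
    void reset() {
        pendingLemmas.clear();
        source = input.iterator();
    }

    String term() {
        return term;
    }

    /** Consumer loop: reset, then pull tokens until incrementToken() returns false. */
    static List<String> drain(LemmaFilterSketch filter) {
        filter.reset();
        List<String> out = new ArrayList<>();
        while (filter.incrementToken()) {
            out.add(filter.term());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> dict = new HashMap<>();
        dict.put("ambiguous", Arrays.asList("lemma-a", "lemma-b")); // made-up entry
        LemmaFilterSketch filter = new LemmaFilterSketch(Arrays.asList("ambiguous", "plain"), dict);
        System.out.println(drain(filter)); // prints [lemma-a, lemma-b, plain]
    }
}
```

Clearing the buffer in reset() matters because a consumer may abandon a stream part-way through an ambiguous word; without it, leftover lemmas from the previous pass would leak into the next one.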
Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.