public final class ICUTransformFilter extends TokenFilter
A TokenFilter that transforms text with ICU.
ICU provides text-transformation functionality via its Transliteration API. Although script conversion is its most common use, a Transliterator can actually perform a more general class of tasks. In fact, Transliterator defines a very general API which specifies only that a segment of the input text is replaced by new text. The particulars of this conversion are determined entirely by subclasses of Transliterator.
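Some useful transformations for search are built in, for example the conversion from Traditional to Simplified Chinese characters used in the usage example.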
Example usage:
stream = new ICUTransformFilter(stream, Transliterator.getInstance("Traditional-Simplified"));
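The general contract described above, in which a segment of the input text is replaced by new text, can be sketched with the JDK alone. This is a minimal illustration and not ICU's API: the `SegmentTransform` interface, the `of` helper, and the `NFKC` constant are hypothetical names, and `java.text.Normalizer` with NFKC stands in for a real Transliterator rule. Among other things, NFKC maps fullwidth Latin letters and digits to their ASCII forms.

```java
import java.text.Normalizer;
import java.util.function.UnaryOperator;

final class SegmentReplaceSketch {

    /** Hypothetical stand-in for the "replace a segment with new text" contract. */
    public interface SegmentTransform {
        /** Replaces text[start, limit) in place and returns the segment's new limit. */
        int transliterate(StringBuilder text, int start, int limit);
    }

    /** Lifts a plain String-to-String rule into a segment transform. */
    public static SegmentTransform of(UnaryOperator<String> rule) {
        return (text, start, limit) -> {
            String replacement = rule.apply(text.substring(start, limit));
            text.replace(start, limit, replacement);
            // The replacement may be shorter or longer than the original segment.
            return start + replacement.length();
        };
    }

    /** Assumption: NFKC normalization is used here only as an example rule. */
    public static final SegmentTransform NFKC =
            of(s -> Normalizer.normalize(s, Normalizer.Form.NFKC));

    public static void main(String[] args) {
        // Characters 3..6 are the fullwidth forms of "AB12".
        StringBuilder text = new StringBuilder("id \uFF21\uFF22\uFF11\uFF12 ok");
        int newLimit = NFKC.transliterate(text, 3, 7);
        System.out.println(text + " (segment now ends at " + newLimit + ")");
    }
}
```

Only the requested segment is rewritten; the text before and after it is left untouched, and the returned limit lets the caller keep track of the segment when the replacement changes its length.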
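Nested classes inherited from class AttributeSource: AttributeSource.State
Fields inherited from class TokenFilter: input
Fields inherited from class TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY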
Constructor and Description |
---|
ICUTransformFilter(TokenStream input, com.ibm.icu.text.Transliterator transform) Create a new ICUTransformFilter that transforms text on the given stream. |
Modifier and Type | Method and Description |
---|---|
boolean | incrementToken() |
Methods inherited from class TokenFilter: close, end, reset
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
public ICUTransformFilter(TokenStream input, com.ibm.icu.text.Transliterator transform)
Parameters:
input - TokenStream to filter.
transform - Transliterator to transform the text.

public boolean incrementToken() throws IOException
Specified by: incrementToken in class TokenStream
Throws: IOException
Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.