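public class FuzzyQuery extends MultiTermQuery
Implements the fuzzy search query. The similarity measurement is based on the Damerau-Levenshtein (optimal string alignment) algorithm, though you can explicitly choose classic Levenshtein by passing false to the transpositions parameter.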
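To make the effect of the transpositions parameter concrete, here is a minimal, self-contained sketch of an edit-distance computation with an optional adjacent-transposition step (optimal string alignment). It is not Lucene's implementation, which matches terms through compiled Levenshtein automata. The class name EditDistanceSketch and its distance method are hypothetical and exist only for illustration.

```java
// Illustrative sketch only; EditDistanceSketch is a hypothetical name, not part of Lucene.
final class EditDistanceSketch {

    /**
     * Edit distance between a and b, counted in UTF-16 chars for brevity.
     * With transpositions == true, swapping two adjacent characters costs one
     * edit (optimal string alignment); with false this is classic Levenshtein.
     */
    static int distance(String a, String b, boolean transpositions) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) {
            d[i][0] = i; // delete all of a's first i chars
        }
        for (int j = 0; j <= b.length(); j++) {
            d[0][j] = j; // insert all of b's first j chars
        }
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int best = Math.min(
                        Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1] + cost);
                // Optional step: treat an adjacent swap as a single edit.
                if (transpositions && i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("lucene", "lucnee", true));  // prints 1
        System.out.println(distance("lucene", "lucnee", false)); // prints 2
    }
}
```

Under this sketch, a swapped pair such as "ne" versus "en" counts as one edit when transpositions are enabled and as two substitutions under classic Levenshtein, which is why the flag changes which terms fall within a given maxEdits.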
This query uses MultiTermQuery.TopTermsBlendedFreqScoringRewrite by default, so terms are collected and scored according to their edit distance. Only the top terms are used for building the BooleanQuery.
It is not recommended to change the rewrite mode for fuzzy queries.
This query matches terms that are at most 2 edits away. Higher distances (especially with transpositions enabled) are generally not useful and will match a significant portion of the term dictionary. If you really want this, consider using an n-gram indexing technique (such as the SpellChecker in the suggest module) instead.
NOTE: terms of length 1 or 2 will sometimes not match because of how the scaled distance between two terms is computed. For a term to match, the edit distance between the two terms must be less than the length of the shorter term (either the input term or the candidate term). For example, FuzzyQuery on term "abcd" with maxEdits=2 will not match an indexed term "ab", and FuzzyQuery on term "a" with maxEdits=2 will not match an indexed term "abc".
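The rule in the note can be checked by hand with a small sketch. The code below is a self-contained illustration of the stated condition only (distance within maxEdits and strictly less than the shorter term's length, measured in Unicode code points). It does not reproduce how FuzzyQuery evaluates candidates internally, and ShortTermRuleSketch, levenshtein, and wouldMatch are hypothetical names.

```java
// Illustrative sketch only; ShortTermRuleSketch is a hypothetical name, not part of Lucene.
final class ShortTermRuleSketch {

    /** Classic Levenshtein distance over Unicode code points, using two rows. */
    static int levenshtein(String a, String b) {
        int[] x = a.codePoints().toArray();
        int[] y = b.codePoints().toArray();
        int[] prev = new int[y.length + 1];
        int[] cur = new int[y.length + 1];
        for (int j = 0; j <= y.length; j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= x.length; i++) {
            cur[0] = i;
            for (int j = 1; j <= y.length; j++) {
                int cost = x[i - 1] == y[j - 1] ? 0 : 1;
                cur[j] = Math.min(Math.min(prev[j] + 1, cur[j - 1] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = cur;
            cur = tmp;
        }
        return prev[y.length];
    }

    /**
     * Applies the rule stated in the note: the distance must be within
     * maxEdits and strictly less than the length of the shorter term.
     */
    static boolean wouldMatch(String queryTerm, String indexedTerm, int maxEdits) {
        int dist = levenshtein(queryTerm, indexedTerm);
        int minLen = Math.min(
                queryTerm.codePointCount(0, queryTerm.length()),
                indexedTerm.codePointCount(0, indexedTerm.length()));
        return dist <= maxEdits && dist < minLen;
    }

    public static void main(String[] args) {
        System.out.println(wouldMatch("abcd", "ab", 2));   // prints false: distance 2, shorter length 2
        System.out.println(wouldMatch("a", "abc", 2));     // prints false: distance 2, shorter length 1
        System.out.println(wouldMatch("abcd", "abcf", 2)); // prints true: distance 1, shorter length 4
    }
}
```

Both examples from the note come out as non-matches here because the distance of 2 is not strictly less than the shorter term's length.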
Nested classes/interfaces inherited from class MultiTermQuery: MultiTermQuery.RewriteMethod, MultiTermQuery.TopTermsBlendedFreqScoringRewrite, MultiTermQuery.TopTermsBoostOnlyBooleanQueryRewrite, MultiTermQuery.TopTermsScoringBooleanQueryRewrite
Modifier and Type | Field and Description |
---|---|
static int | defaultMaxEdits |
static int | defaultMaxExpansions |
static float | defaultMinSimilarity Deprecated. Pass integer edit distances instead. |
static int | defaultPrefixLength |
static boolean | defaultTranspositions |
Fields inherited from class MultiTermQuery: CONSTANT_SCORE_BOOLEAN_REWRITE, CONSTANT_SCORE_REWRITE, field, rewriteMethod, SCORING_BOOLEAN_REWRITE
Constructor and Description |
---|
FuzzyQuery(Term term) |
FuzzyQuery(Term term, int maxEdits) |
FuzzyQuery(Term term, int maxEdits, int prefixLength) |
FuzzyQuery(Term term, int maxEdits, int prefixLength, int maxExpansions, boolean transpositions) Create a new FuzzyQuery that will match terms with an edit distance of at most maxEdits to term. |
Modifier and Type | Method and Description |
---|---|
boolean | equals(Object obj) Override and implement query instance equivalence properly in a subclass. |
static int | floatToEdits(float minimumSimilarity, int termLen) Deprecated. Pass integer edit distances instead. |
CompiledAutomaton | getAutomata() Returns the compiled automata used to match terms. |
int | getMaxEdits() |
int | getPrefixLength() Returns the non-fuzzy prefix length. |
Term | getTerm() Returns the pattern term. |
protected TermsEnum | getTermsEnum(Terms terms, AttributeSource atts) Construct the enumeration to be used, expanding the pattern term. |
boolean | getTranspositions() Returns true if transpositions should be treated as a primitive edit operation. |
int | hashCode() Override and implement query hash code properly in a subclass. |
String | toString(String field) Prints a query to a string, with field assumed to be the default field and omitted. |
void | visit(QueryVisitor visitor) Recurse through the query tree, visiting any child queries. |
Methods inherited from class MultiTermQuery: getField, getRewriteMethod, getTermsEnum, rewrite, setRewriteMethod
Methods inherited from class Query: classHash, createWeight, sameClassAs, toString
public static final int defaultMaxEdits
public static final int defaultPrefixLength
public static final int defaultMaxExpansions
public static final boolean defaultTranspositions
@Deprecated public static final float defaultMinSimilarity
public FuzzyQuery(Term term, int maxEdits, int prefixLength, int maxExpansions, boolean transpositions)
Create a new FuzzyQuery that will match terms with an edit distance of at most maxEdits to term. If a prefixLength > 0 is specified, a common prefix of that length is also required.
Parameters:
term - the term to search for
maxEdits - must be >= 0 and <= LevenshteinAutomata.MAXIMUM_SUPPORTED_DISTANCE.
prefixLength - length of common (non-fuzzy) prefix
maxExpansions - the maximum number of terms to match. If this number is greater than BooleanQuery.getMaxClauseCount() when the query is rewritten, then the maxClauseCount will be used instead.
transpositions - true if transpositions should be treated as a primitive edit operation. If this is false, comparisons will implement the classic Levenshtein algorithm.
public FuzzyQuery(Term term, int maxEdits, int prefixLength)
public FuzzyQuery(Term term, int maxEdits)
public FuzzyQuery(Term term)
public int getMaxEdits()
public int getPrefixLength()
public boolean getTranspositions()
public CompiledAutomaton getAutomata()
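public void visit(QueryVisitor visitor)
Description copied from class: Query
Recurse through the query tree, visiting any child queries.
protected TermsEnum getTermsEnum(Terms terms, AttributeSource atts) throws IOException
Description copied from class: MultiTermQuery
Construct the enumeration to be used, expanding the pattern term. This method should not return null (it should instead return TermsEnum.EMPTY if no terms match). The TermsEnum must already be positioned to the first matching term. The given AttributeSource is passed by the MultiTermQuery.RewriteMethod to share information between segments; for example, TopTermsRewrite uses it to share maximum competitive boosts.
Specified by: getTermsEnum in class MultiTermQuery
Throws: IOException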
public Term getTerm()
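public String toString(String field)
Description copied from class: Query
Prints a query to a string, with field assumed to be the default field and omitted.
public int hashCode()
Description copied from class: Query
Override and implement query hash code properly in a subclass. This is required so that QueryCache works properly.
Overrides: hashCode in class MultiTermQuery
See Also: Query.equals(Object)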
public boolean equals(Object obj)
Description copied from class: Query
Override and implement query instance equivalence properly in a subclass. This is required so that QueryCache works properly.
Typically a query will be equal to another only if it's an instance of the same class and its document-filtering properties are identical to those of the other instance. Utility methods are provided for certain repetitive code.
Overrides: equals in class MultiTermQuery
See Also: Query.sameClassAs(Object), Query.classHash()
@Deprecated public static int floatToEdits(float minimumSimilarity, int termLen)
Deprecated. Pass integer edit distances instead.
Parameters:
minimumSimilarity - scaled similarity
termLen - length (in unicode codepoints) of the term.
Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.