public class RandomSamplingFacetsCollector extends FacetsCollector
After collecting hits with this collector, use one of the Facets subclasses to do the facet counting. Note that this collector does not collect the scores of matching docs (i.e. FacetsCollector.MatchingDocs.scores is null).
If you require the original set of hits, you can call getOriginalMatchingDocs(). Also, since the counts of the top facets are based on the sampled set, you can amortize the counts by calling amortizeFacetCounts(org.apache.lucene.facet.FacetResult, org.apache.lucene.facet.FacetsConfig, org.apache.lucene.search.IndexSearcher).
Nested classes/interfaces inherited from class FacetsCollector: FacetsCollector.MatchingDocs
Constructor and Description |
---|
RandomSamplingFacetsCollector(int sampleSize) Constructor with the given sample size and default seed. |
RandomSamplingFacetsCollector(int sampleSize, long seed) Constructor with the given sample size and seed. |
Modifier and Type | Method and Description |
---|---|
FacetResult | amortizeFacetCounts(FacetResult res, FacetsConfig config, IndexSearcher searcher) Note: if you use a counting Facets implementation, you can amortize the sampled counts by calling this method. |
List<FacetsCollector.MatchingDocs> | getMatchingDocs() Returns the sampled list of the matching documents. |
List<FacetsCollector.MatchingDocs> | getOriginalMatchingDocs() Returns the original matching documents. |
double | getSamplingRate() Returns the sampling rate that was used. |
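Methods inherited from class FacetsCollector: collect, doSetNextReader, getKeepScores, scoreMode, search, search, search, searchAfter, searchAfter, searchAfter, setScorer
Methods inherited from class SimpleCollector: getLeafCollector
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Collector: getLeafCollector
Methods inherited from interface LeafCollector: competitiveIterator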
public RandomSamplingFacetsCollector(int sampleSize)
Constructor with the given sample size and default seed.
See also: RandomSamplingFacetsCollector(int, long)
public RandomSamplingFacetsCollector(int sampleSize, long seed)
Constructor with the given sample size and seed.
Parameters:
sampleSize - The preferred sample size. If the number of hits is greater than this size, sampling is done using a sampling ratio of sampleSize / totalN. For example, 1000 hits with a sample size of 10 gives a sampling ratio of 0.01. If the number of hits is lower, no sampling is done at all.
seed - The random seed. If 0, a seed will be chosen for you.
public List<FacetsCollector.MatchingDocs> getMatchingDocs()
Returns the sampled list of the matching documents. A FacetsCollector.MatchingDocs instance is returned per segment, even if no hits from that segment are included in the sampled set.
Note: One or more of the MatchingDocs might be empty (not containing any hits) as a result of sampling.
Note: MatchingDocs.totalHits is copied from the original MatchingDocs; scores is set to null.
Overrides: getMatchingDocs in class FacetsCollector
public List<FacetsCollector.MatchingDocs> getOriginalMatchingDocs()
public FacetResult amortizeFacetCounts(FacetResult res, FacetsConfig config, IndexSearcher searcher) throws IOException
Note: if you use a counting Facets implementation, you can amortize the sampled counts by calling this method. Uses the FacetsConfig and the IndexSearcher to determine the upper bound for each facet value.
Throws: IOException
public double getSamplingRate()
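
The relationship between the sample size and the sampling rate, as described for the sampleSize parameter, can be sketched as follows. The class SamplingRatioSketch and the method samplingRatio are hypothetical names used only for illustration, and returning 1.0 when no sampling takes place is an assumption of this sketch rather than documented behavior.

```java
// Hypothetical illustration only; not part of the Lucene API.
final class SamplingRatioSketch {

    /**
     * Returns the fraction of hits that would be kept.
     * Assumption: 1.0 stands for "no sampling" when the hits fit in the sample.
     */
    static double samplingRatio(int totalHits, int sampleSize) {
        if (totalHits <= sampleSize) {
            return 1.0; // not more hits than the preferred sample size
        }
        return (double) sampleSize / totalHits;
    }

    public static void main(String[] args) {
        System.out.println(samplingRatio(1000, 10)); // prints 0.01
        System.out.println(samplingRatio(5, 10));    // prints 1.0
    }
}
```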
Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.