public abstract class MergePolicy extends Object
Expert: a MergePolicy determines the sequence of primitive merge operations.
Whenever the segments in an index have been altered by IndexWriter, whether by the addition of a newly flushed segment, the addition of many segments from addIndexes* calls, or a previous merge that may now need to cascade, IndexWriter invokes findMerges(MergeTrigger, SegmentInfos, MergePolicy.MergeContext) to give the MergePolicy a chance to pick merges that are now required. This method returns a MergePolicy.MergeSpecification instance describing the set of merges that should be done, or null if no merges are necessary. When IndexWriter.forceMerge is called, it calls findForcedMerges(SegmentInfos, int, Map, MergeContext) and the MergePolicy should then return the necessary merges.
Note that the policy can return more than one merge at a time. In this case, if the writer is using SerialMergeScheduler, the merges will be run sequentially, but if it is using ConcurrentMergeScheduler they will be run concurrently.
The default MergePolicy is TieredMergePolicy.
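The contract described above (return a specification listing one or more merges, or null when nothing needs merging) can be illustrated without any Lucene dependency. The following is a toy model only: `ToyMergePolicy`, `mergeFactor`, and the use of plain segment sizes in place of SegmentInfos are hypothetical stand-ins, and the selection rule is not how TieredMergePolicy actually chooses merges.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy stand-in for a merge policy. NOT a Lucene class; all names are hypothetical. */
final class ToyMergePolicy {
    private final int mergeFactor; // hypothetical: number of segments combined per merge

    ToyMergePolicy(int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be >= 2");
        }
        this.mergeFactor = mergeFactor;
    }

    /**
     * Returns a list of merges (each merge is a list of segment sizes in bytes),
     * or null when no merge is necessary, mirroring the null-or-specification contract.
     */
    List<List<Long>> findMerges(List<Long> segmentSizes) {
        if (segmentSizes.size() < mergeFactor) {
            return null; // not enough segments: no merges necessary
        }
        List<Long> sorted = new ArrayList<>(segmentSizes);
        Collections.sort(sorted); // smallest segments first
        List<List<Long>> spec = new ArrayList<>();
        // Group the smallest segments into merges of exactly mergeFactor segments each.
        for (int i = 0; i + mergeFactor <= sorted.size(); i += mergeFactor) {
            spec.add(new ArrayList<>(sorted.subList(i, i + mergeFactor)));
        }
        return spec;
    }
}
```

Because the returned list may hold several merges, a caller is free to run them one after another or in parallel, which is the distinction the text draws between the serial and concurrent schedulers.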
Modifier and Type | Class and Description |
---|---|
static class | MergePolicy.MergeAbortedException: Thrown when a merge was explicitly aborted because IndexWriter.abortMerges() was called. |
static interface | MergePolicy.MergeContext: This interface represents the current context of the merge selection process. |
static class | MergePolicy.MergeException: Exception thrown if there are any problems while executing a merge. |
static class | MergePolicy.MergeSpecification: A MergeSpecification instance provides the information necessary to perform multiple merges. |
static class | MergePolicy.OneMerge: OneMerge provides the information necessary to perform an individual primitive merge operation, resulting in a single new segment. |
static class | MergePolicy.OneMergeProgress: Progress and state for an executing merge. |
Modifier and Type | Field and Description |
---|---|
protected static long | DEFAULT_MAX_CFS_SEGMENT_SIZE: Default max segment size in order to use compound file system. |
protected static double | DEFAULT_NO_CFS_RATIO: Default ratio for compound file system usage. |
protected long | maxCFSSegmentSize: If the size of the merged segment exceeds this value then it will not use compound file format. |
protected double | noCFSRatio: If the size of the merged segment exceeds this ratio of the total index size then it will remain in non-compound format. |
Modifier | Constructor and Description |
---|---|
public | MergePolicy(): Creates a new merge policy instance. |
protected | MergePolicy(double defaultNoCFSRatio, long defaultMaxCFSSegmentSize): Creates a new merge policy instance with default settings for noCFSRatio and maxCFSSegmentSize. |
Modifier and Type | Method and Description |
---|---|
protected boolean | assertDelCount(int delCount, SegmentCommitInfo info): Asserts that the delCount for this SegmentCommitInfo is valid. |
abstract MergePolicy.MergeSpecification | findForcedDeletesMerges(SegmentInfos segmentInfos, MergePolicy.MergeContext mergeContext): Determine what set of merge operations is necessary in order to expunge all deletes from the index. |
abstract MergePolicy.MergeSpecification | findForcedMerges(SegmentInfos segmentInfos, int maxSegmentCount, Map<SegmentCommitInfo,Boolean> segmentsToMerge, MergePolicy.MergeContext mergeContext): Determine what set of merge operations is necessary in order to merge to <= the specified segment count. |
MergePolicy.MergeSpecification | findFullFlushMerges(MergeTrigger mergeTrigger, SegmentInfos segmentInfos, MergePolicy.MergeContext mergeContext): Identifies merges that we want to execute (synchronously) on commit. |
abstract MergePolicy.MergeSpecification | findMerges(MergeTrigger mergeTrigger, SegmentInfos segmentInfos, MergePolicy.MergeContext mergeContext): Determine what set of merge operations are now necessary on the index. |
double | getMaxCFSSegmentSizeMB(): Returns the largest size allowed for a compound file segment. |
double | getNoCFSRatio(): Returns current noCFSRatio. |
protected boolean | isMerged(SegmentInfos infos, SegmentCommitInfo info, MergePolicy.MergeContext mergeContext): Returns true if this single info is already fully merged (has no pending deletes, is in the same dir as the writer, and matches the current compound file setting). |
boolean | keepFullyDeletedSegment(IOSupplier<CodecReader> readerIOSupplier): Returns true if the segment represented by the given CodecReader should be kept even if it's fully deleted. |
protected void | message(String message, MergePolicy.MergeContext mergeContext): Print a debug message to MergePolicy.MergeContext's infoStream. |
int | numDeletesToMerge(SegmentCommitInfo info, int delCount, IOSupplier<CodecReader> readerSupplier): Returns the number of deletes that a merge would claim on the given segment. |
protected String | segString(MergePolicy.MergeContext mergeContext, Iterable<SegmentCommitInfo> infos): Builds a String representation of the given SegmentCommitInfo instances. |
void | setMaxCFSSegmentSizeMB(double v): If a merged segment will be more than this value, leave the segment as non-compound file even if compound file is enabled. |
void | setNoCFSRatio(double noCFSRatio): If a merged segment will be more than this percentage of the total size of the index, leave the segment as non-compound file even if compound file is enabled. |
protected long | size(SegmentCommitInfo info, MergePolicy.MergeContext mergeContext): Return the byte size of the provided SegmentCommitInfo, pro-rated by the percentage of non-deleted documents. |
boolean | useCompoundFile(SegmentInfos infos, SegmentCommitInfo mergedInfo, MergePolicy.MergeContext mergeContext): Returns true if a new segment (regardless of its origin) should use the compound file format. |
protected boolean | verbose(MergePolicy.MergeContext mergeContext): Returns true if the info-stream is in verbose mode. |
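protected static final double DEFAULT_NO_CFS_RATIO
Default ratio for compound file system usage.
protected static final long DEFAULT_MAX_CFS_SEGMENT_SIZE
Default max segment size in order to use compound file system. Set to Long.MAX_VALUE.
protected double noCFSRatio
If the size of the merged segment exceeds this ratio of the total index size then it will remain in non-compound format.
protected long maxCFSSegmentSize
If the size of the merged segment exceeds this value then it will not use compound file format.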
public MergePolicy()
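protected MergePolicy(double defaultNoCFSRatio, long defaultMaxCFSSegmentSize)
Creates a new merge policy instance with default settings for noCFSRatio and maxCFSSegmentSize. This constructor should be used by subclasses that use different defaults than MergePolicy.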
public abstract MergePolicy.MergeSpecification findMerges(MergeTrigger mergeTrigger, SegmentInfos segmentInfos, MergePolicy.MergeContext mergeContext) throws IOException
Determine what set of merge operations are now necessary on the index. IndexWriter calls this whenever there is a change to the segments. This call is always synchronized on the IndexWriter instance so only one thread at a time will call this method.
mergeTrigger - the event that triggered the merge
segmentInfos - the total set of segments in the index
mergeContext - the MergeContext to find the merges on
Throws: IOException
public abstract MergePolicy.MergeSpecification findForcedMerges(SegmentInfos segmentInfos, int maxSegmentCount, Map<SegmentCommitInfo,Boolean> segmentsToMerge, MergePolicy.MergeContext mergeContext) throws IOException
Determine what set of merge operations is necessary in order to merge to <= the specified segment count. IndexWriter calls this when its IndexWriter.forceMerge(int) method is called. This call is always synchronized on the IndexWriter instance so only one thread at a time will call this method.
segmentInfos - the total set of segments in the index
maxSegmentCount - requested maximum number of segments in the index (currently this is always 1)
segmentsToMerge - contains the specific SegmentCommitInfo instances that must be merged away. This may be a subset of all SegmentInfos. If the value is True for a given SegmentCommitInfo, that means this segment was an original segment present in the to-be-merged index; otherwise, it was a segment produced by a cascaded merge.
mergeContext - the MergeContext to find the merges on
Throws: IOException
public abstract MergePolicy.MergeSpecification findForcedDeletesMerges(SegmentInfos segmentInfos, MergePolicy.MergeContext mergeContext) throws IOException
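Determine what set of merge operations is necessary in order to expunge all deletes from the index.
segmentInfos - the total set of segments in the index
mergeContext - the MergeContext to find the merges on
Throws: IOException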
public MergePolicy.MergeSpecification findFullFlushMerges(MergeTrigger mergeTrigger, SegmentInfos segmentInfos, MergePolicy.MergeContext mergeContext) throws IOException
Identifies merges that we want to execute (synchronously) on commit. If you implement this method in your MergePolicy you must also set a non-zero timeout using IndexWriterConfig.setMaxFullFlushMergeWaitMillis(long).
Any merges returned here will make IndexWriter.commit(), IndexWriter.prepareCommit() or IndexWriter.getReader(boolean, boolean) block until the merges complete or until LiveIndexWriterConfig.getMaxFullFlushMergeWaitMillis() has elapsed. This may be used to merge small segments that have just been flushed, reducing the number of segments in the point-in-time snapshot. If a merge does not complete in the allotted time, it will continue to execute, eventually finish, and apply to a future point-in-time snapshot, but it will not be reflected in the current one.
If a MergePolicy.OneMerge in the returned MergePolicy.MergeSpecification includes a segment already included in a registered merge, then IndexWriter.commit() or IndexWriter.prepareCommit() will throw an IllegalStateException. Use MergePolicy.MergeContext.getMergingSegments() to determine which segments are currently registered to merge.
mergeTrigger - the event that triggered the merge (COMMIT or GET_READER)
segmentInfos - the total set of segments in the index (while preparing the commit)
mergeContext - the MergeContext to find the merges on, which should be used to determine which segments are already in a registered merge (see MergePolicy.MergeContext.getMergingSegments())
Throws: IOException
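The requirement to skip segments that already belong to a registered merge can be sketched as a simple filter. This is a dependency-free illustration under stated assumptions: segment names stand in for SegmentCommitInfo instances, the `mergingSegments` set stands in for the result of getMergingSegments(), and `FullFlushCandidates`, `select`, and the `maxSizeBytes` threshold are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Dependency-free sketch; NOT Lucene code. Segment names stand in for SegmentCommitInfo. */
final class FullFlushCandidates {
    /**
     * Returns the names of segments that are small enough to merge on commit
     * and are not already part of a registered merge.
     *
     * @param sizesInBytes    segment name to size in bytes (iteration order is preserved)
     * @param mergingSegments segments already registered to merge (assumed input)
     * @param maxSizeBytes    hypothetical upper bound for a "small" segment
     */
    static List<String> select(Map<String, Long> sizesInBytes,
                               Set<String> mergingSegments,
                               long maxSizeBytes) {
        List<String> eligible = new ArrayList<>();
        for (Map.Entry<String, Long> e : sizesInBytes.entrySet()) {
            if (mergingSegments.contains(e.getKey())) {
                continue; // including it would lead to an IllegalStateException on commit
            }
            if (e.getValue() <= maxSizeBytes) {
                eligible.add(e.getKey());
            }
        }
        return eligible;
    }
}
```

A real implementation would wrap the selected segments in a OneMerge and return them inside a MergeSpecification; the filtering step is the part the documentation insists on.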
public boolean useCompoundFile(SegmentInfos infos, SegmentCommitInfo mergedInfo, MergePolicy.MergeContext mergeContext) throws IOException
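Returns true if a new segment (regardless of its origin) should use the compound file format. The default implementation returns true iff the size of the given mergedInfo is less than or equal to getMaxCFSSegmentSizeMB() and the size is less than or equal to TotalIndexSize * getNoCFSRatio(); otherwise it returns false.
Throws: IOException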
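The two-part rule for choosing the compound file format can be written as a small predicate. This is a model of the documented rule only, not the Lucene source: `CompoundFileRule` is a hypothetical class, and all sizes are assumed to be plain byte counts supplied by the caller.

```java
/** Dependency-free model of the documented rule; NOT the Lucene implementation. */
final class CompoundFileRule {
    /**
     * @param mergedSizeBytes        size of the new segment in bytes
     * @param totalIndexSizeBytes    combined size of all segments in bytes
     * @param noCFSRatio             maximum fraction of the index a compound segment may occupy
     * @param maxCFSSegmentSizeBytes absolute size limit for a compound segment in bytes
     */
    static boolean useCompoundFile(long mergedSizeBytes, long totalIndexSizeBytes,
                                   double noCFSRatio, long maxCFSSegmentSizeBytes) {
        if (mergedSizeBytes > maxCFSSegmentSizeBytes) {
            return false; // too large in absolute terms
        }
        // Compound format only if the segment is small relative to the whole index.
        return mergedSizeBytes <= noCFSRatio * totalIndexSizeBytes;
    }
}
```

For instance, with a ratio of 0.1 and a 1000-byte index, a 10-byte segment passes the relative check while a 200-byte segment does not.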
protected long size(SegmentCommitInfo info, MergePolicy.MergeContext mergeContext) throws IOException
Return the byte size of the provided SegmentCommitInfo, pro-rated by the percentage of non-deleted documents.
Throws: IOException
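Pro-rating a segment's size by its live documents amounts to scaling the byte size by the fraction of documents that are not deleted. The sketch below assumes the caller already knows the segment's byte size, document count, and delete count; `ProRatedSize` and its parameter names are hypothetical and the class does not touch any Lucene type.

```java
/** Dependency-free sketch of pro-rating a segment size; NOT Lucene code. */
final class ProRatedSize {
    /**
     * @param byteSize on-disk size of the segment in bytes
     * @param maxDoc   total number of documents in the segment, including deleted ones
     * @param delCount number of deleted documents in the segment
     * @return the byte size scaled by the fraction of non-deleted documents
     */
    static long size(long byteSize, int maxDoc, int delCount) {
        if (maxDoc <= 0) {
            return byteSize; // empty segment: nothing to pro-rate
        }
        double delRatio = (double) delCount / (double) maxDoc;
        return (long) (byteSize * (1.0 - delRatio));
    }
}
```

A 1000-byte segment with 25 of its 100 documents deleted is therefore treated as 750 bytes when the policy compares segment sizes.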
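protected final boolean assertDelCount(int delCount, SegmentCommitInfo info)
Asserts that the delCount for this SegmentCommitInfo is valid.
protected final boolean isMerged(SegmentInfos infos, SegmentCommitInfo info, MergePolicy.MergeContext mergeContext) throws IOException
Returns true if this single info is already fully merged (has no pending deletes, is in the same dir as the writer, and matches the current compound file setting).
Throws: IOException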
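public double getNoCFSRatio()
Returns current noCFSRatio.
See also: setNoCFSRatio(double)
public void setNoCFSRatio(double noCFSRatio)
If a merged segment will be more than this percentage of the total size of the index, leave the segment as non-compound file even if compound file is enabled.
public double getMaxCFSSegmentSizeMB()
Returns the largest size allowed for a compound file segment.
public void setMaxCFSSegmentSizeMB(double v)
If a merged segment will be more than this value, leave the segment as non-compound file even if compound file is enabled.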
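The size limit is exposed in megabytes as a double, while the maxCFSSegmentSize field is a long. A conversion along the following lines bridges the two. This is a sketch under assumptions that the text above does not state: a megabyte is taken to be 1024 * 1024 bytes, negative values are rejected, and oversized values saturate at Long.MAX_VALUE. `CfsSizeUnits` and its methods are hypothetical helpers.

```java
/** Dependency-free sketch of MB/byte conversion; the unit and clamping rules are assumptions. */
final class CfsSizeUnits {
    /** Converts megabytes to bytes, saturating at Long.MAX_VALUE. */
    static long megabytesToBytes(double mb) {
        if (mb < 0.0) {
            throw new IllegalArgumentException("size must be >= 0 MB, got " + mb);
        }
        double bytes = mb * 1024 * 1024;
        // Values beyond the long range (including infinity) clamp to the maximum.
        return bytes > Long.MAX_VALUE ? Long.MAX_VALUE : (long) bytes;
    }

    /** Converts bytes back to megabytes. */
    static double bytesToMegabytes(long bytes) {
        return bytes / 1024.0 / 1024.0;
    }
}
```

Under these assumptions, passing Double.POSITIVE_INFINITY effectively disables the absolute limit, which matches a default of Long.MAX_VALUE.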
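public boolean keepFullyDeletedSegment(IOSupplier<CodecReader> readerIOSupplier) throws IOException
Returns true if the segment represented by the given CodecReader should be kept even if it's fully deleted.
Throws: IOException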
public int numDeletesToMerge(SegmentCommitInfo info, int delCount, IOSupplier<CodecReader> readerSupplier) throws IOException
Returns the number of deletes that a merge would claim on the given segment.
info - the segment info that identifies the segment
delCount - the number of deleted documents for this segment
readerSupplier - a supplier that allows obtaining a CodecReader for this segment
Throws: IOException
See also: IndexWriter.softUpdateDocument(Term, Iterable, Field...), IndexWriterConfig.setSoftDeletesField(String)
protected final String segString(MergePolicy.MergeContext mergeContext, Iterable<SegmentCommitInfo> infos)
Builds a String representation of the given SegmentCommitInfo instances.
protected final void message(String message, MergePolicy.MergeContext mergeContext)
Print a debug message to MergePolicy.MergeContext's infoStream.
protected final boolean verbose(MergePolicy.MergeContext mergeContext)
Returns true if the info-stream is in verbose mode.
See also: message(String, MergeContext)
Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.