Class IndexShard
- All Implemented Interfaces:
IndexShardComponent
,IndicesClusterStateService.Shard
-
Nested Class Summary
Modifier and TypeClassDescriptionstatic class
Simple struct encapsulating a shard failure -
Field Summary
Fields inherited from class org.elasticsearch.index.shard.AbstractIndexShardComponent
indexSettings, logger, shardId
-
Constructor Summary
ConstructorDescriptionIndexShard(ShardRouting shardRouting, IndexSettings indexSettings, ShardPath path, Store store, Supplier<org.apache.lucene.search.Sort> indexSortSupplier, IndexCache indexCache, MapperService mapperService, SimilarityService similarityService, EngineFactory engineFactory, IndexEventListener indexEventListener, org.elasticsearch.core.CheckedFunction<org.apache.lucene.index.DirectoryReader,org.apache.lucene.index.DirectoryReader,IOException> indexReaderWrapper, ThreadPool threadPool, BigArrays bigArrays, Engine.Warmer warmer, List<SearchOperationListener> searchOperationListener, List<IndexingOperationListener> listeners, Runnable globalCheckpointSyncer, RetentionLeaseSyncer retentionLeaseSyncer, CircuitBreakerService circuitBreakerService, IndexStorePlugin.SnapshotCommitSupplier snapshotCommitSupplier)
-
Method Summary
Modifier and TypeMethodDescriptionvoid
acquireAllPrimaryOperationsPermits(ActionListener<org.elasticsearch.core.Releasable> onPermitAcquired, org.elasticsearch.core.TimeValue timeout)
Acquire all primary operation permits.void
acquireAllReplicaOperationsPermits(long opPrimaryTerm, long globalCheckpoint, long maxSeqNoOfUpdatesOrDeletes, ActionListener<org.elasticsearch.core.Releasable> onPermitAcquired, org.elasticsearch.core.TimeValue timeout)
Acquire all replica operation permits whenever the shard is ready for indexing (seeacquireAllPrimaryOperationsPermits(ActionListener, TimeValue)
.Acquires a lock on the translog files and Lucene soft-deleted documents to prevent them from being trimmedAcquires theIndexCommit
which should be included in a snapshot.acquireLastIndexCommit(boolean flushFirst)
Creates a newIndexCommit
snapshot from the currently running engine.void
acquirePrimaryOperationPermit(ActionListener<org.elasticsearch.core.Releasable> onPermitAcquired, String executorOnDelay, Object debugInfo)
Acquire a primary operation permit whenever the shard is ready for indexing.void
acquirePrimaryOperationPermit(ActionListener<org.elasticsearch.core.Releasable> onPermitAcquired, String executorOnDelay, Object debugInfo, boolean forceExecution)
void
acquireReplicaOperationPermit(long opPrimaryTerm, long globalCheckpoint, long maxSeqNoOfUpdatesOrDeletes, ActionListener<org.elasticsearch.core.Releasable> onPermitAcquired, String executorOnDelay, Object debugInfo)
Acquire a replica operation permit whenever the shard is ready for indexing (seeacquirePrimaryOperationPermit(ActionListener, String, Object)
).Snapshots the most recent safe index commit from the currently running engine.acquireSearcher(String source)
Acquires a point-in-time reader that can be used to createEngine.Searcher
s on demand.Acquires a point-in-time reader that can be used to createEngine.Searcher
s on demand.void
void
activateWithPrimaryContext(ReplicationTracker.PrimaryContext primaryContext)
Updates the known allocation IDs and the local checkpoints for the corresponding allocations from a primary relocation source.Creates aRunnable
that must be executed before the clean files step in peer recovery can complete.void
addGlobalCheckpointListener(long waitingForGlobalCheckpoint, GlobalCheckpointListeners.GlobalCheckpointListener listener, org.elasticsearch.core.TimeValue timeout)
Add a global checkpoint listener.addPeerRecoveryRetentionLease(String nodeId, long globalCheckpoint, ActionListener<ReplicationResponse> listener)
void
addRefreshListener(Translog.Location location, Consumer<Boolean> listener)
Add a listener for refreshes.addRetentionLease(String id, long retainingSequenceNumber, String source, ActionListener<ReplicationResponse> listener)
Adds a new retention lease.void
addShardFailureCallback(Consumer<IndexShard.ShardFailure> onShardFailure)
void
advanceMaxSeqNoOfUpdatesOrDeletes(long seqNo)
A replica calls this method to advance the max_seq_no_of_updates marker of its engine to at least the max_seq_no_of_updates value (piggybacked in a replication request) that it receives from its primary before executing that replication request.void
afterCleanFiles(Runnable runnable)
Execute aRunnable
on the generic pool once all dependencies added viaaddCleanFilesDependency()
have finished.void
Schedules a flush or translog generation roll if needed but will not schedule more than one concurrently.applyDeleteOperationOnPrimary(long version, String type, String id, VersionType versionType, long ifSeqNo, long ifPrimaryTerm)
applyDeleteOperationOnReplica(long seqNo, long opPrimaryTerm, long version, String type, String id)
applyIndexOperationOnPrimary(long version, VersionType versionType, SourceToParse sourceToParse, long ifSeqNo, long ifPrimaryTerm, long autoGeneratedTimestamp, boolean isRetry)
applyIndexOperationOnReplica(long seqNo, long opPrimaryTerm, long version, long autoGeneratedTimeStamp, boolean isRetry, SourceToParse sourceToParse)
applyTranslogOperation(Translog.Operation operation, Engine.Operation.Origin origin)
boolean
void
awaitShardSearchActive(Consumer<Boolean> listener)
Registers the given listener and invokes it once the shard is active again and all pending refresh translog location has been refreshed.static org.apache.lucene.analysis.Analyzer
buildIndexAnalyzer(MapperService mapperService)
void
checkIdle(long inactiveTimeNS)
Called byIndexingMemoryController
to check whether more thaninactiveTimeNS
has passed since the last indexing operation, and notify listeners that we are now inactive so e.g.cloneLocalPeerRecoveryRetentionLease(String nodeId, ActionListener<ReplicationResponse> listener)
void
completionStats(String... fields)
void
docStats()
int
estimateNumberOfHistoryOperations(String reason, Engine.HistorySource source, long startingSeqNo)
Returns the estimated number of history operations whose seq# at least the provided seq# in this shard.void
Fails the shard and marks the shard store as corrupted ife
is caused by index corruptionfieldDataStats(String... fields)
void
perform the last stages of recovery once all translog operations are done.flush(FlushRequest request)
Executes the given flush request against the engine.void
forceMerge(ForceMergeRequest forceMerge)
get(Engine.Get get)
int
Obtain the active operation count, orOPERATIONS_BLOCKED
if all permits are held (even if there are outstanding operations in flight).protected Engine
NOTE: returns null if engine is not yet started (e.g.getFailedDeleteResult(Exception e, long version)
getFailedIndexResult(Exception e, long version)
getHistoryOperations(String reason, Engine.HistorySource source, long startingSeqNo)
Creates a new history snapshot for reading operations since the provided starting seqno (inclusive).long
Returns number of heap bytes used by the indexing buffer for this shard, or 0 if the shard is closedorg.apache.lucene.search.Sort
Return the sort order of this index, or null if the index has no sort.com.carrotsearch.hppc.ObjectLongMap<String>
Get the local knowledge of the global checkpoints for all in-sync allocation IDs.long
Returns the global checkpoint for the shard.long
Returns the latest global checkpoint value that has been persisted in the underlying storage (i.e.long
Returns the persisted local checkpoint for the shard.long
Returns the maximum auto_id_timestamp of all append-only requests have been processed by this shard or the auto_id_timestamp received from the primary viaupdateMaxUnsafeAutoIdTimestamp(long)
at the beginning of a peer-recovery or a primary-replica resync.long
Returns the maximum sequence number of either update or delete operations have been processed in this shard or the sequence number fromadvanceMaxSeqNoOfUpdatesOrDeletes(long)
.long
Gets the minimum retained sequence number for this engine.long
Returns the primary term that is currently being used to assign to operationsReturns a list of retention leases for peer recovery installed in this shard copy.long
USE THIS METHOD WITH CARE! Returns the primary term the index shard is supposed to be on.Returns the pending replication actions for the shard.org.apache.lucene.search.QueryCachingPolicy
Returns the current replication group for the shard.Get all retention leases tracked on this shard.getRetentionLeases(boolean expireLeases)
If the expire leases parameter is false, gets all retention leases tracked on this shard and otherwise first calculates expiration of existing retention leases, and then gets all non-expired retention leases tracked on this shard.getStats()
Returns the current translog durability modelong
Returns how many bytes we are currently moving from heap to diskboolean
hasCompleteHistoryOperations(String reason, Engine.HistorySource source, long startingSeqNo)
Checks if we have a completed history of operations since the given starting seqno (inclusive).boolean
Returns true if this shard has some scheduled refresh that is pending because of search-idle.boolean
Returnstrue
if this shard can ignore a recovery attempt made to it (since the already doing/done it)indexingStats(String... types)
void
initiateTracking(String allocationId)
Called when the recovery process for a shard has opened the engine on the target shard.boolean
isActive()
boolean
returns true if theIndexShardState
allows readingboolean
Returns whether the shard is a relocated primary, i.e.boolean
Returns true if this shards is search idleboolean
Checks if the underlying storage sync is required.boolean
isSystem()
Loads the latest retention leases from their dedicated state file.void
markAllocationIdAsInSync(String allocationId, long localCheckpoint)
Marks the shard with the provided allocation ID as in-sync with the primary shard.markAsRecovering(String reason, RecoveryState recoveryState)
Marks the shard as recovering based on a recovery state, fails with exception is recovering is not allowed to be set.markSeqNoAsNoop(long seqNo, long opPrimaryTerm, String reason)
void
void
maybeSyncGlobalCheckpoint(String reason)
Syncs the global checkpoint to the replicas if the global checkpoint on at least one replica is behind the global checkpoint on the primary.org.apache.lucene.util.Version
newChangesSnapshot(String source, long fromSeqNo, long toSeqNo, boolean requiredFullRange)
Creates a new changes snapshot for reading operations whose seq_no are betweenfromSeqNo
(inclusive) andtoSeqNo
(inclusive).void
noopUpdate(String type)
Should be called for each no-op update operation to increment relevant statistics.void
void
opens the engine on top of the existing lucene engine and translog.void
Opens the engine on top of the existing lucene engine and translog.int
boolean
Check if there are any recoveries pending in-sync.void
called if recovery has to be restarted after network error / delay **void
Persists the current retention leases to their dedicated state file.void
postRecovery(String reason)
void
called before starting to copy index files overstatic Engine.Index
prepareIndex(MapperService mapperService, String type, SourceToParse source, long seqNo, long primaryTerm, long version, VersionType versionType, Engine.Operation.Origin origin, long autoGeneratedIdTimestamp, boolean isRetry, long ifSeqNo, long ifPrimaryTerm)
void
void
void
recoverFromLocalShards(BiConsumer<String,MappingMetadata> mappingUpdateConsumer, List<IndexShard> localShards, ActionListener<Boolean> listener)
void
recoverFromStore(ActionListener<Boolean> listener)
long
A best effort to bring up this shard to the global checkpoint using the local translog before performing a peer recovery.Returns the currentRecoveryState
if this shard is recovering or has been recovering.returns stats about ongoing recoveries, both source and targetvoid
Writes all indexing changes to disk and opens a new searcher reflecting all changes.void
relocated(String targetAllocationId, BiConsumer<ReplicationTracker.PrimaryContext,ActionListener<Void>> consumer, ActionListener<Void> listener)
Completes the relocation.void
removePeerRecoveryRetentionLease(String nodeId, ActionListener<ReplicationResponse> listener)
void
removeRetentionLease(String id, ActionListener<ReplicationResponse> listener)
Removes an existing retention lease.renewRetentionLease(String id, long retainingSequenceNumber, String source)
Renews an existing retention lease.void
If a file-based recovery occurs, a recovery target calls this method to reset the recovery stage.void
restoreFromRepository(Repository repository, ActionListener<Boolean> listener)
void
Rolls the tranlog generation and cleans unneeded.Returns the latest cluster routing entry received with this shard.void
runUnderPrimaryPermit(Runnable runnable, Consumer<Exception> onFailure, String executorOnDelay, Object debugInfo)
Runs the specified runnable under a permit and otherwise calling back the specified failure callback.boolean
Executes a scheduled refresh if necessary.searchStats(String... groups)
segments(boolean verbose)
segmentStats(boolean includeSegmentFileSizes, boolean includeUnloadedSegments)
gets aStore.MetadataSnapshot
for the current directory.void
startRecovery(RecoveryState recoveryState, PeerRecoveryTargetService recoveryTargetService, PeerRecoveryTargetService.RecoveryListener recoveryListener, RepositoriesService repositoriesService, BiConsumer<String,MappingMetadata> mappingUpdateConsumer, IndicesService indicesService)
state()
Returns the latest internal shard state.store()
void
sync()
void
sync(Translog.Location location, Consumer<Exception> syncListener)
Syncs the given location with the underlying storage unless already synced.syncFlush(String syncId, Engine.CommitId expectedCommitId)
void
Syncs the current retention leases to all replicas.void
trimOperationOfPreviousPrimaryTerms(long aboveSeqNo)
void
checks and removes translog files that no longer need to be retained.void
updateGlobalCheckpointForShard(String allocationId, long globalCheckpoint)
Update the local knowledge of the persisted global checkpoint for the specified allocation ID.void
updateGlobalCheckpointOnReplica(long globalCheckpoint, String reason)
Updates the global checkpoint on a replica shard after it has been updated by the primary.void
updateLocalCheckpointForShard(String allocationId, long checkpoint)
Notifies the service to update the local checkpoint for the shard with the provided allocation ID.void
updateMaxUnsafeAutoIdTimestamp(long maxSeenAutoIdTimestampFromPrimary)
Since operations stored in soft-deletes do not have max_auto_id_timestamp, the primary has to propagate its max_auto_id_timestamp (viagetMaxSeenAutoIdTimestamp()
of all processed append-only requests to replicas at the beginning of a peer-recovery or a primary-replica resync to force a replica to disable optimization for all append-only requests which are replicated via replication while its retry variants are replicated via recovery without auto_id_timestamp.void
updateRetentionLeasesOnReplica(RetentionLeases retentionLeases)
Updates retention leases on a replica.void
updateShardState(ShardRouting newRouting, long newPrimaryTerm, BiConsumer<IndexShard,ActionListener<PrimaryReplicaSyncer.ResyncTask>> primaryReplicaSyncer, long applyingClusterStateVersion, Set<String> inSyncAllocationIds, IndexShardRoutingTable routingTable)
Updates the shard state based on an incoming cluster state: - Updates and persists the new routing value.org.apache.lucene.util.Version
upgrade(UpgradeRequest upgrade)
Upgrades the shard to the current version of Lucene and returns the minimum segment versionboolean
protected void
void
Performs the pre-closing checks on theIndexShard
.void
Called when our shard is using too much heap and should move buffered indexed/deleted documents to disk.Methods inherited from class org.elasticsearch.index.shard.AbstractIndexShardComponent
indexSettings, shardId
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.elasticsearch.indices.cluster.IndicesClusterStateService.Shard
shardId
-
Field Details
-
shardRouting
-
state
-
OPERATIONS_BLOCKED
public static final int OPERATIONS_BLOCKED- See Also:
- Constant Field Values
-
-
Constructor Details
-
IndexShard
public IndexShard(ShardRouting shardRouting, IndexSettings indexSettings, ShardPath path, Store store, Supplier<org.apache.lucene.search.Sort> indexSortSupplier, IndexCache indexCache, MapperService mapperService, SimilarityService similarityService, @Nullable EngineFactory engineFactory, IndexEventListener indexEventListener, org.elasticsearch.core.CheckedFunction<org.apache.lucene.index.DirectoryReader,org.apache.lucene.index.DirectoryReader,IOException> indexReaderWrapper, ThreadPool threadPool, BigArrays bigArrays, Engine.Warmer warmer, List<SearchOperationListener> searchOperationListener, List<IndexingOperationListener> listeners, Runnable globalCheckpointSyncer, RetentionLeaseSyncer retentionLeaseSyncer, CircuitBreakerService circuitBreakerService, IndexStorePlugin.SnapshotCommitSupplier snapshotCommitSupplier) throws IOException- Throws:
IOException
-
-
Method Details
-
getThreadPool
-
store
-
getIndexSort
public org.apache.lucene.search.Sort getIndexSort()Return the sort order of this index, or null if the index has no sort. -
getService
-
shardBitsetFilterCache
-
mapperService
-
getSearchOperationListener
-
warmerService
-
requestCache
-
fieldData
-
isSystem
public boolean isSystem() -
getPendingPrimaryTerm
public long getPendingPrimaryTerm()USE THIS METHOD WITH CARE! Returns the primary term the index shard is supposed to be on. In case of primary promotion or when a replica learns about a new term due to a new primary, the term that's exposed here will not be the term that the shard internally uses to assign to operations. The shard will auto-correct its internal operation term, but this might take time. SeeIndexMetadata.primaryTerm(int)
-
getOperationPrimaryTerm
public long getOperationPrimaryTerm()Returns the primary term that is currently being used to assign to operations -
routingEntry
Returns the latest cluster routing entry received with this shard.- Specified by:
routingEntry
in interfaceIndicesClusterStateService.Shard
-
getQueryCachingPolicy
public org.apache.lucene.search.QueryCachingPolicy getQueryCachingPolicy() -
updateShardState
public void updateShardState(ShardRouting newRouting, long newPrimaryTerm, BiConsumer<IndexShard,ActionListener<PrimaryReplicaSyncer.ResyncTask>> primaryReplicaSyncer, long applyingClusterStateVersion, Set<String> inSyncAllocationIds, IndexShardRoutingTable routingTable) throws IOExceptionDescription copied from interface:IndicesClusterStateService.Shard
Updates the shard state based on an incoming cluster state: - Updates and persists the new routing value. - Updates the primary term if this shard is a primary. - Updates the allocation ids that are tracked by the shard if it is a primary. SeeReplicationTracker.updateFromMaster(long, Set, IndexShardRoutingTable)
for details.- Specified by:
updateShardState
in interfaceIndicesClusterStateService.Shard
- Parameters:
newRouting
- the new routing entrynewPrimaryTerm
- the new primary termprimaryReplicaSyncer
- the primary-replica resync action to trigger when a term is increased on a primaryapplyingClusterStateVersion
- the cluster state version being applied when updating the allocation IDs from the masterinSyncAllocationIds
- the allocation ids of the currently in-sync shard copiesroutingTable
- the shard routing table- Throws:
IOException
- if shard state could not be persisted
-
markAsRecovering
public IndexShardState markAsRecovering(String reason, RecoveryState recoveryState) throws IndexShardStartedException, IndexShardRelocatedException, IndexShardRecoveringException, IndexShardClosedExceptionMarks the shard as recovering based on a recovery state, fails with exception is recovering is not allowed to be set. -
relocated
public void relocated(String targetAllocationId, BiConsumer<ReplicationTracker.PrimaryContext,ActionListener<Void>> consumer, ActionListener<Void> listener) throws IllegalIndexShardStateException, IllegalStateExceptionCompletes the relocation. Operations are blocked and current operations are drained before changing state to relocated. The providedBiConsumer
is executed after all operations are successfully blocked.- Parameters:
consumer
- aBiConsumer
that is executed after operations are blocked and that consumes the primary context as well as a listener to resolve once it finishedlistener
- listener to resolve once this method actions including executingconsumer
in the non-failure case complete- Throws:
IllegalIndexShardStateException
- if the shard is not relocating due to concurrent cancellationIllegalStateException
- if the relocation target is no longer part of the replication group
-
state
Description copied from interface:IndicesClusterStateService.Shard
Returns the latest internal shard state.- Specified by:
state
in interfaceIndicesClusterStateService.Shard
-
applyIndexOperationOnPrimary
public Engine.IndexResult applyIndexOperationOnPrimary(long version, VersionType versionType, SourceToParse sourceToParse, long ifSeqNo, long ifPrimaryTerm, long autoGeneratedTimestamp, boolean isRetry) throws IOException- Throws:
IOException
-
applyIndexOperationOnReplica
public Engine.IndexResult applyIndexOperationOnReplica(long seqNo, long opPrimaryTerm, long version, long autoGeneratedTimeStamp, boolean isRetry, SourceToParse sourceToParse) throws IOException- Throws:
IOException
-
prepareIndex
public static Engine.Index prepareIndex(MapperService mapperService, String type, SourceToParse source, long seqNo, long primaryTerm, long version, VersionType versionType, Engine.Operation.Origin origin, long autoGeneratedIdTimestamp, boolean isRetry, long ifSeqNo, long ifPrimaryTerm) -
markSeqNoAsNoop
public Engine.NoOpResult markSeqNoAsNoop(long seqNo, long opPrimaryTerm, String reason) throws IOException- Throws:
IOException
-
getFailedIndexResult
-
getFailedDeleteResult
-
applyDeleteOperationOnPrimary
public Engine.DeleteResult applyDeleteOperationOnPrimary(long version, String type, String id, VersionType versionType, long ifSeqNo, long ifPrimaryTerm) throws IOException- Throws:
IOException
-
applyDeleteOperationOnReplica
public Engine.DeleteResult applyDeleteOperationOnReplica(long seqNo, long opPrimaryTerm, long version, String type, String id) throws IOException- Throws:
IOException
-
get
-
refresh
Writes all indexing changes to disk and opens a new searcher reflecting all changes. This can throwAlreadyClosedException
. -
getWritingBytes
public long getWritingBytes()Returns how many bytes we are currently moving from heap to disk -
refreshStats
-
flushStats
-
docStats
-
commitStats
- Returns:
CommitStats
- Throws:
org.apache.lucene.store.AlreadyClosedException
- if shard is closed
-
seqNoStats
- Returns:
SeqNoStats
- Throws:
org.apache.lucene.store.AlreadyClosedException
- if shard is closed
-
indexingStats
-
searchStats
-
getStats
-
storeStats
-
mergeStats
-
segmentStats
-
warmerStats
-
fieldDataStats
-
translogStats
-
completionStats
-
syncFlush
-
flush
Executes the given flush request against the engine.- Parameters:
request
- the flush request- Returns:
- the commit ID
-
trimTranslog
public void trimTranslog()checks and removes translog files that no longer need to be retained. SeeTranslogDeletionPolicy
for details -
rollTranslogGeneration
public void rollTranslogGeneration()Rolls the tranlog generation and cleans unneeded. -
forceMerge
- Throws:
IOException
-
upgrade
Upgrades the shard to the current version of Lucene and returns the minimum segment version- Throws:
IOException
-
minimumCompatibleVersion
public org.apache.lucene.util.Version minimumCompatibleVersion() -
acquireLastIndexCommit
Creates a newIndexCommit
snapshot from the currently running engine. All resources referenced by this commit won't be freed until the commit / snapshot is closed.- Parameters:
flushFirst
-true
if the index should first be flushed to disk / a low level lucene commit should be executed- Throws:
EngineException
-
acquireIndexCommitForSnapshot
Acquires theIndexCommit
which should be included in a snapshot.- Throws:
EngineException
-
acquireSafeIndexCommit
Snapshots the most recent safe index commit from the currently running engine. All index files referenced by this index commit won't be freed until the commit/snapshot is closed.- Throws:
EngineException
-
snapshotStoreMetadata
gets aStore.MetadataSnapshot
for the current directory. This method is safe to call in all lifecycle of the index shard, without having to worry about the current state of the engine and concurrent flushes.- Throws:
org.apache.lucene.index.IndexNotFoundException
- if no index is found in the current directoryorg.apache.lucene.index.CorruptIndexException
- if the lucene index is corrupted. This can be caused by a checksum mismatch or an unexpected exception when opening the index reading the segments file.org.apache.lucene.index.IndexFormatTooOldException
- if the lucene index is too old to be opened.org.apache.lucene.index.IndexFormatTooNewException
- if the lucene index is too new to be opened.FileNotFoundException
- if one or more files referenced by a commit are not present.NoSuchFileException
- if one or more files referenced by a commit are not present.IOException
-
failShard
Fails the shard and marks the shard store as corrupted ife
is caused by index corruption -
acquireSearcherSupplier
Acquires a point-in-time reader that can be used to createEngine.Searcher
s on demand. -
acquireSearcherSupplier
Acquires a point-in-time reader that can be used to createEngine.Searcher
s on demand. -
acquireSearcher
-
close
- Throws:
IOException
-
preRecovery
public void preRecovery() -
postRecovery
public void postRecovery(String reason) throws IndexShardStartedException, IndexShardRelocatedException, IndexShardClosedException -
prepareForIndexRecovery
public void prepareForIndexRecovery()called before starting to copy index files over -
recoverLocallyUpToGlobalCheckpoint
public long recoverLocallyUpToGlobalCheckpoint()A best effort to bring up this shard to the global checkpoint using the local translog before performing a peer recovery.- Returns:
- a sequence number that an operation-based peer recovery can start with. This is the first operation after the local checkpoint of the safe commit if exists.
-
trimOperationOfPreviousPrimaryTerms
public void trimOperationOfPreviousPrimaryTerms(long aboveSeqNo) -
getMaxSeenAutoIdTimestamp
public long getMaxSeenAutoIdTimestamp()Returns the maximum auto_id_timestamp of all append-only requests have been processed by this shard or the auto_id_timestamp received from the primary viaupdateMaxUnsafeAutoIdTimestamp(long)
at the beginning of a peer-recovery or a primary-replica resync.- See Also:
updateMaxUnsafeAutoIdTimestamp(long)
-
updateMaxUnsafeAutoIdTimestamp
public void updateMaxUnsafeAutoIdTimestamp(long maxSeenAutoIdTimestampFromPrimary)Since operations stored in soft-deletes do not have max_auto_id_timestamp, the primary has to propagate its max_auto_id_timestamp (viagetMaxSeenAutoIdTimestamp()
of all processed append-only requests to replicas at the beginning of a peer-recovery or a primary-replica resync to force a replica to disable optimization for all append-only requests which are replicated via replication while its retry variants are replicated via recovery without auto_id_timestamp.Without this force-update, a replica can generate duplicate documents (for the same id) if it first receives a retry append-only (without timestamp) via recovery, then an original append-only (with timestamp) via replication.
-
applyTranslogOperation
public Engine.Result applyTranslogOperation(Translog.Operation operation, Engine.Operation.Origin origin) throws IOException- Throws:
IOException
-
openEngineAndRecoverFromTranslog
opens the engine on top of the existing lucene engine and translog. Operations from the translog will be replayed to bring lucene up to date.- Throws:
IOException
-
openEngineAndSkipTranslogRecovery
Opens the engine on top of the existing lucene engine and translog. The translog is kept but its operations won't be replayed.- Throws:
IOException
-
performRecoveryRestart
called if recovery has to be restarted after network error / delay **- Throws:
IOException
-
resetRecoveryStage
public void resetRecoveryStage()If a file-based recovery occurs, a recovery target calls this method to reset the recovery stage. -
recoveryStats
returns stats about ongoing recoveries, both source and target -
recoveryState
Returns the currentRecoveryState
if this shard is recovering or has been recovering. Returns null if the recovery has not yet started or shard was not recovered (created via an API).- Specified by:
recoveryState
in interfaceIndicesClusterStateService.Shard
-
getTimestampRange
- Specified by:
getTimestampRange
in interfaceIndicesClusterStateService.Shard
- Returns:
- the range of the
@timestamp
field for this shard, orShardLongFieldRange.EMPTY
if this field is not found, orShardLongFieldRange.UNKNOWN
if its range is not fixed.
-
finalizeRecovery
public void finalizeRecovery()perform the last stages of recovery once all translog operations are done. note that you should still callpostRecovery(String)
. -
ignoreRecoveryAttempt
public boolean ignoreRecoveryAttempt()Returnstrue
if this shard can ignore a recovery attempt made to it (since the already doing/done it) -
readAllowed
- Throws:
IllegalIndexShardStateException
-
isReadAllowed
public boolean isReadAllowed()returns true if theIndexShardState
allows reading -
verifyActive
- Throws:
IllegalIndexShardStateException
-
getIndexBufferRAMBytesUsed
public long getIndexBufferRAMBytesUsed()Returns number of heap bytes used by the indexing buffer for this shard, or 0 if the shard is closed -
addShardFailureCallback
-
checkIdle
public void checkIdle(long inactiveTimeNS)Called byIndexingMemoryController
to check whether more thaninactiveTimeNS
has passed since the last indexing operation, and notify listeners that we are now inactive so e.g. sync'd flush can happen. -
isActive
public boolean isActive() -
shardPath
-
recoverFromLocalShards
public void recoverFromLocalShards(BiConsumer<String,MappingMetadata> mappingUpdateConsumer, List<IndexShard> localShards, ActionListener<Boolean> listener) throws IOException- Throws:
IOException
-
recoverFromStore
-
restoreFromRepository
-
onSettingsChanged
public void onSettingsChanged() -
acquireHistoryRetentionLock
Acquires a lock on the translog files and Lucene soft-deleted documents to prevent them from being trimmed -
estimateNumberOfHistoryOperations
public int estimateNumberOfHistoryOperations(String reason, Engine.HistorySource source, long startingSeqNo) throws IOExceptionReturns the estimated number of history operations whose seq# at least the provided seq# in this shard.- Throws:
IOException
-
getHistoryOperations
public Translog.Snapshot getHistoryOperations(String reason, Engine.HistorySource source, long startingSeqNo) throws IOExceptionCreates a new history snapshot for reading operations since the provided starting seqno (inclusive). The returned snapshot can be retrieved from either Lucene index or translog files.- Throws:
IOException
-
hasCompleteHistoryOperations
public boolean hasCompleteHistoryOperations(String reason, Engine.HistorySource source, long startingSeqNo) throws IOExceptionChecks if we have a completed history of operations since the given starting seqno (inclusive). This method should be called after acquiring the retention lock; SeeacquireHistoryRetentionLock(Engine.HistorySource)
- Throws:
IOException
-
getMinRetainedSeqNo
public long getMinRetainedSeqNo()Gets the minimum retained sequence number for this engine.- Returns:
- the minimum retained sequence number
-
newChangesSnapshot
public Translog.Snapshot newChangesSnapshot(String source, long fromSeqNo, long toSeqNo, boolean requiredFullRange) throws IOExceptionCreates a new changes snapshot for reading operations whose seq_no are betweenfromSeqNo
(inclusive) andtoSeqNo
(inclusive). The caller has to close the returned snapshot after finishing the reading.- Parameters:
source
- the source of the requestfromSeqNo
- the from seq_no (inclusive) to readtoSeqNo
- the to seq_no (inclusive) to readrequiredFullRange
- iftrue
thenTranslog.Snapshot.next()
will throwIllegalStateException
if any operation betweenfromSeqNo
andtoSeqNo
is missing. This parameter should be only enabled when the entire requesting range is below the global checkpoint.- Throws:
IOException
-
segments
-
getHistoryUUID
-
getIndexEventListener
-
activateThrottling
public void activateThrottling() -
deactivateThrottling
public void deactivateThrottling() -
writeIndexingBuffer
public void writeIndexingBuffer()Called when our shard is using too much heap and should move buffered indexed/deleted documents to disk. -
updateLocalCheckpointForShard
Notifies the service to update the local checkpoint for the shard with the provided allocation ID. SeeReplicationTracker.updateLocalCheckpoint(String, long)
for details.- Parameters:
allocationId
- the allocation ID of the shard to update the local checkpoint forcheckpoint
- the local checkpoint for the shard
-
updateGlobalCheckpointForShard
Update the local knowledge of the persisted global checkpoint for the specified allocation ID.- Parameters:
allocationId
- the allocation ID to update the global checkpoint forglobalCheckpoint
- the global checkpoint
-
addGlobalCheckpointListener
public void addGlobalCheckpointListener(long waitingForGlobalCheckpoint, GlobalCheckpointListeners.GlobalCheckpointListener listener, org.elasticsearch.core.TimeValue timeout)Add a global checkpoint listener. If the global checkpoint is equal to or above the global checkpoint the listener is waiting for, then the listener will be notified immediately via an executor (so possibly not on the current thread). If the specified timeout elapses before the listener is notified, the listener will be notified with anTimeoutException
. A caller may pass null to specify no timeout.- Parameters:
waitingForGlobalCheckpoint
- the global checkpoint the listener is waiting forlistener
- the listenertimeout
- the timeout
-
getRetentionLeases
Get all retention leases tracked on this shard.- Returns:
- the retention leases
-
getRetentionLeases
If the expire leases parameter is false, gets all retention leases tracked on this shard and otherwise first calculates expiration of existing retention leases, and then gets all non-expired retention leases tracked on this shard. Note that only the primary shard calculates which leases are expired, and if any have expired, syncs the retention leases to any replicas. If the expire leases parameter is true, this replication tracker must be in primary mode.- Returns:
- the non-expired retention leases
-
getRetentionLeaseStats
-
addRetentionLease
public RetentionLease addRetentionLease(String id, long retainingSequenceNumber, String source, ActionListener<ReplicationResponse> listener)Adds a new retention lease.- Parameters:
id
- the identifier of the retention leaseretainingSequenceNumber
- the retaining sequence numbersource
- the source of the retention leaselistener
- the callback when the retention lease is successfully added and synced to replicas- Returns:
- the new retention lease
- Throws:
IllegalArgumentException
- if the specified retention lease already exists
-
renewRetentionLease
Renews an existing retention lease.- Parameters:
id
- the identifier of the retention leaseretainingSequenceNumber
- the retaining sequence numbersource
- the source of the retention lease- Returns:
- the renewed retention lease
- Throws:
IllegalArgumentException
- if the specified retention lease does not exist
-
removeRetentionLease
Removes an existing retention lease.- Parameters:
id
- the identifier of the retention leaselistener
- the callback when the retention lease is successfully removed and synced to replicas
-
updateRetentionLeasesOnReplica
Updates retention leases on a replica.- Parameters:
retentionLeases
- the retention leases
-
loadRetentionLeases
Loads the latest retention leases from their dedicated state file.- Returns:
- the retention leases
- Throws:
IOException
- if an I/O exception occurs reading the retention leases
-
persistRetentionLeases
Persists the current retention leases to their dedicated state file.- Throws:
WriteStateException
- if an exception occurs writing the state file
-
assertRetentionLeasesPersisted
- Throws:
IOException
-
syncRetentionLeases
public void syncRetentionLeases()Syncs the current retention leases to all replicas. -
initiateTracking
Called when the recovery process for a shard has opened the engine on the target shard. Ensures that the right data structures have been set up locally to track local checkpoint information for the shard and that the shard is added to the replication group.- Parameters:
allocationId
- the allocation ID of the shard for which recovery was initiated
-
markAllocationIdAsInSync
public void markAllocationIdAsInSync(String allocationId, long localCheckpoint) throws InterruptedExceptionMarks the shard with the provided allocation ID as in-sync with the primary shard. SeeReplicationTracker.markAllocationIdAsInSync(String, long)
for additional details.- Parameters:
allocationId
- the allocation ID of the shard to mark as in-synclocalCheckpoint
- the current local checkpoint on the shard- Throws:
InterruptedException
-
getLocalCheckpoint
public long getLocalCheckpoint()Returns the persisted local checkpoint for the shard.- Returns:
- the local checkpoint
-
getLastKnownGlobalCheckpoint
public long getLastKnownGlobalCheckpoint()Returns the global checkpoint for the shard.- Returns:
- the global checkpoint
-
getLastSyncedGlobalCheckpoint
public long getLastSyncedGlobalCheckpoint()Returns the latest global checkpoint value that has been persisted in the underlying storage (i.e. translog's checkpoint) -
getInSyncGlobalCheckpoints
Get the local knowledge of the global checkpoints for all in-sync allocation IDs.- Returns:
- a map from allocation ID to the local knowledge of the global checkpoint for that allocation ID
-
maybeSyncGlobalCheckpoint
Syncs the global checkpoint to the replicas if the global checkpoint on at least one replica is behind the global checkpoint on the primary. -
getReplicationGroup
Returns the current replication group for the shard.- Returns:
- the replication group
-
getPendingReplicationActions
Returns the pending replication actions for the shard.- Returns:
- the pending replication actions
-
updateGlobalCheckpointOnReplica
Updates the global checkpoint on a replica shard after it has been updated by the primary.- Parameters:
globalCheckpoint
- the global checkpointreason
- the reason the global checkpoint was updated
-
addCleanFilesDependency
Creates aRunnable
that must be executed before the clean files step in peer recovery can complete.- Returns:
- runnable that must be executed during the clean files step in peer recovery
-
afterCleanFiles
Execute aRunnable
on the generic pool once all dependencies added viaaddCleanFilesDependency()
have finished. If there are no dependencies to wait for then theRunnable
will be executed on the calling thread. -
outstandingCleanFilesConditions
public int outstandingCleanFilesConditions() -
activateWithPrimaryContext
Updates the known allocation IDs and the local checkpoints for the corresponding allocations from a primary relocation source.- Parameters:
primaryContext
- the sequence number context
-
pendingInSync
public boolean pendingInSync()Check if there are any recoveries pending in-sync.- Returns:
true
if there is at least one shard pending in-sync, otherwise false
-
noopUpdate
Should be called for each no-op update operation to increment relevant statistics.- Parameters:
type
- the doc type of the update
-
maybeCheckIndex
public void maybeCheckIndex() -
getEngineOrNull
NOTE: returns null if engine is not yet started (e.g. recovery phase 1, copying over index files, is still running), or if engine is closed. -
startRecovery
public void startRecovery(RecoveryState recoveryState, PeerRecoveryTargetService recoveryTargetService, PeerRecoveryTargetService.RecoveryListener recoveryListener, RepositoriesService repositoriesService, BiConsumer<String,MappingMetadata> mappingUpdateConsumer, IndicesService indicesService) -
isRelocatedPrimary
public boolean isRelocatedPrimary()Returns whether the shard is a relocated primary, i.e. not in charge anymore of replicating changes (seeReplicationTracker
). -
addPeerRecoveryRetentionLease
public RetentionLease addPeerRecoveryRetentionLease(String nodeId, long globalCheckpoint, ActionListener<ReplicationResponse> listener) -
cloneLocalPeerRecoveryRetentionLease
public RetentionLease cloneLocalPeerRecoveryRetentionLease(String nodeId, ActionListener<ReplicationResponse> listener) -
removePeerRecoveryRetentionLease
public void removePeerRecoveryRetentionLease(String nodeId, ActionListener<ReplicationResponse> listener) -
getPeerRecoveryRetentionLeases
Returns a list of retention leases for peer recovery installed in this shard copy. -
useRetentionLeasesInPeerRecovery
public boolean useRetentionLeasesInPeerRecovery() -
buildIndexAnalyzer
-
acquirePrimaryOperationPermit
public void acquirePrimaryOperationPermit(ActionListener<org.elasticsearch.core.Releasable> onPermitAcquired, String executorOnDelay, Object debugInfo)Acquire a primary operation permit whenever the shard is ready for indexing. If a permit is directly available, the provided ActionListener will be called on the calling thread. During relocation hand-off, permit acquisition can be delayed. The provided ActionListener will then be called using the provided executor.- Parameters:
debugInfo
- an extra information that can be useful when tracing an unreleased permit. When assertions are enabled the tracing will capture the supplied object'sObject.toString()
value. Otherwise the object isn't used
-
acquirePrimaryOperationPermit
public void acquirePrimaryOperationPermit(ActionListener<org.elasticsearch.core.Releasable> onPermitAcquired, String executorOnDelay, Object debugInfo, boolean forceExecution) -
acquireAllPrimaryOperationsPermits
public void acquireAllPrimaryOperationsPermits(ActionListener<org.elasticsearch.core.Releasable> onPermitAcquired, org.elasticsearch.core.TimeValue timeout)Acquire all primary operation permits. Once all permits are acquired, the provided ActionListener is called. It is the responsibility of the caller to close theReleasable
. -
runUnderPrimaryPermit
public void runUnderPrimaryPermit(Runnable runnable, Consumer<Exception> onFailure, String executorOnDelay, Object debugInfo)Runs the specified runnable under a permit and otherwise calling back the specified failure callback. This method is really a convenience foracquirePrimaryOperationPermit(ActionListener, String, Object)
where the listener equates to try-with-resources closing the releasable after executing the runnable on successfully acquiring the permit, an otherwise calling back the failure callback.- Parameters:
runnable
- the runnable to execute under permitonFailure
- the callback on failureexecutorOnDelay
- the executor to execute the runnable on if permit acquisition is blockeddebugInfo
- debug info
-
acquireReplicaOperationPermit
public void acquireReplicaOperationPermit(long opPrimaryTerm, long globalCheckpoint, long maxSeqNoOfUpdatesOrDeletes, ActionListener<org.elasticsearch.core.Releasable> onPermitAcquired, String executorOnDelay, Object debugInfo)Acquire a replica operation permit whenever the shard is ready for indexing (seeacquirePrimaryOperationPermit(ActionListener, String, Object)
). If the given primary term is lower than then one inshardRouting
, theActionListener.onFailure(Exception)
method of the provided listener is invoked with anIllegalStateException
. If permit acquisition is delayed, the listener will be invoked on the executor with the specified name.- Parameters:
opPrimaryTerm
- the operation primary termglobalCheckpoint
- the global checkpoint associated with the requestmaxSeqNoOfUpdatesOrDeletes
- the max seq_no of updates (index operations overwrite Lucene) or deletes captured on the primary after this replication request was executed on it (seegetMaxSeqNoOfUpdatesOrDeletes()
onPermitAcquired
- the listener for permit acquisitionexecutorOnDelay
- the name of the executor to invoke the listener on if permit acquisition is delayeddebugInfo
- an extra information that can be useful when tracing an unreleased permit. When assertions are enabled the tracing will capture the supplied object'sObject.toString()
value. Otherwise the object isn't used
-
acquireAllReplicaOperationsPermits
public void acquireAllReplicaOperationsPermits(long opPrimaryTerm, long globalCheckpoint, long maxSeqNoOfUpdatesOrDeletes, ActionListener<org.elasticsearch.core.Releasable> onPermitAcquired, org.elasticsearch.core.TimeValue timeout)Acquire all replica operation permits whenever the shard is ready for indexing (seeacquireAllPrimaryOperationsPermits(ActionListener, TimeValue)
. If the given primary term is lower than then one inshardRouting
, theActionListener.onFailure(Exception)
method of the provided listener is invoked with anIllegalStateException
.- Parameters:
opPrimaryTerm
- the operation primary termglobalCheckpoint
- the global checkpoint associated with the requestmaxSeqNoOfUpdatesOrDeletes
- the max seq_no of updates (index operations overwrite Lucene) or deletes captured on the primary after this replication request was executed on it (seegetMaxSeqNoOfUpdatesOrDeletes()
onPermitAcquired
- the listener for permit acquisitiontimeout
- the maximum time to wait for the in-flight operations block
-
getActiveOperationsCount
public int getActiveOperationsCount()Obtain the active operation count, orOPERATIONS_BLOCKED
if all permits are held (even if there are outstanding operations in flight).- Returns:
- the active operation count, or
OPERATIONS_BLOCKED
when all permits are held.
-
getActiveOperations
- Returns:
- a list of describing each permit that wasn't released yet. The description consist of the debugInfo supplied when the permit was acquired plus a stack traces that was captured when the permit was request.
-
sync
Syncs the given location with the underlying storage unless already synced. This method might return immediately without actually fsyncing the location until the sync listener is called. Yet, unless there is already another thread fsyncing the transaction log the caller thread will be hijacked to run the fsync for all pending fsync operations. This method allows indexing threads to continue indexing without blocking on fsync calls. We ensure that there is only one thread blocking on the sync an all others can continue indexing. NOTE: if the syncListener throws an exception when it's processed the exception will only be logged. Users should make sure that the listener handles all exception cases internally. -
sync
- Throws:
IOException
-
isSyncNeeded
public boolean isSyncNeeded()Checks if the underlying storage sync is required. -
getTranslogDurability
Returns the current translog durability mode -
afterWriteOperation
public void afterWriteOperation()Schedules a flush or translog generation roll if needed but will not schedule more than one concurrently. The operation will be executed asynchronously on the flush thread pool. -
scheduledRefresh
public boolean scheduledRefresh()Executes a scheduled refresh if necessary.- Returns:
true
iff the engine got refreshed otherwisefalse
-
isSearchIdle
public final boolean isSearchIdle()Returns true if this shards is search idle -
hasRefreshPending
public final boolean hasRefreshPending()Returns true if this shard has some scheduled refresh that is pending because of search-idle. -
awaitShardSearchActive
Registers the given listener and invokes it once the shard is active again and all pending refresh translog location has been refreshed. If there is no pending refresh location registered the listener will be invoked immediately.- Parameters:
listener
- the listener to invoke once the pending refresh location is visible. The listener will be called withtrue
if the listener was registered to wait for a refresh.
-
addRefreshListener
Add a listener for refreshes.- Parameters:
location
- the location to listen forlistener
- for the refresh. Called with true if registering the listener ran it out of slots and forced a refresh. Called with false otherwise.
-
getMaxSeqNoOfUpdatesOrDeletes
public long getMaxSeqNoOfUpdatesOrDeletes()Returns the maximum sequence number of either update or delete operations have been processed in this shard or the sequence number fromadvanceMaxSeqNoOfUpdatesOrDeletes(long)
. An index request is considered as an update operation if it overwrites the existing documents in Lucene index with the same document id.The primary captures this value after executes a replication request, then transfers it to a replica before executing that replication request on a replica.
-
advanceMaxSeqNoOfUpdatesOrDeletes
public void advanceMaxSeqNoOfUpdatesOrDeletes(long seqNo)A replica calls this method to advance the max_seq_no_of_updates marker of its engine to at least the max_seq_no_of_updates value (piggybacked in a replication request) that it receives from its primary before executing that replication request. The receiving value is at least as high as the max_seq_no_of_updates on the primary was when any of the operations of that replication request were processed on it.A replica shard also calls this method to bootstrap the max_seq_no_of_updates marker with the value that it received from the primary in peer-recovery, before it replays remote translog operations from the primary. The receiving value is at least as high as the max_seq_no_of_updates on the primary was when any of these operations were processed on it.
These transfers guarantee that every index/delete operation when executing on a replica engine will observe this marker a value which is at least the value of the max_seq_no_of_updates marker on the primary after that operation was executed on the primary.
-
verifyShardBeforeIndexClosing
Performs the pre-closing checks on theIndexShard
.- Throws:
IllegalStateException
- if the sanity checks failed
-