Package org.elasticsearch.index.seqno
Class ReplicationTracker
java.lang.Object
org.elasticsearch.index.shard.AbstractIndexShardComponent
org.elasticsearch.index.seqno.ReplicationTracker
- All Implemented Interfaces:
LongSupplier
,IndexShardComponent
This class is responsible for tracking the replication group with its progress and safety markers (local and global checkpoints).
The global checkpoint is the highest sequence number for which all lower (or equal) sequence number have been processed
on all shards that are currently active. Since shards count as "active" when the master starts
them, and before this primary shard has been notified of this fact, we also include shards that have completed recovery. These shards
have received all old operations via the recovery mechanism and are kept up to date by the various replications actions. The set of
shards that are taken into account for the global checkpoint calculation are called the "in-sync shards".
The global checkpoint is maintained by the primary shard and is replicated to all the replicas (via GlobalCheckpointSyncAction
).
-
Nested Class Summary
Modifier and TypeClassDescriptionstatic class
static class
Represents the sequence number component of the primary context. -
Field Summary
Modifier and TypeFieldDescriptionstatic String
Source for peer recovery retention leases; seeaddPeerRecoveryRetentionLease(java.lang.String, long, org.elasticsearch.action.ActionListener<org.elasticsearch.action.support.replication.ReplicationResponse>)
.Fields inherited from class org.elasticsearch.index.shard.AbstractIndexShardComponent
indexSettings, logger, shardId
-
Constructor Summary
ConstructorDescriptionReplicationTracker(ShardId shardId, String allocationId, IndexSettings indexSettings, long operationPrimaryTerm, long globalCheckpoint, LongConsumer onGlobalCheckpointUpdated, LongSupplier currentTimeMillisSupplier, BiConsumer<RetentionLeases,ActionListener<ReplicationResponse>> onSyncRetentionLeases, Supplier<SafeCommitInfo> safeCommitInfoSupplier)
ReplicationTracker(ShardId shardId, String allocationId, IndexSettings indexSettings, long operationPrimaryTerm, long globalCheckpoint, LongConsumer onGlobalCheckpointUpdated, LongSupplier currentTimeMillisSupplier, BiConsumer<RetentionLeases,ActionListener<ReplicationResponse>> onSyncRetentionLeases, Supplier<SafeCommitInfo> safeCommitInfoSupplier, Consumer<ReplicationGroup> onReplicationGroupUpdated)
Initialize the global checkpoint service. -
Method Summary
Modifier and TypeMethodDescriptionvoid
Fails a relocation handoff attempt.void
activatePrimaryMode(long localCheckpoint)
Initializes the global checkpoint tracker in primary mode (seeprimaryMode
.void
activateWithPrimaryContext(ReplicationTracker.PrimaryContext primaryContext)
Activates the global checkpoint tracker in primary mode (seeprimaryMode
.addPeerRecoveryRetentionLease(String nodeId, long globalCheckpoint, ActionListener<ReplicationResponse> listener)
Retention leases for peer recovery have sourcePEER_RECOVERY_RETENTION_LEASE_SOURCE
, a lease ID containing the persistent node ID calculated bygetPeerRecoveryRetentionLeaseId(java.lang.String)
, and retain operations with sequence numbers strictly greater than the given global checkpoint.addRetentionLease(String id, long retainingSequenceNumber, String source, ActionListener<ReplicationResponse> listener)
Adds a new retention lease.boolean
cloneLocalPeerRecoveryRetentionLease(String nodeId, ActionListener<ReplicationResponse> listener)
void
Marks a relocation handoff attempt as successful.void
Create any required peer-recovery retention leases that do not currently exist because we just did a rolling upgrade from a version prior toVersion.V_7_4_0
that does not create peer-recovery retention leases.long
long
Returns the in-memory global checkpoint for the shard.com.carrotsearch.hppc.ObjectLongMap<String>
Get the local knowledge of the persisted global checkpoints for all in-sync allocation IDs.long
Returns the current operation primary term.static String
Id for a peer recovery retention lease for the given node.static String
getPeerRecoveryRetentionLeaseId(ShardRouting shardRouting)
Id for a peer recovery retention lease for the givenShardRouting
.Returns a list of peer recovery retention leases installed in this replication groupReturns the current replication group for the shard.Get all retention leases tracked on this shard.getRetentionLeases(boolean expireLeases)
If the expire leases parameter is false, gets all retention leases tracked on this shard and otherwise first calculates expiration of existing retention leases, and then gets all non-expired retention leases tracked on this shard.getTrackedLocalCheckpointForShard(String allocationId)
Returns the local checkpoint information tracked for a specific shard.boolean
void
initiateTracking(String allocationId)
Called when the recovery process for a shard has opened the engine on the target shard.boolean
Returns whether the replication tracker is in primary mode, i.e., whether the current shard is acting as primary from the point of view of replication.boolean
Returns whether the replication tracker has relocated away to another shard copy.loadRetentionLeases(Path path)
Loads the latest retention leases from their dedicated state file.void
markAllocationIdAsInSync(String allocationId, long localCheckpoint)
Marks the shard with the provided allocation ID as in-sync with the primary shard.boolean
Whether there are shards blocking global checkpoint advancement.void
persistRetentionLeases(Path path)
Persists the current retention leases to their dedicated state file.void
removePeerRecoveryRetentionLease(String nodeId, ActionListener<ReplicationResponse> listener)
void
removeRetentionLease(String id, ActionListener<ReplicationResponse> listener)
Removes an existing retention lease.void
Advance the peer-recovery retention leases for all assigned shard copies to discard history below the corresponding global checkpoint, and renew any leases that are approaching expiry.renewRetentionLease(String id, long retainingSequenceNumber, String source)
Renews an existing retention lease.void
setOperationPrimaryTerm(long operationPrimaryTerm)
Sets the current operation primary term.startRelocationHandoff(String targetAllocationId)
Initiates a relocation handoff and returns the corresponding primary context.void
updateFromMaster(long applyingClusterStateVersion, Set<String> inSyncAllocationIds, IndexShardRoutingTable routingTable)
Notifies the tracker of the current allocation IDs in the cluster state.void
updateGlobalCheckpointForShard(String allocationId, long globalCheckpoint)
Update the local knowledge of the persisted global checkpoint for the specified allocation ID.void
updateGlobalCheckpointOnReplica(long newGlobalCheckpoint, String reason)
Updates the global checkpoint on a replica shard after it has been updated by the primary.void
updateLocalCheckpoint(String allocationId, long localCheckpoint)
Notifies the service to update the local checkpoint for the shard with the provided allocation ID.void
updateRetentionLeasesOnReplica(RetentionLeases retentionLeases)
Updates retention leases on a replica.Methods inherited from class org.elasticsearch.index.shard.AbstractIndexShardComponent
indexSettings, shardId
-
Field Details
-
PEER_RECOVERY_RETENTION_LEASE_SOURCE
Source for peer recovery retention leases; seeaddPeerRecoveryRetentionLease(java.lang.String, long, org.elasticsearch.action.ActionListener<org.elasticsearch.action.support.replication.ReplicationResponse>)
.- See Also:
- Constant Field Values
-
-
Constructor Details
-
ReplicationTracker
public ReplicationTracker(ShardId shardId, String allocationId, IndexSettings indexSettings, long operationPrimaryTerm, long globalCheckpoint, LongConsumer onGlobalCheckpointUpdated, LongSupplier currentTimeMillisSupplier, BiConsumer<RetentionLeases,ActionListener<ReplicationResponse>> onSyncRetentionLeases, Supplier<SafeCommitInfo> safeCommitInfoSupplier) -
ReplicationTracker
public ReplicationTracker(ShardId shardId, String allocationId, IndexSettings indexSettings, long operationPrimaryTerm, long globalCheckpoint, LongConsumer onGlobalCheckpointUpdated, LongSupplier currentTimeMillisSupplier, BiConsumer<RetentionLeases,ActionListener<ReplicationResponse>> onSyncRetentionLeases, Supplier<SafeCommitInfo> safeCommitInfoSupplier, Consumer<ReplicationGroup> onReplicationGroupUpdated)Initialize the global checkpoint service. The specified global checkpoint should be set to the last known global checkpoint, orSequenceNumbers.UNASSIGNED_SEQ_NO
.- Parameters:
shardId
- the shard IDallocationId
- the allocation IDindexSettings
- the index settingsoperationPrimaryTerm
- the current primary termglobalCheckpoint
- the last known global checkpoint for this shard, orSequenceNumbers.UNASSIGNED_SEQ_NO
onSyncRetentionLeases
- a callback when a new retention lease is created or an existing retention lease expiresonReplicationGroupUpdated
- a callback when the replica group changes
-
-
Method Details
-
getRetentionLeases
Get all retention leases tracked on this shard.- Returns:
- the retention leases
-
getRetentionLeases
If the expire leases parameter is false, gets all retention leases tracked on this shard and otherwise first calculates expiration of existing retention leases, and then gets all non-expired retention leases tracked on this shard. Note that only the primary shard calculates which leases are expired, and if any have expired, syncs the retention leases to any replicas. If the expire leases parameter is true, this replication tracker must be in primary mode.- Returns:
- the non-expired retention leases
-
addRetentionLease
public RetentionLease addRetentionLease(String id, long retainingSequenceNumber, String source, ActionListener<ReplicationResponse> listener)Adds a new retention lease.- Parameters:
id
- the identifier of the retention leaseretainingSequenceNumber
- the retaining sequence numbersource
- the source of the retention leaselistener
- the callback when the retention lease is successfully added and synced to replicas- Returns:
- the new retention lease
- Throws:
RetentionLeaseAlreadyExistsException
- if the specified retention lease already exists
-
renewRetentionLease
Renews an existing retention lease.- Parameters:
id
- the identifier of the retention leaseretainingSequenceNumber
- the retaining sequence numbersource
- the source of the retention lease- Returns:
- the renewed retention lease
- Throws:
RetentionLeaseNotFoundException
- if the specified retention lease does not existRetentionLeaseInvalidRetainingSeqNoException
- if the new retaining sequence number is lower than the retaining sequence number of the current retention lease.
-
removeRetentionLease
Removes an existing retention lease.- Parameters:
id
- the identifier of the retention leaselistener
- the callback when the retention lease is successfully removed and synced to replicas
-
updateRetentionLeasesOnReplica
Updates retention leases on a replica.- Parameters:
retentionLeases
- the retention leases
-
loadRetentionLeases
Loads the latest retention leases from their dedicated state file.- Parameters:
path
- the path to the directory containing the state file- Returns:
- the retention leases
- Throws:
IOException
- if an I/O exception occurs reading the retention leases
-
persistRetentionLeases
Persists the current retention leases to their dedicated state file. If this version of the retention leases are already persisted then persistence is skipped.- Parameters:
path
- the path to the directory containing the state file- Throws:
WriteStateException
- if an exception occurs writing the state file
-
assertRetentionLeasesPersisted
- Throws:
IOException
-
addPeerRecoveryRetentionLease
public RetentionLease addPeerRecoveryRetentionLease(String nodeId, long globalCheckpoint, ActionListener<ReplicationResponse> listener)Retention leases for peer recovery have sourcePEER_RECOVERY_RETENTION_LEASE_SOURCE
, a lease ID containing the persistent node ID calculated bygetPeerRecoveryRetentionLeaseId(java.lang.String)
, and retain operations with sequence numbers strictly greater than the given global checkpoint. -
cloneLocalPeerRecoveryRetentionLease
public RetentionLease cloneLocalPeerRecoveryRetentionLease(String nodeId, ActionListener<ReplicationResponse> listener) -
removePeerRecoveryRetentionLease
public void removePeerRecoveryRetentionLease(String nodeId, ActionListener<ReplicationResponse> listener) -
getPeerRecoveryRetentionLeaseId
Id for a peer recovery retention lease for the given node. SeeaddPeerRecoveryRetentionLease(java.lang.String, long, org.elasticsearch.action.ActionListener<org.elasticsearch.action.support.replication.ReplicationResponse>)
. -
getPeerRecoveryRetentionLeaseId
Id for a peer recovery retention lease for the givenShardRouting
. SeeaddPeerRecoveryRetentionLease(java.lang.String, long, org.elasticsearch.action.ActionListener<org.elasticsearch.action.support.replication.ReplicationResponse>)
. -
getPeerRecoveryRetentionLeases
Returns a list of peer recovery retention leases installed in this replication group -
renewPeerRecoveryRetentionLeases
public void renewPeerRecoveryRetentionLeases()Advance the peer-recovery retention leases for all assigned shard copies to discard history below the corresponding global checkpoint, and renew any leases that are approaching expiry. -
getInSyncGlobalCheckpoints
Get the local knowledge of the persisted global checkpoints for all in-sync allocation IDs.- Returns:
- a map from allocation ID to the local knowledge of the persisted global checkpoint for that allocation ID
-
isPrimaryMode
public boolean isPrimaryMode()Returns whether the replication tracker is in primary mode, i.e., whether the current shard is acting as primary from the point of view of replication. -
getOperationPrimaryTerm
public long getOperationPrimaryTerm()Returns the current operation primary term.- Returns:
- the primary term
-
setOperationPrimaryTerm
public void setOperationPrimaryTerm(long operationPrimaryTerm)Sets the current operation primary term. This method should be invoked only when no other operations are possible on the shard. That is, either from the constructor ofIndexShard
or while holding all permits on theIndexShard
instance.- Parameters:
operationPrimaryTerm
- the new operation primary term
-
isRelocated
public boolean isRelocated()Returns whether the replication tracker has relocated away to another shard copy. -
getReplicationGroup
Returns the current replication group for the shard.- Returns:
- the replication group
-
getGlobalCheckpoint
public long getGlobalCheckpoint()Returns the in-memory global checkpoint for the shard.- Returns:
- the global checkpoint
-
getAsLong
public long getAsLong()- Specified by:
getAsLong
in interfaceLongSupplier
-
updateGlobalCheckpointOnReplica
Updates the global checkpoint on a replica shard after it has been updated by the primary.- Parameters:
newGlobalCheckpoint
- the new global checkpointreason
- the reason the global checkpoint was updated
-
updateGlobalCheckpointForShard
Update the local knowledge of the persisted global checkpoint for the specified allocation ID.- Parameters:
allocationId
- the allocation ID to update the global checkpoint forglobalCheckpoint
- the global checkpoint
-
activatePrimaryMode
public void activatePrimaryMode(long localCheckpoint)Initializes the global checkpoint tracker in primary mode (seeprimaryMode
. Called on primary activation or promotion. -
updateFromMaster
public void updateFromMaster(long applyingClusterStateVersion, Set<String> inSyncAllocationIds, IndexShardRoutingTable routingTable)Notifies the tracker of the current allocation IDs in the cluster state.- Parameters:
applyingClusterStateVersion
- the cluster state version being applied when updating the allocation IDs from the masterinSyncAllocationIds
- the allocation IDs of the currently in-sync shard copiesroutingTable
- the shard routing table
-
initiateTracking
Called when the recovery process for a shard has opened the engine on the target shard. Ensures that the right data structures have been set up locally to track local checkpoint information for the shard and that the shard is added to the replication group.- Parameters:
allocationId
- the allocation ID of the shard for which recovery was initiated
-
markAllocationIdAsInSync
public void markAllocationIdAsInSync(String allocationId, long localCheckpoint) throws InterruptedExceptionMarks the shard with the provided allocation ID as in-sync with the primary shard. This method will block until the local checkpoint on the specified shard advances above the current global checkpoint.- Parameters:
allocationId
- the allocation ID of the shard to mark as in-synclocalCheckpoint
- the current local checkpoint on the shard- Throws:
InterruptedException
-
updateLocalCheckpoint
Notifies the service to update the local checkpoint for the shard with the provided allocation ID. If the checkpoint is lower than the currently known one, this is a no-op. If the allocation ID is not tracked, it is ignored.- Parameters:
allocationId
- the allocation ID of the shard to update the local checkpoint forlocalCheckpoint
- the local checkpoint for the shard
-
startRelocationHandoff
Initiates a relocation handoff and returns the corresponding primary context. -
abortRelocationHandoff
public void abortRelocationHandoff()Fails a relocation handoff attempt. -
completeRelocationHandoff
public void completeRelocationHandoff()Marks a relocation handoff attempt as successful. Moves the tracker into replica mode. -
activateWithPrimaryContext
Activates the global checkpoint tracker in primary mode (seeprimaryMode
. Called on primary relocation target during primary relocation handoff.- Parameters:
primaryContext
- the primary context used to initialize the state
-
hasAllPeerRecoveryRetentionLeases
public boolean hasAllPeerRecoveryRetentionLeases() -
createMissingPeerRecoveryRetentionLeases
Create any required peer-recovery retention leases that do not currently exist because we just did a rolling upgrade from a version prior toVersion.V_7_4_0
that does not create peer-recovery retention leases. -
pendingInSync
public boolean pendingInSync()Whether there are shards blocking global checkpoint advancement. -
getTrackedLocalCheckpointForShard
Returns the local checkpoint information tracked for a specific shard. Used by tests.
-