public class BlockWriter extends Object
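According to the Uniform Split technique, the writing combines three steps per block, and it is repeated for all the field blocks:
1. Select the term with the shortest minimal distinguishing prefix (MDP) in the neighborhood of the target block size (+- delta size).
2. The selected term becomes the first term of the next block, and its MDP is the block key.
3. The current block is written to the block file, and its block key is added to the index dictionary.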
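The choice of a block boundary can be sketched in plain Java without any Lucene types. This is an illustration under stated assumptions, not Lucene's implementation: terms are sorted `String`s rather than `BytesRef`s, the MDP of a term is assumed to be its shortest prefix that differs from the previous term in sorted order, and the names `UniformSplitSketch`, `mdpLength`, and `chooseSplitIndex` are hypothetical.

```java
import java.util.List;

// Hypothetical stand-in for the split selection; not part of Lucene.
final class UniformSplitSketch {

  /** Length of the shortest prefix of term that distinguishes it from previous (assumed MDP definition). */
  static int mdpLength(String previous, String term) {
    int limit = Math.min(previous.length(), term.length());
    int common = 0;
    while (common < limit && previous.charAt(common) == term.charAt(common)) {
      common++;
    }
    // One character past the shared prefix, capped at the term length.
    return Math.min(common + 1, term.length());
  }

  /**
   * Returns the index of the term that should start the next block: the term with the
   * shortest MDP among the indices in [target - delta, target + delta].
   * Returns -1 if that window contains no candidate.
   */
  static int chooseSplitIndex(List<String> sortedTerms, int target, int delta) {
    int from = Math.max(1, target - delta); // index 0 has no previous term
    int to = Math.min(sortedTerms.size() - 1, target + delta);
    int best = -1;
    int bestLength = Integer.MAX_VALUE;
    for (int i = from; i <= to; i++) {
      int length = mdpLength(sortedTerms.get(i - 1), sortedTerms.get(i));
      if (length < bestLength) {
        bestLength = length;
        best = i;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    List<String> terms = List.of("apple", "apply", "apricot", "banana", "band", "bandit");
    int split = chooseSplitIndex(terms, 3, 1);
    String first = terms.get(split);
    String blockKey = first.substring(0, mdpLength(terms.get(split - 1), first));
    // Prints: split at 3, block key b
    System.out.println("split at " + split + ", block key " + blockKey);
  }
}
```

In this example the window covers indices 2 to 4, and "banana" wins because a single character is enough to distinguish it from "apricot". A shorter block key keeps the index dictionary smaller, which is the motivation for searching around the target size rather than splitting at a fixed line count.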
This stateful BlockWriter is called repeatedly to add all the BlockLine terms of a field. Then finishLastBlock(IndexDictionary.Builder) is called, after which this BlockWriter can be reused to add the terms of another field.
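The call pattern (add every line of a field, finish the last block, then reuse the writer) can be illustrated with a toy stateful writer. This sketch is deliberately simplified and is not the Lucene class: `ToyBlockWriter` is a hypothetical name, terms are `String`s, blocks are kept in memory, and a block is closed at a fixed line count instead of at an MDP-based split point.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified writer that only illustrates the add / finish / reuse lifecycle.
final class ToyBlockWriter {
  private final int maxLinesPerBlock;
  private final List<String> pendingLines = new ArrayList<>();
  final List<List<String>> writtenBlocks = new ArrayList<>();

  ToyBlockWriter(int maxLinesPerBlock) {
    this.maxLinesPerBlock = maxLinesPerBlock;
  }

  /** Buffers a line; writes the current block first if it is already full. */
  void addLine(String term) {
    if (pendingLines.size() == maxLinesPerBlock) {
      writePendingBlock();
    }
    pendingLines.add(term);
  }

  /** Writes the remaining lines as the last block of the field and resets the state. */
  void finishLastBlock() {
    if (!pendingLines.isEmpty()) {
      writePendingBlock();
    }
  }

  private void writePendingBlock() {
    writtenBlocks.add(new ArrayList<>(pendingLines));
    pendingLines.clear();
  }

  public static void main(String[] args) {
    ToyBlockWriter writer = new ToyBlockWriter(2);
    for (String term : List.of("alpha", "beta", "gamma")) { // first field
      writer.addLine(term);
    }
    writer.finishLastBlock();
    for (String term : List.of("delta", "epsilon")) { // same writer, second field
      writer.addLine(term);
    }
    writer.finishLastBlock();
    // Prints: [[alpha, beta], [gamma], [delta, epsilon]]
    System.out.println(writer.writtenBlocks);
  }
}
```

Because `finishLastBlock` leaves no pending lines behind, the second field starts with a clean state and its terms never end up in a block of the first field.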
Modifier and Type | Field and Description |
---|---|
protected BlockEncoder | blockEncoder |
protected BlockHeader.Serializer | blockHeaderWriter |
protected List<BlockLine> | blockLines |
protected ByteBuffersDataOutput | blockLinesWriteBuffer |
protected BlockLine.Serializer | blockLineWriter |
protected IndexOutput | blockOutput |
protected ByteBuffersDataOutput | blockWriteBuffer |
protected int | deltaNumLines |
protected FieldMetadata | fieldMetadata |
protected BytesRef | lastTerm |
protected BlockHeader | reusableBlockHeader |
protected BytesRef | scratchBytesRef |
protected int | targetNumBlockLines |
protected DeltaBaseTermStateSerializer | termStateSerializer |
protected ByteBuffersDataOutput | termStatesWriteBuffer |
Modifier | Constructor and Description |
---|---|
protected | BlockWriter(IndexOutput blockOutput, int targetNumBlockLines, int deltaNumLines, BlockEncoder blockEncoder) |
Modifier and Type | Method | Description |
---|---|---|
protected void | addBlockKey(List<BlockLine> blockLines, IndexDictionary.Builder dictionaryBuilder) | Adds a new block key with its corresponding block file pointer to the IndexDictionary.Builder. |
protected void | addLine(BytesRef term, BlockTermState blockTermState, IndexDictionary.Builder dictionaryBuilder) | Adds a new BlockLine term for the current field. |
protected BlockHeader.Serializer | createBlockHeaderSerializer() | |
protected BlockLine.Serializer | createBlockLineSerializer() | |
protected DeltaBaseTermStateSerializer | createDeltaBaseTermStateSerializer() | |
protected void | finishLastBlock(IndexDictionary.Builder dictionaryBuilder) | Called when there are no more terms for the field. |
protected void | splitAndWriteBlock(IndexDictionary.Builder dictionaryBuilder) | Defines the new block start according to targetNumBlockLines and deltaNumLines. |
protected void | updateFieldMetadata(long blockStartFP) | Updates the field metadata after all lines were written for the block. |
protected void | writeBlock(List<BlockLine> blockLines, IndexDictionary.Builder dictionaryBuilder) | Writes a block and adds its block key to the dictionary builder. |
protected void | writeBlockLine(boolean isIncrementalEncodingSeed, BlockLine line, BlockLine previousLine) | |
protected final int targetNumBlockLines
protected final int deltaNumLines
protected final IndexOutput blockOutput
protected final ByteBuffersDataOutput blockLinesWriteBuffer
protected final ByteBuffersDataOutput termStatesWriteBuffer
protected final BlockHeader.Serializer blockHeaderWriter
protected final BlockLine.Serializer blockLineWriter
protected final DeltaBaseTermStateSerializer termStateSerializer
protected final BlockEncoder blockEncoder
protected final ByteBuffersDataOutput blockWriteBuffer
protected FieldMetadata fieldMetadata
protected BytesRef lastTerm
protected final BlockHeader reusableBlockHeader
protected BytesRef scratchBytesRef
protected BlockWriter(IndexOutput blockOutput, int targetNumBlockLines, int deltaNumLines, BlockEncoder blockEncoder)
protected BlockHeader.Serializer createBlockHeaderSerializer()
protected BlockLine.Serializer createBlockLineSerializer()
protected DeltaBaseTermStateSerializer createDeltaBaseTermStateSerializer()
protected void addLine(BytesRef term, BlockTermState blockTermState, IndexDictionary.Builder dictionaryBuilder) throws IOException
Adds a new BlockLine term for the current field.
This method determines whether the new term is part of the current block or of the next block. In the latter case, a new block is started (including one or more of the most recently added lines), the current block is written to the block file, and the current block key is added to the IndexDictionary.Builder.
Parameters:
term - The block line term. The BytesRef instance is used directly; the caller is responsible for making a deep copy if needed. This is required because we keep a list of block lines until we decide to write the current block, and each line must have a different term instance.
blockTermState - Block line details.
dictionaryBuilder - The builder to which the block keys are added.
Throws:
IOException
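The requirement that each buffered line keeps its own term instance can be illustrated with a small sketch. It is not Lucene code: a reused `StringBuilder` stands in for a caller-owned `BytesRef` that is overwritten for every term, and `TermCopySketch` and `buffer` are hypothetical names. With Lucene's `BytesRef`, `BytesRef.deepCopyOf` is the usual way to make such a copy.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of why buffered lines must not share one mutable term instance.
final class TermCopySketch {

  /** Buffers the terms as pending lines, either sharing the caller's mutable buffer or copying it. */
  static List<CharSequence> buffer(List<String> terms, boolean deepCopy) {
    StringBuilder reused = new StringBuilder(); // caller-owned, overwritten for every term
    List<CharSequence> pendingLines = new ArrayList<>();
    for (String term : terms) {
      reused.setLength(0);
      reused.append(term);
      // toString() makes an independent copy; storing 'reused' itself shares one instance.
      CharSequence line = deepCopy ? reused.toString() : reused;
      pendingLines.add(line);
    }
    return pendingLines;
  }

  public static void main(String[] args) {
    List<String> terms = List.of("cat", "dog");
    // Without a copy, every pending line sees the last term written into the shared buffer.
    System.out.println(buffer(terms, false).get(0)); // prints dog
    // With a copy, each pending line keeps its own term.
    System.out.println(buffer(terms, true).get(0)); // prints cat
  }
}
```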
protected void finishLastBlock(IndexDictionary.Builder dictionaryBuilder) throws IOException
Called when there are no more terms for the field. Writes the remaining lines added with addLine(BytesRef, BlockTermState, IndexDictionary.Builder) as the last block of the field and resets this BlockWriter state. This BlockWriter can then be used for another field.
Throws:
IOException
protected void splitAndWriteBlock(IndexDictionary.Builder dictionaryBuilder) throws IOException
Defines the new block start according to targetNumBlockLines and deltaNumLines.
The new block is started (including one or more of the most recently added lines), the current block is written to the block file, and the current block key is added to the IndexDictionary.Builder.
Throws:
IOException
protected void writeBlock(List<BlockLine> blockLines, IndexDictionary.Builder dictionaryBuilder) throws IOException
Writes a block and adds its block key to the dictionary builder.
Throws:
IOException
protected void updateFieldMetadata(long blockStartFP)
Updates the field metadata after all lines were written for the block.
protected void writeBlockLine(boolean isIncrementalEncodingSeed, BlockLine line, BlockLine previousLine) throws IOException
Throws:
IOException
protected void addBlockKey(List<BlockLine> blockLines, IndexDictionary.Builder dictionaryBuilder) throws IOException
Adds a new block key with its corresponding block file pointer to the IndexDictionary.Builder.
The block key is the MDP (see TermBytes) of the block's first term.
Throws:
IOException
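The pairing of a block key with a block file pointer can be pictured as a sorted map. The sketch below is a hypothetical model, not the Lucene `IndexDictionary`: `ToyIndexDictionary`, `add`, and `seekBlock` are illustrative names, keys are `String`s, and the lookup rule (take the greatest block key that is less than or equal to the searched term) is an assumption about how such a dictionary is consulted.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of a block key -> block file pointer dictionary.
final class ToyIndexDictionary {
  private final TreeMap<String, Long> blockKeyToFilePointer = new TreeMap<>();

  /** Registers the key of a block together with the file pointer where the block starts. */
  void add(String blockKey, long blockFilePointer) {
    blockKeyToFilePointer.put(blockKey, blockFilePointer);
  }

  /**
   * Returns the file pointer of the block that may contain the term (assumed rule: greatest
   * block key less than or equal to the term), or -1 if the term sorts before every key.
   */
  long seekBlock(String term) {
    Map.Entry<String, Long> entry = blockKeyToFilePointer.floorEntry(term);
    return entry == null ? -1L : entry.getValue();
  }

  public static void main(String[] args) {
    ToyIndexDictionary dictionary = new ToyIndexDictionary();
    dictionary.add("a", 0L);
    dictionary.add("b", 128L);
    dictionary.add("ca", 300L);
    System.out.println(dictionary.seekBlock("banana")); // prints 128
    System.out.println(dictionary.seekBlock("cab")); // prints 300
  }
}
```

Since only the MDP of each block's first term is stored, the map holds one short key per block rather than one entry per term.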
Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.