BlockWriter (Lucene 8.9.0 API)

java.lang.Object
- org.apache.lucene.codecs.uniformsplit.BlockWriter

Direct Known Subclasses:

STBlockWriter
```
public class BlockWriter
extends Object
```
Writes blocks in the block file.
According to the Uniform Split technique, writing combines three steps per block, repeated for all the blocks of a field:
1. Select the term with the shortest minimal distinguishing prefix (MDP) in the neighborhood of the target block size (± delta size).
2. The selected term becomes the first term of the next block, and its MDP is the next block key.
3. The current block is written to the block file, and its block key is added to the index dictionary.
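The selection in steps 1 and 2 can be sketched on plain strings. This is a minimal illustration, not Lucene's code: the class `UniformSplitSketch` and its methods are hypothetical, ties are resolved in favor of the first candidate (an assumption), and the MDP of a term is assumed to be the shortest prefix that distinguishes it from the term immediately before it in sorted order.

```java
import java.util.List;

/** Hypothetical helper, not part of Lucene: MDP-based split selection on sorted strings. */
final class UniformSplitSketch {

  /** Length of the shortest prefix of {@code term} that differs from {@code previous}. */
  static int mdpLength(String previous, String term) {
    int common = 0;
    int max = Math.min(previous.length(), term.length());
    while (common < max && previous.charAt(common) == term.charAt(common)) {
      common++;
    }
    // One character past the common prefix is enough to tell the two terms apart.
    return Math.min(common + 1, term.length());
  }

  /**
   * Step 1: index of the term with the shortest MDP inside
   * [target - delta, target + delta], or -1 if that window is empty.
   */
  static int selectSplitIndex(List<String> sortedTerms, int target, int delta) {
    int from = Math.max(1, target - delta);
    int to = Math.min(sortedTerms.size() - 1, target + delta);
    int best = -1;
    int bestLength = Integer.MAX_VALUE;
    for (int i = from; i <= to; i++) {
      int length = mdpLength(sortedTerms.get(i - 1), sortedTerms.get(i));
      if (length < bestLength) { // strict comparison: the first shortest MDP wins
        bestLength = length;
        best = i;
      }
    }
    return best;
  }

  /** Step 2: the MDP of the selected term is the key of the next block. */
  static String blockKey(List<String> sortedTerms, int splitIndex) {
    String term = sortedTerms.get(splitIndex);
    return term.substring(0, mdpLength(sortedTerms.get(splitIndex - 1), term));
  }

  public static void main(String[] args) {
    List<String> terms =
        List.of("apple", "apply", "apricot", "banana", "band", "bandit", "cat");
    int split = selectSplitIndex(terms, 3, 1);
    System.out.println(split);                  // prints 3 ("banana" starts the next block)
    System.out.println(blockKey(terms, split)); // prints b
  }
}
```

Choosing the shortest MDP near the target size keeps block keys short, which in turn keeps the index dictionary small.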
This stateful BlockWriter is called repeatedly to add all the BlockLine terms of a field. Once all terms have been added, finishLastBlock(org.apache.lucene.codecs.uniformsplit.IndexDictionary.Builder) is called, after which this BlockWriter can be reused to add the terms of another field.
WARNING: This API is experimental and might change in incompatible ways in the next release.

Field Summary

Fields
Modifier and Type	Field and Description
`protected BlockEncoder`	`blockEncoder`
`protected BlockHeader.Serializer`	`blockHeaderWriter`
`protected List<BlockLine>`	`blockLines`
`protected ByteBuffersDataOutput`	`blockLinesWriteBuffer`
`protected BlockLine.Serializer`	`blockLineWriter`
`protected IndexOutput`	`blockOutput`
`protected ByteBuffersDataOutput`	`blockWriteBuffer`
`protected int`	`deltaNumLines`
`protected FieldMetadata`	`fieldMetadata`
`protected BytesRef`	`lastTerm`
`protected BlockHeader`	`reusableBlockHeader`
`protected BytesRef`	`scratchBytesRef`
`protected int`	`targetNumBlockLines`
`protected DeltaBaseTermStateSerializer`	`termStateSerializer`
`protected ByteBuffersDataOutput`	`termStatesWriteBuffer`

Constructor Summary

Constructors
Modifier	Constructor and Description
`protected`	`BlockWriter(IndexOutput blockOutput, int targetNumBlockLines, int deltaNumLines, BlockEncoder blockEncoder)`

Method Summary

Modifier and Type	Method and Description
`protected void`	`addBlockKey(List<BlockLine> blockLines, IndexDictionary.Builder dictionaryBuilder)` Adds a new block key with its corresponding block file pointer to the `IndexDictionary.Builder`.
`protected void`	`addLine(BytesRef term, BlockTermState blockTermState, IndexDictionary.Builder dictionaryBuilder)` Adds a new `BlockLine` term for the current field.
`protected BlockHeader.Serializer`	`createBlockHeaderSerializer()`
`protected BlockLine.Serializer`	`createBlockLineSerializer()`
`protected DeltaBaseTermStateSerializer`	`createDeltaBaseTermStateSerializer()`
`protected void`	`finishLastBlock(IndexDictionary.Builder dictionaryBuilder)` This method is called when there are no more terms for the field.
`protected void`	`splitAndWriteBlock(IndexDictionary.Builder dictionaryBuilder)` Defines the new block start according to `targetNumBlockLines` and `deltaNumLines`.
`protected void`	`updateFieldMetadata(long blockStartFP)` Updates the field metadata after all lines have been written for the block.
`protected void`	`writeBlock(List<BlockLine> blockLines, IndexDictionary.Builder dictionaryBuilder)` Writes a block and adds its block key to the dictionary builder.
`protected void`	`writeBlockLine(boolean isIncrementalEncodingSeed, BlockLine line, BlockLine previousLine)`

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail

targetNumBlockLines
```
protected final int targetNumBlockLines
```

deltaNumLines
```
protected final int deltaNumLines
```

blockLines
```
protected final List<BlockLine> blockLines
```

blockOutput
```
protected final IndexOutput blockOutput
```

blockLinesWriteBuffer
```
protected final ByteBuffersDataOutput blockLinesWriteBuffer
```

termStatesWriteBuffer
```
protected final ByteBuffersDataOutput termStatesWriteBuffer
```

blockHeaderWriter
```
protected final BlockHeader.Serializer blockHeaderWriter
```

blockLineWriter
```
protected final BlockLine.Serializer blockLineWriter
```

termStateSerializer
```
protected final DeltaBaseTermStateSerializer termStateSerializer
```

blockEncoder
```
protected final BlockEncoder blockEncoder
```

blockWriteBuffer
```
protected final ByteBuffersDataOutput blockWriteBuffer
```

fieldMetadata
```
protected FieldMetadata fieldMetadata
```

lastTerm
```
protected BytesRef lastTerm
```

reusableBlockHeader
```
protected final BlockHeader reusableBlockHeader
```

scratchBytesRef
```
protected BytesRef scratchBytesRef
```

Constructor Detail

BlockWriter

```
protected BlockWriter(IndexOutput blockOutput,
                      int targetNumBlockLines,
                      int deltaNumLines,
                      BlockEncoder blockEncoder)
```

Method Detail
- createBlockHeaderSerializer
```
protected BlockHeader.Serializer createBlockHeaderSerializer()
```
- createBlockLineSerializer
```
protected BlockLine.Serializer createBlockLineSerializer()
```
- createDeltaBaseTermStateSerializer
```
protected DeltaBaseTermStateSerializer createDeltaBaseTermStateSerializer()
```
- addLine
```
protected void addLine(BytesRef term,
                       BlockTermState blockTermState,
                       IndexDictionary.Builder dictionaryBuilder)
                throws IOException
```
  Adds a new BlockLine term for the current field.
  This method determines whether the new term belongs to the current block or to the next block. In the latter case, a new block is started (including one or more of the most recently added lines), the current block is written to the block file, and the current block key is added to the IndexDictionary.Builder.
  
  Parameters:
  
  term - The block line term. The BytesRef instance is used directly; the caller is responsible for making a deep copy if needed. This is required because a list of block lines is kept until the current block is written, and each line must have a distinct term instance.
  
  blockTermState - Block line details.
  
  dictionaryBuilder - The builder to which the block keys are added.
  
  Throws:
  
  IOException
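
  The deep-copy requirement on `term` can be illustrated without Lucene. In the sketch below, `MutableTerm` is a hypothetical stand-in for a mutable holder such as BytesRef, and `buffer` mimics a caller that reuses one instance for every term; none of these names come from the Lucene API. With Lucene itself, `BytesRef.deepCopyOf` is the usual way to make such a copy.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of why each buffered line needs its own term instance. */
final class TermCopySketch {

  /** Stand-in for a mutable byte holder that an iterator overwrites on each step. */
  static final class MutableTerm {
    byte[] bytes;

    MutableTerm(byte[] bytes) {
      this.bytes = bytes;
    }

    MutableTerm deepCopy() {
      return new MutableTerm(Arrays.copyOf(bytes, bytes.length));
    }

    String utf8() {
      return new String(bytes, StandardCharsets.UTF_8);
    }
  }

  /** Buffers the terms the way a block writer would, with or without copying. */
  static List<String> buffer(String[] terms, boolean copy) {
    List<MutableTerm> buffered = new ArrayList<>();
    MutableTerm reused = new MutableTerm(new byte[0]);
    for (String term : terms) {
      // The caller overwrites its single instance for every new term.
      reused.bytes = term.getBytes(StandardCharsets.UTF_8);
      buffered.add(copy ? reused.deepCopy() : reused);
    }
    List<String> result = new ArrayList<>();
    for (MutableTerm line : buffered) {
      result.add(line.utf8());
    }
    return result;
  }

  public static void main(String[] args) {
    String[] terms = {"a", "b", "c"};
    System.out.println(buffer(terms, false)); // prints [c, c, c]: every line shares one instance
    System.out.println(buffer(terms, true));  // prints [a, b, c]
  }
}
```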
- finishLastBlock
```
protected void finishLastBlock(IndexDictionary.Builder dictionaryBuilder)
                        throws IOException
```
  This method is called when there are no more terms for the field. It writes the remaining lines added with addLine(org.apache.lucene.util.BytesRef, org.apache.lucene.codecs.BlockTermState, org.apache.lucene.codecs.uniformsplit.IndexDictionary.Builder) as the last block of the field and resets this BlockWriter state. This BlockWriter can then be used for another field.
  
  Throws:
  
  IOException
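
  The add/split/finish lifecycle can be sketched with an in-memory buffer. This is a simplified model, not the Lucene implementation: the class `BlockBufferSketch` is hypothetical, terms are plain strings, blocks are collected in a list instead of being written to a file, and the split trigger (buffer size reaching target + delta) and the constraint 0 < delta < target are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory model of a stateful, reusable block writer. */
final class BlockBufferSketch {
  private final int target;
  private final int delta;
  private final List<String> buffered = new ArrayList<>();
  final List<List<String>> writtenBlocks = new ArrayList<>();

  BlockBufferSketch(int target, int delta) {
    if (delta < 1 || delta >= target) {
      throw new IllegalArgumentException("this sketch requires 0 < delta < target");
    }
    this.target = target;
    this.delta = delta;
  }

  /** Buffers a term; splits once the buffer reaches target + delta lines (assumed trigger). */
  void addLine(String term) {
    buffered.add(term);
    if (buffered.size() >= target + delta) {
      splitAndWriteBlock();
    }
  }

  /** Writes whatever is left as the last block and resets the state for another field. */
  void finishLastBlock() {
    if (!buffered.isEmpty()) {
      writtenBlocks.add(new ArrayList<>(buffered));
      buffered.clear();
    }
  }

  int bufferedCount() {
    return buffered.size();
  }

  private void splitAndWriteBlock() {
    int split = selectSplitIndex();
    // Lines before the split form the current block; the rest start the next block.
    writtenBlocks.add(new ArrayList<>(buffered.subList(0, split)));
    buffered.subList(0, split).clear();
  }

  /** Picks the line sharing the shortest prefix with its predecessor, i.e. the shortest MDP. */
  private int selectSplitIndex() {
    int from = target - delta;
    int best = from;
    int bestCommon = Integer.MAX_VALUE;
    for (int i = from; i < buffered.size(); i++) {
      int common = commonPrefixLength(buffered.get(i - 1), buffered.get(i));
      if (common < bestCommon) {
        bestCommon = common;
        best = i;
      }
    }
    return best;
  }

  private static int commonPrefixLength(String a, String b) {
    int n = Math.min(a.length(), b.length());
    int i = 0;
    while (i < n && a.charAt(i) == b.charAt(i)) {
      i++;
    }
    return i;
  }
}
```

  Feeding the sorted terms apple, apply, apricot, banana, band, bandit, cat into `new BlockBufferSketch(3, 1)` and then calling `finishLastBlock()` yields the blocks [apple, apply, apricot], [banana, band, bandit] and [cat], and leaves the buffer empty for the next field.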
- splitAndWriteBlock
```
protected void splitAndWriteBlock(IndexDictionary.Builder dictionaryBuilder)
                           throws IOException
```
  Defines the new block start according to targetNumBlockLines and deltaNumLines. The new block is started (including one or more of the most recently added lines), the current block is written to the block file, and the current block key is added to the IndexDictionary.Builder.
  
  Throws:
  
  IOException
- writeBlock
```
protected void writeBlock(List<BlockLine> blockLines,
                          IndexDictionary.Builder dictionaryBuilder)
                   throws IOException
```
  Writes a block and adds its block key to the dictionary builder.
  
  Throws:
  
  IOException
- updateFieldMetadata
```
protected void updateFieldMetadata(long blockStartFP)
```
  Updates the field metadata after all lines have been written for the block.
- writeBlockLine
```
protected void writeBlockLine(boolean isIncrementalEncodingSeed,
                              BlockLine line,
                              BlockLine previousLine)
                       throws IOException
```
  Throws:
  
  IOException
- addBlockKey
```
protected void addBlockKey(List<BlockLine> blockLines,
                           IndexDictionary.Builder dictionaryBuilder)
                    throws IOException
```
  Adds a new block key with its corresponding block file pointer to the IndexDictionary.Builder. The block key is the MDP (see TermBytes) of the block's first term.
  
  Throws:
  
  IOException
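
  To show why a block key paired with a file pointer is enough to locate a block, here is a minimal sketch that uses a sorted map as the dictionary. The class `BlockKeyDictionarySketch` and its methods are hypothetical, and the sorted map is chosen purely for illustration; it is not how Lucene's IndexDictionary is implemented. String ordering also stands in for byte ordering, which agrees for the ASCII terms used here.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical dictionary mapping block keys to block start file pointers. */
final class BlockKeyDictionarySketch {
  private final TreeMap<String, Long> blockKeyToFilePointer = new TreeMap<>();

  /** Registers the key (MDP of the block's first term) with the block start position. */
  void addBlockKey(String blockKey, long blockStartFP) {
    blockKeyToFilePointer.put(blockKey, blockStartFP);
  }

  /**
   * Returns the start of the only block that can contain {@code term}: the block
   * with the greatest key that is less than or equal to the term, or -1 if the
   * term sorts before every block key.
   */
  long seekBlock(String term) {
    Map.Entry<String, Long> entry = blockKeyToFilePointer.floorEntry(term);
    return entry == null ? -1L : entry.getValue();
  }

  public static void main(String[] args) {
    BlockKeyDictionarySketch dictionary = new BlockKeyDictionarySketch();
    dictionary.addBlockKey("a", 0L);   // block starting with "apple"
    dictionary.addBlockKey("b", 128L); // block starting with "banana"
    dictionary.addBlockKey("c", 256L); // block starting with "cat"
    System.out.println(dictionary.seekBlock("band")); // prints 128
    System.out.println(dictionary.seekBlock("Zoo"));  // prints -1
  }
}
```

  The floor lookup works because a block key is a prefix of the block's first term and sorts after every term of the preceding block, so each term falls between its own block key and the next one.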
