public final class FST<T> extends Object implements Accountable
The format is similar to what's used by Morfologik (https://github.com/morfologik/morfologik-stemming).
See the package documentation for some simple examples.
Modifier and Type | Class and Description |
---|---|
static class | FST.Arc<T>: Represents a single arc. |
static class | FST.BytesReader: Reads bytes stored in an FST. |
static class | FST.INPUT_TYPE: Specifies the allowed range of each int input label for this FST. |
Modifier and Type | Field and Description |
---|---|
static byte | ARCS_FOR_BINARY_SEARCH: Value of the arc flags to declare a node with fixed length arcs designed for binary search. |
static int | BIT_ARC_HAS_OUTPUT: This flag is set if the arc has an output. |
static int | END_LABEL: If an arc has this label, that arc is final/accepted. |
Outputs<T> | outputs |
Fields inherited from interface Accountable: NULL_ACCOUNTABLE
Constructor and Description |
---|
FST(DataInput metaIn, DataInput in, Outputs<T> outputs): Load a previously saved FST. |
FST(DataInput metaIn, DataInput in, Outputs<T> outputs, FSTStore fstStore): Load a previously saved FST; fstStore controls how the FST bytes are held. |
Modifier and Type | Method and Description |
---|---|
FST.Arc<T> | findTargetArc(int labelToMatch, FST.Arc<T> follow, FST.Arc<T> arc, FST.BytesReader in): Finds an arc leaving the incoming arc, replacing the arc in place. |
FST.BytesReader | getBytesReader(): Returns a FST.BytesReader for this FST, positioned at position 0. |
T | getEmptyOutput() |
FST.Arc<T> | getFirstArc(FST.Arc<T> arc): Fills the virtual 'start' arc, i.e., an empty incoming arc to the FST's start node. |
long | ramBytesUsed(): Return the memory usage of this object in bytes. |
static <T> FST<T> | read(Path path, Outputs<T> outputs): Reads an automaton from a file. |
FST.Arc<T> | readArcByDirectAddressing(FST.Arc<T> arc, FST.BytesReader in, int rangeIndex): Reads a present direct addressing node arc, with the provided index in the label range. |
FST.Arc<T> | readArcByIndex(FST.Arc<T> arc, FST.BytesReader in, int idx) |
FST.Arc<T> | readFirstRealTargetArc(long nodeAddress, FST.Arc<T> arc, FST.BytesReader in) |
FST.Arc<T> | readFirstTargetArc(FST.Arc<T> follow, FST.Arc<T> arc, FST.BytesReader in): Follow the follow arc and read the first arc of its target; this changes the provided arc (2nd arg) in-place and returns it. |
int | readLabel(DataInput in): Reads one BYTE1/2/4 label from the provided DataInput. |
FST.Arc<T> | readLastArcByDirectAddressing(FST.Arc<T> arc, FST.BytesReader in): Reads the last arc of a direct addressing node. |
FST.Arc<T> | readNextArc(FST.Arc<T> arc, FST.BytesReader in): In-place read; returns the arc. |
FST.Arc<T> | readNextRealArc(FST.Arc<T> arc, FST.BytesReader in): Never returns null, but you should never call this if arc.isLast() is true. |
void | save(DataOutput metaOut, DataOutput out) |
void | save(Path path): Writes an automaton to a file. |
static <T> boolean | targetHasArcs(FST.Arc<T> arc): Returns true if the node at this address has any outgoing arcs. |
String | toString() |
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface Accountable: getChildResources
public static final int BIT_ARC_HAS_OUTPUT
public static final byte ARCS_FOR_BINARY_SEARCH
public static final int END_LABEL
public FST(DataInput metaIn, DataInput in, Outputs<T> outputs) throws IOException
Throws: IOException
public FST(DataInput metaIn, DataInput in, Outputs<T> outputs, FSTStore fstStore) throws IOException
Throws: IOException
public long ramBytesUsed()
Return the memory usage of this object in bytes.
Specified by: ramBytesUsed in interface Accountable
public T getEmptyOutput()
public void save(DataOutput metaOut, DataOutput out) throws IOException
Throws: IOException
public void save(Path path) throws IOException
Throws: IOException
public static <T> FST<T> read(Path path, Outputs<T> outputs) throws IOException
Throws: IOException
public int readLabel(DataInput in) throws IOException
Reads one BYTE1/2/4 label from the provided DataInput.
Throws: IOException
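To illustrate what "one BYTE1/2/4 label" means, here is a minimal sketch that reads labels of different widths from a java.io.DataInput. It is a toy model, not Lucene's code: the LabelReadSketch class, its InputType enum (a stand-in for FST.INPUT_TYPE), and the fixed big-endian widths are all assumptions made for illustration, and the actual on-disk encoding may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

/** Toy model only: the label widths used here are an assumption, not Lucene's format. */
final class LabelReadSketch {
  /** Hypothetical stand-in for FST.INPUT_TYPE. */
  enum InputType { BYTE1, BYTE2, BYTE4 }

  /** Reads one label whose width depends on the input type. */
  static int readLabel(InputType type, DataInput in) throws IOException {
    switch (type) {
      case BYTE1:
        return in.readUnsignedByte();  // one byte, 0..255
      case BYTE2:
        return in.readUnsignedShort(); // two bytes, 0..65535
      default:
        return in.readInt();           // four bytes (fixed width in this sketch)
    }
  }

  /** Convenience wrapper: decodes count labels from a byte array. */
  static int[] readLabels(InputType type, byte[] bytes, int count) {
    DataInput in = new DataInputStream(new ByteArrayInputStream(bytes));
    int[] labels = new int[count];
    try {
      for (int i = 0; i < count; i++) {
        labels[i] = readLabel(type, in);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return labels;
  }

  public static void main(String[] args) {
    byte[] bytes = {0x01, 0x2C, 0x00, 0x41};
    // The same four bytes decode to four, two, or one label(s) depending on the width.
    System.out.println(Arrays.toString(readLabels(InputType.BYTE1, bytes, 4)));
    System.out.println(Arrays.toString(readLabels(InputType.BYTE2, bytes, 2)));
    System.out.println(Arrays.toString(readLabels(InputType.BYTE4, bytes, 1)));
  }
}
```

The point of the sketch is only that the input type fixes how many bytes one label occupies, so a reader must know the type before it can decode a label.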
public static <T> boolean targetHasArcs(FST.Arc<T> arc)
public FST.Arc<T> getFirstArc(FST.Arc<T> arc)
public FST.Arc<T> readFirstTargetArc(FST.Arc<T> follow, FST.Arc<T> arc, FST.BytesReader in) throws IOException
Follow the follow arc and read the first arc of its target; this changes the provided arc (2nd arg) in-place and returns it.
Returns: the second argument (arc).
Throws: IOException
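The in-place contract described above (the method fills the arc passed as the second argument and returns that same object) can be sketched with a toy automaton. Everything in this example is hypothetical: the InPlaceArcSketch class, its simplified Arc type, and the NODES table are illustrative assumptions and do not reflect how Lucene stores or decodes arcs.

```java
/** Toy model of in-place arc reads; not Lucene's data layout. */
final class InPlaceArcSketch {
  /** Simplified, mutable arc: a label and the node it points to. */
  static final class Arc {
    int label;
    int target;
  }

  // NODES[n] lists the outgoing arcs of node n as {label, targetNode} pairs.
  static final int[][][] NODES = {
    {{'c', 1}, {'d', 2}}, // node 0
    {{'a', 3}},           // node 1
    {{'o', 3}},           // node 2
    {}                    // node 3: no outgoing arcs
  };

  /** Overwrites arc with the first arc leaving follow's target node and returns arc. */
  static Arc readFirstTargetArc(Arc follow, Arc arc) {
    int[] first = NODES[follow.target][0];
    arc.label = first[0];
    arc.target = first[1];
    return arc;
  }

  public static void main(String[] args) {
    Arc start = new Arc();   // virtual start arc pointing at node 0
    start.target = 0;
    Arc scratch = new Arc(); // reused for every read, so no per-step allocation

    Arc result = readFirstTargetArc(start, scratch);
    System.out.println(result == scratch);   // true: the same object is returned
    System.out.println((char) result.label); // c
  }
}
```

Reusing one scratch arc is the reason for the in-place design: a traversal can walk many nodes without allocating a new arc object at each step, at the cost that the previous contents of the arc are lost.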
public FST.Arc<T> readFirstRealTargetArc(long nodeAddress, FST.Arc<T> arc, FST.BytesReader in) throws IOException
Throws: IOException
public FST.Arc<T> readNextArc(FST.Arc<T> arc, FST.BytesReader in) throws IOException
Throws: IOException
public FST.Arc<T> readArcByIndex(FST.Arc<T> arc, FST.BytesReader in, int idx) throws IOException
Throws: IOException
public FST.Arc<T> readArcByDirectAddressing(FST.Arc<T> arc, FST.BytesReader in, int rangeIndex) throws IOException
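Reads a present direct addressing node arc, with the provided index in the label range.
Parameters: rangeIndex - The index of the arc in the label range. It must be present. The real arc offset is computed based on the presence bits of the direct addressing node.
Throws: IOException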
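One way to picture how an arc offset can be derived from presence bits is the sketch below: the position of an arc among the stored arcs equals the number of present arcs that precede its range index. This is an illustration under stated assumptions, not Lucene's implementation: the DirectAddressingSketch class is hypothetical, and it keeps the presence bits in a single long (so the range index is limited to 0..63).

```java
/** Toy model of a direct-addressing node; the bit layout is an assumption. */
final class DirectAddressingSketch {
  /** Bit i of presenceBits is set when an arc exists at range index i. */
  static boolean isPresent(long presenceBits, int rangeIndex) {
    return (presenceBits & (1L << rangeIndex)) != 0;
  }

  /** Position of the arc among the stored arcs: count of present arcs before rangeIndex. */
  static int arcIndex(long presenceBits, int rangeIndex) {
    if (!isPresent(presenceBits, rangeIndex)) {
      throw new IllegalArgumentException("no arc at range index " + rangeIndex);
    }
    long lowerBits = presenceBits & ((1L << rangeIndex) - 1); // keep only bits below rangeIndex
    return Long.bitCount(lowerBits);
  }

  public static void main(String[] args) {
    long presenceBits = 0b101101L; // arcs present at range indexes 0, 2, 3 and 5
    System.out.println(arcIndex(presenceBits, 0)); // 0
    System.out.println(arcIndex(presenceBits, 3)); // 2
    System.out.println(arcIndex(presenceBits, 5)); // 3
  }
}
```

With fixed-size arcs, multiplying this index by the per-arc byte length would give the offset of the arc within the node. The sketch rejects absent indexes, mirroring the requirement that the arc "must be present".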
public FST.Arc<T> readLastArcByDirectAddressing(FST.Arc<T> arc, FST.BytesReader in) throws IOException
Reads the last arc of a direct addressing node. This method is equivalent to calling readArcByDirectAddressing(Arc, BytesReader, int) with rangeIndex equal to arc.numArcs() - 1, but it is faster.
Throws: IOException
public FST.Arc<T> readNextRealArc(FST.Arc<T> arc, FST.BytesReader in) throws IOException
Throws: IOException
public FST.Arc<T> findTargetArc(int labelToMatch, FST.Arc<T> follow, FST.Arc<T> arc, FST.BytesReader in) throws IOException
Throws: IOException
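The ARCS_FOR_BINARY_SEARCH flag marks nodes whose arcs have a fixed length so that a label lookup can use binary search. The sketch below shows why fixed-length records make that possible: the byte position of the i-th arc is simply i times the record size. It is a toy model only; the FixedArcSearchSketch class, the two-byte record (one label byte, one target byte), and the -1 "not found" result are assumptions for illustration and are not Lucene's arc encoding.

```java
/** Toy model of binary search over fixed-length arcs; not Lucene's byte format. */
final class FixedArcSearchSketch {
  // Hypothetical record layout: 1 label byte followed by 1 target byte.
  static final int BYTES_PER_ARC = 2;

  /** Returns the target stored for labelToMatch, or -1 if the node has no such arc. */
  static int findTarget(byte[] node, int labelToMatch) {
    int low = 0;
    int high = node.length / BYTES_PER_ARC - 1;
    while (low <= high) {
      int mid = (low + high) >>> 1;
      // Fixed-length records let us jump straight to the mid-th arc.
      int label = node[mid * BYTES_PER_ARC] & 0xFF;
      if (label < labelToMatch) {
        low = mid + 1;
      } else if (label > labelToMatch) {
        high = mid - 1;
      } else {
        return node[mid * BYTES_PER_ARC + 1] & 0xFF;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    // Arcs must be sorted by label: 'a' -> 10, 'c' -> 20, 'x' -> 30.
    byte[] node = {'a', 10, 'c', 20, 'x', 30};
    System.out.println(findTarget(node, 'c')); // 20
    System.out.println(findTarget(node, 'b')); // -1
  }
}
```

Variable-length arcs would force a linear scan, because the start of each arc is only known after decoding the one before it.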
public FST.BytesReader getBytesReader()
Returns a FST.BytesReader for this FST, positioned at position 0.
Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.