public class WordlistLoader extends Object
Loader for text files that represent a list of stopwords.
See Also: IOUtils to obtain Reader instances
Modifier and Type | Method and Description |
---|---|
static List<String> | getLines(InputStream stream, Charset charset) Accesses a resource by name and returns the (non comment) lines containing data using the given character encoding. |
static CharArraySet | getSnowballWordSet(Reader reader) Reads stopwords from a stopword list in Snowball format. |
static CharArraySet | getSnowballWordSet(Reader reader, CharArraySet result) Reads stopwords from a stopword list in Snowball format. |
static CharArrayMap<String> | getStemDict(Reader reader, CharArrayMap<String> result) Reads a stem dictionary. |
static CharArraySet | getWordSet(Reader reader) Reads lines from a Reader and adds every line as an entry to a CharArraySet (omitting leading and trailing whitespace). |
static CharArraySet | getWordSet(Reader reader, CharArraySet result) Reads lines from a Reader and adds every line as an entry to a CharArraySet (omitting leading and trailing whitespace). |
static CharArraySet | getWordSet(Reader reader, String comment) Reads lines from a Reader and adds every non-comment line as an entry to a CharArraySet (omitting leading and trailing whitespace). |
static CharArraySet | getWordSet(Reader reader, String comment, CharArraySet result) Reads lines from a Reader and adds every non-comment line as an entry to a CharArraySet (omitting leading and trailing whitespace). |
public static CharArraySet getWordSet(Reader reader, CharArraySet result) throws IOException
Parameters:
reader - Reader containing the wordlist
result - the CharArraySet to fill with the reader's words
Returns:
CharArraySet with the reader's words
Throws:
IOException
public static CharArraySet getWordSet(Reader reader) throws IOException
Parameters:
reader - Reader containing the wordlist
Returns:
CharArraySet with the reader's words
Throws:
IOException
public static CharArraySet getWordSet(Reader reader, String comment) throws IOException
Parameters:
reader - Reader containing the wordlist
comment - The string representing a comment.
Throws:
IOException
public static CharArraySet getWordSet(Reader reader, String comment, CharArraySet result) throws IOException
Parameters:
reader - Reader containing the wordlist
comment - The string representing a comment.
result - the CharArraySet to fill with the reader's words
Returns:
CharArraySet with the reader's words
Throws:
IOException
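The behavior described for the comment-aware getWordSet overloads can be illustrated with plain JDK types. The following is a minimal sketch, not Lucene's implementation: the class WordSetSketch and the helper readWordSet are hypothetical names, and a java.util.Set stands in for CharArraySet. It assumes one word per line and treats any line that starts with the comment string as a comment.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashSet;
import java.util.Set;

// Plain-JDK illustration of the documented behavior; not Lucene code.
class WordSetSketch {

  // Hypothetical helper: adds every non-comment line to `result`,
  // omitting leading and trailing whitespace.
  static Set<String> readWordSet(Reader reader, String comment, Set<String> result)
      throws IOException {
    try (BufferedReader br = new BufferedReader(reader)) {
      String line;
      while ((line = br.readLine()) != null) {
        if (comment != null && line.startsWith(comment)) {
          continue; // skip comment lines
        }
        result.add(line.trim());
      }
    }
    return result;
  }

  public static void main(String[] args) throws IOException {
    String wordlist = "# English stopwords\n  the \nand\n# another comment\nof\n";
    Set<String> words =
        readWordSet(new StringReader(wordlist), "#", new LinkedHashSet<>());
    System.out.println(words); // prints [the, and, of]
  }
}
```

With Lucene on the classpath, the corresponding call would be WordlistLoader.getWordSet(reader, "#", result), where result is an existing CharArraySet. Note that this sketch does not filter empty lines.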
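public static CharArraySet getSnowballWordSet(Reader reader, CharArraySet result) throws IOException
The Snowball format is the following:
- Lines may contain multiple words separated by whitespace.
- The comment character is the vertical line (|).
- Lines may contain trailing comments.
Parameters:
reader - Reader containing a Snowball stopword list
result - the CharArraySet to fill with the reader's words
Returns:
CharArraySet with the reader's words
Throws:
IOException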
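public static CharArraySet getSnowballWordSet(Reader reader) throws IOException
The Snowball format is the following:
- Lines may contain multiple words separated by whitespace.
- The comment character is the vertical line (|).
- Lines may contain trailing comments.
Parameters:
reader - Reader containing a Snowball stopword list
Returns:
CharArraySet with the reader's words
Throws:
IOException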
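To make the Snowball stopword format concrete, here is a small sketch using only the JDK. It is not Lucene's implementation: SnowballSketch and readSnowballWords are hypothetical names, and a java.util.Set stands in for CharArraySet. It assumes the usual Snowball list conventions: the vertical line (|) starts a comment that runs to the end of the line, and a line may hold several whitespace-separated words.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashSet;
import java.util.Set;

// Plain-JDK illustration of parsing a Snowball-style stopword list.
class SnowballSketch {

  // Hypothetical helper: strips '|' comments, then adds each
  // whitespace-separated word on the line to `result`.
  static Set<String> readSnowballWords(Reader reader, Set<String> result)
      throws IOException {
    try (BufferedReader br = new BufferedReader(reader)) {
      String line;
      while ((line = br.readLine()) != null) {
        int commentStart = line.indexOf('|');
        if (commentStart >= 0) {
          line = line.substring(0, commentStart); // drop trailing comment
        }
        for (String word : line.split("\\s+")) {
          if (!word.isEmpty()) {
            result.add(word);
          }
        }
      }
    }
    return result;
  }

  public static void main(String[] args) throws IOException {
    String list = " | A Snowball-style list\nthe and | articles and conjunctions\n\nof\n";
    Set<String> words = readSnowballWords(new StringReader(list), new LinkedHashSet<>());
    System.out.println(words); // prints [the, and, of]
  }
}
```

The emptiness check matters because splitting a blank or comment-only line can yield empty tokens.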
public static CharArrayMap<String> getStemDict(Reader reader, CharArrayMap<String> result) throws IOException
Reads a stem dictionary. Each line contains word\tstem (i.e. two tab-separated words).
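The word\tstem line format can be sketched with plain JDK types as follows. This is an illustration only, not Lucene's implementation: StemDictSketch and readStemDict are hypothetical names, and a java.util.Map stands in for CharArrayMap<String>. Rejecting lines without a tab is a choice made for this sketch, not documented behavior.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-JDK illustration of reading a tab-separated stem dictionary.
class StemDictSketch {

  // Hypothetical helper: maps the word before the first tab to the stem after it.
  static Map<String, String> readStemDict(Reader reader, Map<String, String> result)
      throws IOException {
    try (BufferedReader br = new BufferedReader(reader)) {
      String line;
      while ((line = br.readLine()) != null) {
        String[] wordStem = line.split("\t", 2); // at most two fields
        if (wordStem.length != 2) {
          // Sketch-specific choice: fail loudly on malformed lines.
          throw new IllegalArgumentException("Expected word<TAB>stem but got: " + line);
        }
        result.put(wordStem[0], wordStem[1]);
      }
    }
    return result;
  }

  public static void main(String[] args) throws IOException {
    String dict = "running\trun\nmice\tmouse\n";
    Map<String, String> stems = readStemDict(new StringReader(dict), new LinkedHashMap<>());
    System.out.println(stems); // prints {running=run, mice=mouse}
  }
}
```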
Throws:
IOException - If there is a low-level I/O error.
public static List<String> getLines(InputStream stream, Charset charset) throws IOException
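Accesses a resource by name and returns the (non comment) lines containing data using the given character encoding. A comment line is any line that starts with the character "#".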
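A rough plain-JDK sketch of this kind of line filtering is shown below. It is not Lucene's implementation: LinesSketch and readDataLines are hypothetical names. Trimming each line and skipping blank ones is this sketch's interpretation of "lines containing data" and should be treated as an assumption.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Plain-JDK illustration of decoding a stream and dropping "#" comment lines.
class LinesSketch {

  // Hypothetical helper: decodes `stream` with `charset` and returns the
  // trimmed, non-blank lines that do not start with '#'.
  static List<String> readDataLines(InputStream stream, Charset charset)
      throws IOException {
    List<String> lines = new ArrayList<>();
    try (BufferedReader br = new BufferedReader(new InputStreamReader(stream, charset))) {
      String line;
      while ((line = br.readLine()) != null) {
        if (line.startsWith("#")) {
          continue; // comment line
        }
        line = line.trim();
        if (line.isEmpty()) {
          continue; // assumption: blank lines carry no data
        }
        lines.add(line);
      }
    }
    return lines;
  }

  public static void main(String[] args) throws IOException {
    byte[] bytes = "# header\nalpha\n\n  beta  \n".getBytes(StandardCharsets.UTF_8);
    List<String> lines = readDataLines(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8);
    System.out.println(lines); // prints [alpha, beta]
  }
}
```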
Throws:
IOException - If there is a low-level I/O error.

Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.