JapaneseTokenizerFactory (Lucene 8.9.0 API)

java.lang.Object
- org.apache.lucene.analysis.util.AbstractAnalysisFactory
  - org.apache.lucene.analysis.util.TokenizerFactory
    - org.apache.lucene.analysis.ja.JapaneseTokenizerFactory

All Implemented Interfaces:

ResourceLoaderAware
```
public class JapaneseTokenizerFactory
extends TokenizerFactory
implements ResourceLoaderAware
```
Factory for JapaneseTokenizer.
```
 <fieldType name="text_ja" class="solr.TextField">
   <analyzer>
     <tokenizer class="solr.JapaneseTokenizerFactory"
       mode="NORMAL"
       userDictionary="user.txt"
       userDictionaryEncoding="UTF-8"
       discardPunctuation="true"
       discardCompoundToken="false"
     />
     <filter class="solr.JapaneseBaseFormFilterFactory"/>
   </analyzer>
 </fieldType>
 
```
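The attributes in the configuration above correspond to entries of the `Map<String,String>` that the constructor accepts. The following is a minimal sketch of building such a map in plain Java. The class name `JapaneseTokenizerArgs` and the method `exampleArgs` are hypothetical names introduced for illustration, and the sketch assumes the map keys match the XML attribute names. It stops short of calling the factory so that it compiles without Lucene on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: it only assembles the argument map.
final class JapaneseTokenizerArgs {

    /** Builds an args map mirroring the attributes of the XML example. */
    static Map<String, String> exampleArgs() {
        // A mutable map is used on the assumption that the factory
        // consumes entries while it reads them.
        Map<String, String> args = new HashMap<>();
        args.put("mode", "NORMAL");
        args.put("userDictionary", "user.txt");
        args.put("userDictionaryEncoding", "UTF-8");
        args.put("discardPunctuation", "true");
        args.put("discardCompoundToken", "false");
        return args;
    }

    public static void main(String[] unused) {
        Map<String, String> args = exampleArgs();
        // With Lucene's Japanese analysis module available, the map would be
        // handed to the constructor: new JapaneseTokenizerFactory(args)
        args.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Because `userDictionary` names a resource, a real caller would also be expected to call `inform(ResourceLoader)` on the factory before `create(AttributeFactory)`, so that the dictionary can be loaded.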
The additional expert parameters nBestCost and nBestExamples can be used to include searchable tokens beyond those most likely according to the statistical model. A typical use case is to improve recall and make segmentation more resilient to mistakes. The feature can also be used to get a decompounding effect.
The nBestCost parameter specifies an additional Viterbi cost. When it is set, JapaneseTokenizer includes all tokens in Viterbi paths whose cost is within the nBestCost value of the best path.
Finding a good value for nBestCost by hand can be difficult. The nBestExamples parameter can be used to derive an nBestCost value from examples with the desired segmentation outcomes.
For example, a value of /箱根山-箱根/成田空港-成田/ indicates that for the texts 箱根山 (Mt. Hakone) and 成田空港 (Narita Airport) we'd like a cost that also gives us 箱根 (Hakone) and 成田 (Narita). Note that costs are estimated for each example individually, and the maximum nBestCost found across all examples is used.
If both nBestCost and nBestExamples are used in a configuration, the larger of the two values is used.
Parameters nBestCost and nBestExamples work with all tokenizer modes, but it makes the most sense to use them with NORMAL mode.
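The way the two parameters combine can be sketched in plain Java. This is an illustration of the rule described above, not Lucene's implementation: the class `NBestCostSketch`, its methods, and the `estimator` function are hypothetical, and the estimator stands in for the Viterbi-based cost estimation that the real tokenizer performs. The example string is the one from the text (箱根山-箱根 and 成田空港-成田), written with Unicode escapes so the file compiles regardless of source encoding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntBiFunction;

// Hypothetical sketch of how nBestCost and nBestExamples could be combined.
final class NBestCostSketch {

    /** Splits "/text-wanted/text-wanted/" into [text, wanted] pairs. */
    static List<String[]> parseExamples(String spec) {
        List<String[]> pairs = new ArrayList<>();
        for (String part : spec.split("/")) {
            if (part.isEmpty()) {
                continue; // skip the empty piece before the leading slash
            }
            String[] pair = part.split("-", 2);
            if (pair.length == 2 && !pair[0].isEmpty() && !pair[1].isEmpty()) {
                pairs.add(pair);
            }
        }
        return pairs;
    }

    /**
     * Estimates a cost for each example individually, takes the maximum,
     * and returns the larger of that maximum and the explicit nBestCost.
     */
    static int effectiveCost(int nBestCost, String nBestExamples,
                             ToIntBiFunction<String, String> estimator) {
        int cost = nBestCost;
        if (nBestExamples != null) {
            for (String[] pair : parseExamples(nBestExamples)) {
                cost = Math.max(cost, estimator.applyAsInt(pair[0], pair[1]));
            }
        }
        return cost;
    }

    public static void main(String[] unused) {
        // Hakone-yama -> Hakone, Narita-kuko -> Narita (the example from the text).
        String examples =
            "/\u7bb1\u6839\u5c71-\u7bb1\u6839/\u6210\u7530\u7a7a\u6e2f-\u6210\u7530/";
        // Made-up estimator: 1000 per character that the wanted token drops.
        ToIntBiFunction<String, String> estimator =
            (text, wanted) -> 1000 * (text.length() - wanted.length());
        // Per-example costs are 1000 and 2000; max(1500, 1000, 2000) is printed.
        System.out.println(effectiveCost(1500, examples, estimator)); // prints 2000
    }
}
```

With a real tokenizer the per-example costs come from the dictionary and the statistical model, so the numbers here only demonstrate the maximum-selection logic.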
Since:

3.6.0

SPI Name (Note: This is case-insensitive. e.g., if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service):

"japanese"

- Field Summary
  
  Fields
  Modifier and Type Field and Description
  
  static String NAME
  SPI name
  - Fields inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
    LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
- Constructor Summary
  
  Constructors
  Constructor and Description
  
  JapaneseTokenizerFactory(Map<String,String> args)
  Creates a new JapaneseTokenizerFactory
- Method Summary
  
  Modifier and Type Method and Description
  
  JapaneseTokenizer create(AttributeFactory factory)
  
  void inform(ResourceLoader loader)
  - Methods inherited from class org.apache.lucene.analysis.util.TokenizerFactory
    availableTokenizers, create, findSPIName, forName, lookupClass, reloadTokenizers
  - Methods inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
    get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
  - Methods inherited from class java.lang.Object
    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - NAME
```
public static final String NAME
```
    SPI name
    
    See Also:
    
    Constant Field Values
- Constructor Detail
  - JapaneseTokenizerFactory
```
public JapaneseTokenizerFactory(Map<String,String> args)
```
    Creates a new JapaneseTokenizerFactory
- Method Detail
  - inform
```
public void inform(ResourceLoader loader)
            throws IOException
```
    Specified by:
    
    inform in interface ResourceLoaderAware
    
    Throws:
    
    IOException
  - create
```
public JapaneseTokenizer create(AttributeFactory factory)
```
    Specified by:
    
    create in class TokenizerFactory
