All Implemented Interfaces: java.io.Closeable, java.lang.AutoCloseable
Direct Known Subclasses: OxygenAnalyzerWithShingles
public class OxygenAnalyzerBase
extends org.apache.lucene.analysis.StopwordAnalyzerBase
Modifier and Type | Class | Description |
---|---|---|
private static class | OxygenAnalyzerBase.DefaultSetHolder | Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time. |
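The lazy, atomic loading described for DefaultSetHolder matches the initialization-on-demand holder idiom. The following is a minimal plain-Java sketch of that idiom, not the actual source of this class: it uses a java.util.Set instead of CharArraySet, and the stop words and the loadCount counter are hypothetical, added only to make the behavior observable.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class LazyStopSetDemo {
    // Hypothetical counter: records how often the default set is built.
    static int loadCount = 0;

    // The JVM initializes this nested class only when DEFAULT_STOP_SET is
    // first read, and class initialization is thread-safe, so the set is
    // built lazily and exactly once.
    private static final class DefaultSetHolder {
        static final Set<String> DEFAULT_STOP_SET = load();

        private static Set<String> load() {
            loadCount++;
            // Hypothetical stop words; the real list is not given in this document.
            return Collections.unmodifiableSet(
                    new LinkedHashSet<>(List.of("a", "an", "the")));
        }
    }

    // Mirrors getDefaultStopSet(): returns the shared unmodifiable set.
    static Set<String> getDefaultStopSet() {
        return DefaultSetHolder.DEFAULT_STOP_SET;
    }

    public static void main(String[] args) {
        System.out.println(getDefaultStopSet());
        System.out.println("loads: " + loadCount);
    }
}
```

Repeated calls to getDefaultStopSet() in this sketch return the same instance, and attempts to modify it throw UnsupportedOperationException.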
Modifier and Type | Field | Description |
---|---|---|
static org.apache.lucene.analysis.CharArraySet | OXYGEN_EXCLUSION_SET | |
protected org.apache.lucene.analysis.CharArraySet | stemExclusionSet | |
protected org.apache.lucene.analysis.CharArraySet | stopwords | |
Constructor | Description |
---|---|
OxygenAnalyzerBase() | Creates the default Oxygen analyzer. |
OxygenAnalyzerBase(org.apache.lucene.analysis.CharArraySet stopWords) | Builds an analyzer with the given stop words. |
OxygenAnalyzerBase(org.apache.lucene.analysis.CharArraySet stopWords, org.apache.lucene.analysis.CharArraySet stemExclusionSet) | Builds an analyzer with the given stop words and stem exclusion set. |
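The two-argument constructor takes both a stop word set and a set of terms that must not be stemmed. The plain-Java sketch below illustrates how the two sets might interact. It does not use Lucene: the whitespace tokenizer, the lowercasing step, the toy stem method, and the order of the steps are all assumptions made for illustration, not the behavior of createComponents.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class StemExclusionSketch {
    // Toy stemmer (hypothetical): strips a trailing "s" from terms longer
    // than three characters.
    static String stem(String term) {
        return term.length() > 3 && term.endsWith("s")
                ? term.substring(0, term.length() - 1)
                : term;
    }

    // Lowercases each whitespace-separated token, drops stop words, and
    // stems every remaining term unless it is in the exclusion set.
    static List<String> analyze(String text,
                                Set<String> stopWords,
                                Set<String> stemExclusionSet) {
        List<String> out = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            String term = raw.toLowerCase(Locale.ROOT);
            if (term.isEmpty() || stopWords.contains(term)) {
                continue; // stop word (or empty input): emit nothing
            }
            // Excluded terms bypass the stemmer, like keyword-marked tokens.
            out.add(stemExclusionSet.contains(term) ? term : stem(term));
        }
        return out;
    }
}
```

In this sketch, a term listed in stemExclusionSet is emitted unchanged, while the same term is reduced by the stemmer when the exclusion set is empty.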
Modifier and Type | Method | Description |
---|---|---|
protected org.apache.lucene.analysis.Analyzer.TokenStreamComponents | createComponents(java.lang.String fieldName) | |
static org.apache.lucene.analysis.CharArraySet | getDefaultStopSet() | Returns an unmodifiable instance of the default stop words set. |
static java.lang.String | getShingleInfo() | |
protected org.apache.lucene.analysis.TokenStream | normalize(java.lang.String fieldName, org.apache.lucene.analysis.TokenStream in) | |
Methods inherited from class org.apache.lucene.analysis.Analyzer: attributeFactory, close, getOffsetGap, getPositionIncrementGap, getReuseStrategy, getVersion, initReader, initReaderForNormalization, normalize, setVersion, tokenStream, tokenStream
public static final org.apache.lucene.analysis.CharArraySet OXYGEN_EXCLUSION_SET
protected final org.apache.lucene.analysis.CharArraySet stemExclusionSet
protected final org.apache.lucene.analysis.CharArraySet stopwords
public OxygenAnalyzerBase()
public OxygenAnalyzerBase(org.apache.lucene.analysis.CharArraySet stopWords, org.apache.lucene.analysis.CharArraySet stemExclusionSet)
Builds an analyzer with the given stop words. A SetKeywordMarkerFilter is applied before stemming so that terms in the stem exclusion set are left unstemmed.
Parameters:
stopWords - a stopword set
stemExclusionSet - a set of terms not to be stemmed
public OxygenAnalyzerBase(org.apache.lucene.analysis.CharArraySet stopWords)
Builds an analyzer with the given stop words.
Parameters:
stopWords - a stopword set
public static org.apache.lucene.analysis.CharArraySet getDefaultStopSet()
public static java.lang.String getShingleInfo()
protected org.apache.lucene.analysis.TokenStream normalize(java.lang.String fieldName, org.apache.lucene.analysis.TokenStream in)
Overrides: normalize in class org.apache.lucene.analysis.Analyzer
protected org.apache.lucene.analysis.Analyzer.TokenStreamComponents createComponents(java.lang.String fieldName)
Specified by: createComponents in class org.apache.lucene.analysis.Analyzer