Package edu.berkeley.nlp.lm
Class AbstractContextEncodedNgramLanguageModel<W>
java.lang.Object
edu.berkeley.nlp.lm.AbstractNgramLanguageModel<W>
edu.berkeley.nlp.lm.AbstractContextEncodedNgramLanguageModel<W>
- All Implemented Interfaces:
ContextEncodedNgramLanguageModel<W>, NgramLanguageModel<W>, Serializable
- Direct Known Subclasses:
ContextEncodedCachingLmWrapper, ContextEncodedProbBackoffLm
public abstract class AbstractContextEncodedNgramLanguageModel<W>
extends AbstractNgramLanguageModel<W>
implements ContextEncodedNgramLanguageModel<W>, Serializable
Default implementation of all ContextEncodedNgramLanguageModel functionality except getLogProb(long, int, int, LmContextInfo), getOffsetForNgram(int[], int, int), and getNgramForOffset(long, int, int), which are declared abstract in this class.
-
Nested Class Summary
Nested classes/interfaces inherited from interface edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel
ContextEncodedNgramLanguageModel.DefaultImplementations, ContextEncodedNgramLanguageModel.LmContextInfo
Nested classes/interfaces inherited from interface edu.berkeley.nlp.lm.NgramLanguageModel
NgramLanguageModel.StaticMethods
-
Field Summary
Fields inherited from class edu.berkeley.nlp.lm.AbstractNgramLanguageModel
lmOrder, oovWordLogProb
-
Constructor Summary
Constructors
AbstractContextEncodedNgramLanguageModel(int lmOrder, WordIndexer<W> wordIndexer, float oovWordLogProb)
-
Method Summary
abstract float getLogProb(long contextOffset, int contextOrder, int word, ContextEncodedNgramLanguageModel.LmContextInfo outputContext)
Get the score for an n-gram, and also get the context offset of the n-gram's suffix.
float getLogProb(List<W> phrase)
Scores an n-gram.
abstract int[] getNgramForOffset(long contextOffset, int contextOrder, int word)
Gets the n-gram referred to by a context-encoding.
abstract ContextEncodedNgramLanguageModel.LmContextInfo getOffsetForNgram(int[] ngram, int startPos, int endPos)
Gets the offset which refers to an n-gram.
float scoreSentence(List<W> sentence)
Scores a complete sentence, taking appropriate care with the start- and end-of-sentence symbols.
Methods inherited from class edu.berkeley.nlp.lm.AbstractNgramLanguageModel
getLmOrder, getWordIndexer, setOovWordLogProb
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface edu.berkeley.nlp.lm.NgramLanguageModel
getLmOrder, getWordIndexer, setOovWordLogProb
-
Constructor Details
-
AbstractContextEncodedNgramLanguageModel
public AbstractContextEncodedNgramLanguageModel(int lmOrder, WordIndexer<W> wordIndexer, float oovWordLogProb)
-
-
Method Details
-
scoreSentence
public float scoreSentence(List<W> sentence)
Description copied from interface: NgramLanguageModel
Scores a complete sentence, taking appropriate care with the start- and end-of-sentence symbols. This is a convenience method and will generally be inefficient.
- Specified by:
scoreSentence in interface NgramLanguageModel<W>
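The boundary handling described above can be illustrated with a minimal sketch. It is not the library's implementation: the NgramScorer interface, the "<s>" and "</s>" strings, and the class name are hypothetical stand-ins, and the loop simply scores each word (and the end symbol) given at most lmOrder - 1 preceding words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SentenceScoringSketch {

    /** Hypothetical stand-in for an n-gram scorer; not part of the library. */
    interface NgramScorer {
        float score(List<String> ngram);
    }

    /** Sums n-gram scores over a sentence padded with start and end symbols. */
    static float scoreSentence(List<String> sentence, int lmOrder, NgramScorer scorer) {
        // Assumed boundary symbols; a real model takes them from its word indexer.
        List<String> bounded = new ArrayList<>();
        bounded.add("<s>");
        bounded.addAll(sentence);
        bounded.add("</s>");

        float total = 0.0f;
        // Position 0 is the start symbol: it is conditioned on but never predicted.
        for (int end = 1; end < bounded.size(); end++) {
            // Keep at most lmOrder words in the window, ending at position `end`.
            int start = Math.max(0, end - lmOrder + 1);
            total += scorer.score(bounded.subList(start, end + 1));
        }
        return total;
    }

    public static void main(String[] args) {
        // Dummy scorer: the "log-probability" is just minus the n-gram length.
        NgramScorer lengthPenalty = ngram -> -1.0f * ngram.size();
        System.out.println(scoreSentence(Arrays.asList("a", "b"), 3, lengthPenalty));
    }
}
```

Each window is scored from scratch, which is the reason a convenience method of this shape tends to be slow compared with chaining context offsets between queries.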
-
getLogProb
public float getLogProb(List<W> phrase)
Description copied from interface: NgramLanguageModel
Scores an n-gram. This is a convenience method and will generally be relatively inefficient. More efficient versions are available in ArrayEncodedNgramLanguageModel.getLogProb(int[], int, int) and ContextEncodedNgramLanguageModel.getLogProb(long, int, int, edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel.LmContextInfo).
- Specified by:
getLogProb in interface NgramLanguageModel<W>
-
getLogProb
public abstract float getLogProb(long contextOffset, int contextOrder, int word, ContextEncodedNgramLanguageModel.LmContextInfo outputContext)
Description copied from interface: ContextEncodedNgramLanguageModel
Get the score for an n-gram, and also get the context offset of the n-gram's suffix.
- Specified by:
getLogProb in interface ContextEncodedNgramLanguageModel<W>
- Parameters:
contextOffset - Offset of context (prefix) of an n-gram
contextOrder - The (0-based) length of context (i.e. order == 0 iff context refers to a unigram).
word - Last word of the n-gram
outputContext - Offset of the suffix of the input n-gram. If the parameter is null it will be ignored. This can be passed to future queries for efficient access.
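To show how outputContext can be fed back into the next query, here is a minimal sketch built on a toy bigram model. Everything in it is an assumption made for illustration: ContextInfo is a stand-in for LmContextInfo, ToyBigramLm is not a library class, word ids double as unigram context offsets, and an order of -1 is used to mean "empty context".

```java
import java.util.HashMap;
import java.util.Map;

final class ContextChainingSketch {

    /** Stand-in for LmContextInfo (assumption): an (offset, order) pair. */
    static final class ContextInfo {
        long offset = -1L;
        int order = -1; // -1 means "empty context" in this sketch
    }

    /** Toy bigram model with backoff; not the library's data structure. */
    static final class ToyBigramLm {
        private final float[] unigramLogProb;
        private final float[] backoff;
        private final Map<Long, Float> bigramLogProb = new HashMap<>();

        ToyBigramLm(float[] unigramLogProb, float[] backoff) {
            this.unigramLogProb = unigramLogProb;
            this.backoff = backoff;
        }

        void addBigram(int first, int second, float logProb) {
            bigramLogProb.put(key(first, second), logProb);
        }

        private static long key(int first, int second) {
            return ((long) first << 32) | (second & 0xffffffffL);
        }

        /** Scores `word` after the given context and reports the next context. */
        float getLogProb(long contextOffset, int contextOrder, int word, ContextInfo outputContext) {
            float score;
            if (contextOrder == 0) {
                // Unigram context: use the bigram if present, otherwise back off.
                Float bigram = bigramLogProb.get(key((int) contextOffset, word));
                score = (bigram != null) ? bigram : backoff[(int) contextOffset] + unigramLogProb[word];
            } else {
                // Empty context: plain unigram score.
                score = unigramLogProb[word];
            }
            if (outputContext != null) {
                // In a bigram model the longest useful suffix is the last word.
                outputContext.offset = word;
                outputContext.order = 0;
            }
            return score;
        }
    }

    /** Scores a word sequence left to right, reusing the context between calls. */
    static float scoreWords(ToyBigramLm lm, int[] words) {
        ContextInfo context = new ContextInfo();
        float total = 0.0f;
        for (int word : words) {
            total += lm.getLogProb(context.offset, context.order, word, context);
        }
        return total;
    }

    public static void main(String[] args) {
        ToyBigramLm lm = new ToyBigramLm(
                new float[] {-1.0f, -2.0f, -3.0f},
                new float[] {-0.5f, -0.25f, 0.0f});
        lm.addBigram(0, 1, -0.5f);
        System.out.println(scoreWords(lm, new int[] {0, 1, 2}));
    }
}
```

The same ContextInfo object serves as both input and output: the arguments are read before the call, and the method overwrites the fields afterwards, so no lookup of the context is repeated between queries.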
-
getOffsetForNgram
public abstract ContextEncodedNgramLanguageModel.LmContextInfo getOffsetForNgram(int[] ngram, int startPos, int endPos)
Description copied from interface: ContextEncodedNgramLanguageModel
Gets the offset which refers to an n-gram. If the n-gram is not in the model, then it returns the shortest suffix of the n-gram which is. This operation is not necessarily fast.
- Specified by:
getOffsetForNgram in interface ContextEncodedNgramLanguageModel<W>
-
getNgramForOffset
public abstract int[] getNgramForOffset(long contextOffset, int contextOrder, int word)
Description copied from interface: ContextEncodedNgramLanguageModel
Gets the n-gram referred to by a context-encoding. This operation is not necessarily fast.
- Specified by:
getNgramForOffset in interface ContextEncodedNgramLanguageModel<W>
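One way to picture decoding a context-encoding back into word ids is sketched below. The storage layout is purely an assumption for illustration (per-order arrays holding each n-gram's last word and the offset of its prefix one order lower); the class and field names are hypothetical, and the library's subclasses may store n-grams differently.

```java
import java.util.Arrays;

final class OffsetDecodingSketch {

    // Toy layout (assumption): wordAt[k][o] is the last word of the (k+1)-word
    // n-gram stored at offset o, and prefixAt[k][o] is the offset of its k-word
    // prefix among the n-grams one order lower (-1 for unigrams).
    private final int[][] wordAt;
    private final long[][] prefixAt;

    OffsetDecodingSketch(int[][] wordAt, long[][] prefixAt) {
        this.wordAt = wordAt;
        this.prefixAt = prefixAt;
    }

    /** Rebuilds the n-gram made of the encoded context followed by `word`. */
    int[] getNgramForOffset(long contextOffset, int contextOrder, int word) {
        // A context of 0-based order k holds k + 1 words; add one for `word`.
        int[] ngram = new int[contextOrder + 2];
        ngram[ngram.length - 1] = word;
        long offset = contextOffset;
        // Walk from the context's last word back to its first word.
        for (int order = contextOrder; order >= 0; order--) {
            ngram[order] = wordAt[order][(int) offset];
            offset = prefixAt[order][(int) offset];
        }
        return ngram;
    }

    public static void main(String[] args) {
        OffsetDecodingSketch lm = new OffsetDecodingSketch(
                new int[][] {{10, 20, 30}, {20, 30}},
                new long[][] {{-1L, -1L, -1L}, {0L, 1L}});
        // Context: the bigram stored at offset 0 (order 1), followed by word 30.
        System.out.println(Arrays.toString(lm.getNgramForOffset(0L, 1, 30)));
    }
}
```

The walk takes one step per word of context, which fits the note that this operation is not necessarily fast.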
-