Class LanguageProfileImpl
java.lang.Object
com.optimaize.langdetect.profiles.LanguageProfileImpl
- All Implemented Interfaces: LanguageProfile
This class is immutable.
-
Nested Class Summary
Nested Classes -
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description
boolean equals(Object)
int getFrequency(String gram)
getGramLengths(): Tells which values of n are used for the n-grams here.
@NotNull LdLocale getLocale()
long getMaxGramCount(int gramLength): Tells how often the n-gram with the highest number of occurrences used in this profile occurred.
long getMinGramCount(int gramLength): Tells how often the n-gram with the lowest number of occurrences used in this profile occurred.
long getNumGramOccurrences(int gramLength): Tells how often all n-grams of a certain length occurred, combined.
int getNumGrams(): Tells how many n-grams there are for all n-gram sizes combined.
int getNumGrams(int gramLength): Tells how many different n-grams there are for a certain n-gram size.
int hashCode()
iterateGrams(): Iterates all n-gram strings with frequency.
iterateGrams(int gramLength): Iterates all gramLength-gram strings with frequency.
private static LanguageProfileImpl.Stats makeStats
String toString()
-
Field Details
-
locale
-
ngrams
-
stats
-
-
Constructor Details
-
Method Details
-
makeStats
-
getLocale
- Specified by: getLocale in interface LanguageProfile
-
getGramLengths
Description copied from interface: LanguageProfile
Tells which values of n are used for the n-grams here. Example: [1,2,3]
- Specified by: getGramLengths in interface LanguageProfile
- Returns: Sorted from smaller to larger.
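As a rough illustration of the "sorted from smaller to larger" contract, the sketch below derives the list of gram lengths from a map keyed by gram length. The class GramLengthsSketch, its method, and the nested-map layout are hypothetical and are not taken from the library; the real return type of getGramLengths is not shown on this page.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, not the library's implementation.
class GramLengthsSketch {
    /**
     * Collects the gram lengths present in the map and returns them
     * in ascending order as a read-only list, for example [1, 2, 3].
     */
    static List<Integer> gramLengths(Map<Integer, Map<String, Integer>> ngramsByLength) {
        List<Integer> lengths = new ArrayList<>(ngramsByLength.keySet());
        Collections.sort(lengths);
        return Collections.unmodifiableList(lengths);
    }
}
```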
-
getFrequency
- Specified by: getFrequency in interface LanguageProfile
- Parameters: gram - for example "a" or "foo".
- Returns: 0-n, also zero if this profile does not use n-grams of that length (for example if no 4-grams are made).
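The return contract (zero for an unknown gram and also zero for an unused gram length) can be sketched with a two-level lookup. FrequencyLookupSketch and the nested-map layout are assumptions made for illustration only; the page does not show how the class stores its n-grams.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-in, not the library's implementation.
class FrequencyLookupSketch {
    /**
     * Looks up the gram among the grams of the same length.
     * Returns 0 if that length is not used at all, or if the gram is unknown.
     * Note: String.length() counts UTF-16 code units, a simplification here.
     */
    static int frequency(Map<Integer, Map<String, Integer>> ngramsByLength, String gram) {
        Map<String, Integer> sameLength =
                ngramsByLength.getOrDefault(gram.length(), Collections.<String, Integer>emptyMap());
        return sameLength.getOrDefault(gram, 0);
    }
}
```

With a profile holding only 1-grams and 2-grams, a call with a 4-character gram falls through to the empty map and yields 0.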
-
getNumGrams
public int getNumGrams(int gramLength)
Description copied from interface: LanguageProfile
Tells how many different n-grams there are for a certain n-gram size. For example the English language has about 57 different 1-grams, whereas Chinese in Hani has thousands.
- Specified by: getNumGrams in interface LanguageProfile
- Parameters: gramLength - 1-n
- Returns: 0-n, returns zero if no such n-grams were made (for example if no 4-grams were made), or if none of the training text contained words that long.
-
getNumGrams
public int getNumGrams()
Description copied from interface: LanguageProfile
Tells how many n-grams there are for all n-gram sizes combined.
- Specified by: getNumGrams in interface LanguageProfile
- Returns: 0-n (0 only on an empty profile...)
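The difference between the two getNumGrams overloads can be sketched as follows: one counts the distinct grams of a single length, the other adds up the distinct grams of every length. NumGramsSketch and its nested-map input are hypothetical names chosen for this illustration.

```java
import java.util.Map;

// Hypothetical stand-in, not the library's implementation.
class NumGramsSketch {
    /** Number of distinct grams of one length; 0 if that length is not present. */
    static int numGrams(Map<Integer, Map<String, Integer>> ngramsByLength, int gramLength) {
        Map<String, Integer> grams = ngramsByLength.get(gramLength);
        return grams == null ? 0 : grams.size();
    }

    /** Number of distinct grams over all lengths combined; 0 only for an empty profile. */
    static int numGrams(Map<Integer, Map<String, Integer>> ngramsByLength) {
        int total = 0;
        for (Map<String, Integer> grams : ngramsByLength.values()) {
            total += grams.size();
        }
        return total;
    }
}
```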
-
getNumGramOccurrences
public long getNumGramOccurrences(int gramLength)
Description copied from interface: LanguageProfile
Tells how often all n-grams of a certain length occurred, combined. This returns a much larger number than LanguageProfile.getNumGrams(int).
- Specified by: getNumGramOccurrences in interface LanguageProfile
- Parameters: gramLength - 1-n
- Returns: 0-n, returns zero if no such n-grams were made (for example if no 4-grams were made), or if none of the training text contained words that long.
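To make the contrast with getNumGrams(int) concrete: the occurrence count is the sum of the frequencies rather than the number of distinct grams, which is presumably why the return type is long. The following is a minimal sketch under the assumption of a nested-map layout; OccurrencesSketch is a hypothetical name.

```java
import java.util.Map;

// Hypothetical stand-in, not the library's implementation.
class OccurrencesSketch {
    /** Sums the frequencies of all grams of one length; 0 if that length is not present. */
    static long numGramOccurrences(Map<Integer, Map<String, Integer>> ngramsByLength, int gramLength) {
        Map<String, Integer> grams = ngramsByLength.get(gramLength);
        if (grams == null) {
            return 0L;
        }
        long sum = 0L; // accumulate as long so large corpora do not overflow an int
        for (int count : grams.values()) {
            sum += count;
        }
        return sum;
    }
}
```

For two 1-grams with frequencies 5 and 2, the distinct count is 2 while the occurrence count is 7.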
-
getMinGramCount
public long getMinGramCount(int gramLength)
Description copied from interface: LanguageProfile
Tells how often the n-gram with the lowest number of occurrences used in this profile occurred. Most likely there were n-grams with fewer occurrences (unless the returned number is 1), but they were eliminated in order to keep the profile reasonably small. This is the opposite of getMaxGramCount().
- Specified by: getMinGramCount in interface LanguageProfile
- Parameters: gramLength - 1-n
- Returns: 0-n, returns zero if no such n-grams were made or existed.
-
getMaxGramCount
public long getMaxGramCount(int gramLength)
Description copied from interface: LanguageProfile
Tells how often the n-gram with the highest number of occurrences used in this profile occurred. This is the opposite of getMinGramCount().
- Specified by: getMaxGramCount in interface LanguageProfile
- Parameters: gramLength - 1-n
- Returns: 0-n, returns zero if no such n-grams were made or existed.
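The minimum and maximum counts described above amount to the smallest and largest frequency among the grams of one length, with zero as the fallback when there are none. The sketch below assumes a nested-map layout and uses the hypothetical class name MinMaxSketch; the actual class may well precompute these values (the private Stats type suggests so, though the page does not say).

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-in, not the library's implementation.
class MinMaxSketch {
    /** Smallest frequency among grams of the given length; 0 if there are none. */
    static long minGramCount(Map<Integer, Map<String, Integer>> ngramsByLength, int gramLength) {
        Map<String, Integer> grams = ngramsByLength.get(gramLength);
        if (grams == null || grams.isEmpty()) {
            return 0L;
        }
        return Collections.min(grams.values()).longValue();
    }

    /** Largest frequency among grams of the given length; 0 if there are none. */
    static long maxGramCount(Map<Integer, Map<String, Integer>> ngramsByLength, int gramLength) {
        Map<String, Integer> grams = ngramsByLength.get(gramLength);
        if (grams == null || grams.isEmpty()) {
            return 0L;
        }
        return Collections.max(grams.values()).longValue();
    }
}
```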
-
iterateGrams
Description copied from interface: LanguageProfile
Iterates all n-gram strings with frequency.
- Specified by: iterateGrams in interface LanguageProfile
-
iterateGrams
Description copied from interface: LanguageProfile
Iterates all gramLength-gram strings with frequency.
- Specified by: iterateGrams in interface LanguageProfile
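One way to picture the two iterateGrams variants is as read-only views over (gram, frequency) pairs, either for a single length or across all lengths. The return type is not shown on this page, so Iterable<Map.Entry<String, Integer>> is an assumption here, as are the class name IterateGramsSketch and the nested-map layout.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, not the library's implementation.
class IterateGramsSketch {
    /** Read-only iteration over the grams of one length; empty if that length is not present. */
    static Iterable<Map.Entry<String, Integer>> iterateGrams(
            Map<Integer, Map<String, Integer>> ngramsByLength, int gramLength) {
        Map<String, Integer> grams = ngramsByLength.get(gramLength);
        if (grams == null) {
            return Collections.<Map.Entry<String, Integer>>emptyList();
        }
        return Collections.unmodifiableMap(grams).entrySet();
    }

    /** Read-only iteration over the grams of every length, collected into one list. */
    static Iterable<Map.Entry<String, Integer>> iterateGrams(
            Map<Integer, Map<String, Integer>> ngramsByLength) {
        List<Map.Entry<String, Integer>> all = new ArrayList<>();
        for (Map<String, Integer> grams : ngramsByLength.values()) {
            all.addAll(grams.entrySet());
        }
        return Collections.unmodifiableList(all);
    }
}
```

A caller would typically consume the result with a for-each loop and read entry.getKey() for the gram and entry.getValue() for its frequency.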
-
toString
-
equals
-
hashCode
public int hashCode()
-