Populate lexicon from file, with one entry per line, consisting of space-separated tokens.
Add all lines from the input File to this lexicon. The File contains multiple newline-separated lexicon entries.
Add all lines from the input String to this lexicon. The String contains multiple newline-separated lexicon entries.
Add all lines from the input Source to this lexicon. The Source is assumed to contain multiple newline-separated lexicon entries.
Add a new lexicon entry consisting of one or more words. The Lexicon's tokenizer will be used to split the string, if possible.
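As an illustration of the population methods described above, here is a minimal, self-contained sketch. `SketchLexicon` and its default whitespace tokenizer and lowercasing lemmatizer are hypothetical stand-ins, not the library's own class; only `scala.io.Source` is a real API.

```scala
import scala.collection.mutable
import scala.io.Source

// Hypothetical stand-in for a lexicon; the names and defaults are assumptions.
class SketchLexicon(
    tokenize: String => Vector[String] =
      (s: String) => s.trim.split("\\s+").toVector.filter(_.nonEmpty),
    lemmatize: String => String = (w: String) => w.toLowerCase) {

  // Each entry is stored as its sequence of lemmatized tokens.
  private val entries = mutable.Set.empty[Vector[String]]

  // Add one entry consisting of one or more words; blank input is ignored.
  def +=(phrase: String): Unit = {
    val words = tokenize(phrase).map(lemmatize)
    if (words.nonEmpty) entries += words
  }

  // Add every line of a Source as a separate entry.
  def ++=(source: Source): Unit =
    source.getLines().foreach(line => this += line)

  // Add every line of a newline-separated String as a separate entry.
  def ++=(text: String): Unit = this ++= Source.fromString(text)

  def size: Int = entries.size

  // Exact lookup of an already tokenized and lemmatized entry.
  def containsLemmatizedWords(words: Seq[String]): Boolean =
    entries.contains(words.toVector)
}
```

Under these assumptions, a file could be loaded with `lex ++= Source.fromFile(path)`, and blank lines contribute no entries.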
Is 'query' in the lexicon, accounting for lexicon phrases and the context of 'query'?
Is the input String in the lexicon. The input is tokenized and lemmatized; if the tokenizer indicates that it is a multi-word phrase, it will be processed by containsWords, otherwise containsWord.
Do any of the Lexicon entries contain the given word string?
Is the pre-tokenized sequence of words in the lexicon? The input words are expected to already be processed by the lemmatizer.
Is 'query' in the lexicon, ignoring context.
Is this single word in the lexicon? The input String will not be processed by the tokenizer, but will be processed by the lemmatizer.
Is the pre-tokenized sequence of words in the lexicon? Each of the input words will be processed by the lemmatizer.
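The layering of the string queries described above (raw string, then pre-tokenized words, then already lemmatized words) can be sketched as follows. This is a hedged illustration only: `QuerySketch`, `matchLemmas`, and the whole-entry matching rule are assumptions, and only the names `containsWord` and `containsWords` come from the descriptions themselves.

```scala
// Hypothetical sketch of the query layering; not the library's implementation.
class QuerySketch(entries: Set[Vector[String]]) {
  // Assumed tokenizer: split on whitespace.
  private def tokenize(s: String): Vector[String] =
    s.trim.split("\\s+").toVector.filter(_.nonEmpty)

  // Assumed lemmatizer: lowercase each word.
  private def lemmatize(w: String): String = w.toLowerCase

  // Lowest layer: input is already tokenized and lemmatized.
  // Assumption: a query matches only a whole entry.
  def matchLemmas(words: Seq[String]): Boolean =
    entries.contains(words.toVector)

  // Single word: lemmatized, but never tokenized.
  def containsWord(word: String): Boolean =
    matchLemmas(Vector(lemmatize(word)))

  // Pre-tokenized words: each one is lemmatized.
  def containsWords(words: Seq[String]): Boolean =
    matchLemmas(words.map(lemmatize))

  // Raw string: tokenize, then dispatch on the number of tokens.
  def contains(untokenized: String): Boolean = {
    val words = tokenize(untokenized)
    if (words.length == 1) containsWord(words.head) else containsWords(words)
  }
}
```

Keeping the lemmatized layer separate lets callers that have already normalized their tokens skip the redundant work.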
The string lemmatizer that simplifies lexicon entries and queries before searching for a match. For example, a common lemmatizer is one that lowercases all strings.
An identifier for this lexicon, suitable for adding as a category to a FeatureVectorVariable[String].
String contains multiple newline-separated lexicon entries
Return length of match, or -1 if no match.
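One way to read the "length of match, or -1" contract is sketched below. The helper name `matchLength`, the start-index parameter, and the choice to prefer the longest matching entry are all assumptions made for illustration; the description above specifies only the return convention.

```scala
// Hypothetical helper, not the library's method.
object MatchSketch {
  // Returns the length (in tokens) of the longest entry that begins at
  // tokens(start), or -1 if no entry begins there.
  // Assumes entries are non-empty and already lemmatized like the tokens.
  def matchLength(entries: Set[Vector[String]],
                  tokens: Vector[String],
                  start: Int): Int = {
    val lengths = for {
      entry <- entries.toSeq
      if tokens.startsWith(entry, start)
    } yield entry.length
    if (lengths.isEmpty) -1 else lengths.max
  }
}
```

Returning -1 rather than 0 keeps "no match" distinguishable from any valid length, so a caller can advance by the returned value whenever it is positive.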
The string segmenter that breaks lexicon entries and queries into (potentially) multi-word phrases.
A list of words or phrases, with methods to check whether a String, Seq[String], or Token (or more generally a cc.factorie.app.chain.Observation) is in the list.