Tokenize and lemmatize the input String and add it as a single entry to the Lexicon
Is this Token (or more generally Observation) a member of a phrase in the lexicon (including single-word phrases)? The query.
Is this Token (or more generally Observation) a member of a phrase in the lexicon (including single-word phrases)? The query.string will be processed by the lemmatizer. For example if query.string is "New" and query.next.string is "York" and the two-word phrase "New York" is in the lexicon, then this method will return true. But if query.next.string is "shoes" (and "New shoes" is not in the lexicon) this method will return false.
Is this single word in the lexicon? The input String will not be processed by tokenizer, but will be processed by the lemmatizer.
Is this single word in the lexicon? The input String will not be processed by tokenizer, but will be processed by the lemmatizer.
Is the pre-tokenized sequence of words in the lexicon? The input words are expected to already be processed by the lemmatizer.
Is the pre-tokenized sequence of words in the lexicon? The input words are expected to already be processed by the lemmatizer.
The string lemmatizer that simplifies lexicon entries and queries before searching for a match.
The string lemmatizer that simplifies lexicon entries and queries before searching for a match. For example, a common lemmatizer is one that lowercases all strings.
An identifier for this lexicon, suitable for adding as a category to a FeatureVectorVariable[String].
An identifier for this lexicon, suitable for adding as a category to a FeatureVectorVariable[String].
The string segmenter that breaks a lexicon entries and queries into (potentially) multi-word phrases.
The string segmenter that breaks a lexicon entries and queries into (potentially) multi-word phrases.
All a lines from the input File to this lexicon.
All a lines from the input File to this lexicon. File contains multiple newline-separated lexicon entries
All a lines from the input String to this lexicon.
All a lines from the input String to this lexicon. String contains multiple newline-separated lexicon entries
All a lines from the input Source to this lexicon.
All a lines from the input Source to this lexicon. Source is assumed to contain multiple newline-separated lexicon entries
Is the input String in the lexicon.
Is the input String in the lexicon. The input is tokenized and lemmatized; if the tokenizer indicates that it is a multi-word phrase, it will be processed by containsWords, otherwise containsWord.
Is this single word in the lexicon? The input String will not be processed by tokenizer, but will be processed by the lemmatizer.
Is this single word in the lexicon? The input String will not be processed by tokenizer, but will be processed by the lemmatizer.
Is the pre-tokenized sequence of words in the lexicon? Each of the input words will be processed by the lemmatizer.
Is the pre-tokenized sequence of words in the lexicon? Each of the input words will be processed by the lemmatizer.