WordLexicon

Instance Constructors

new WordLexicon(name: String, tokenizer: StringSegmenter = ..., lemmatizer: Lemmatizer = ...)

Value Members

final def !=(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def !=(arg0: Any): Boolean

Definition Classes
Any
final def ##(): Int

Definition Classes
AnyRef → Any
def ++=(file: File, enc: String = "UTF-8"): WordLexicon.this.type

Add all lines from the input File to this lexicon. The File contains multiple newline-separated lexicon entries.

Definition Classes
MutableLexicon
def ++=(phrases: String): WordLexicon.this.type

Add all lines from the input String to this lexicon. The String contains multiple newline-separated lexicon entries.

Definition Classes
MutableLexicon
def ++=(source: Source): WordLexicon.this.type

Add all lines from the input Source to this lexicon. The Source is assumed to contain multiple newline-separated lexicon entries.

Definition Classes
MutableLexicon
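
The bulk-add behavior described for the three `++=` overloads can be sketched as follows. This is only an illustration: `SketchLexicon` is a hypothetical stand-in rather than the library class, lowercasing stands in for the lemmatizer, and skipping blank lines is an assumption the documentation above does not confirm.

```scala
import scala.collection.mutable
import scala.io.Source

// Hypothetical stand-in for illustration only; not the library's WordLexicon.
class SketchLexicon {
  val contents = mutable.HashSet.empty[String]

  // Add one entry. Lowercasing is an assumed substitute for lemmatization.
  def +=(phrase: String): Unit = contents += phrase.trim.toLowerCase

  // Add every non-empty line of a newline-separated String.
  def ++=(phrases: String): this.type = {
    phrases.split("\n").map(_.trim).filter(_.nonEmpty).foreach(line => this += line)
    this
  }

  // Add every non-empty line read from a Source.
  def ++=(source: Source): this.type = {
    source.getLines().map(_.trim).filter(_.nonEmpty).foreach(line => this += line)
    this
  }
}

object BulkAddDemo {
  val cities = new SketchLexicon
  cities ++= "Boston\nParis\n\n"              // String overload
  cities ++= Source.fromString("Tokyo\nLima") // Source overload
}
```

Returning `this.type` mirrors the `WordLexicon.this.type` result in the signatures above, which lets a caller chain further additions onto the same lexicon.
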
def +=(phrase: String): Unit

Tokenize and lemmatize the input String and add it as a single entry to the lexicon.

Definition Classes
WordLexicon → MutableLexicon
final def ==(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def ==(arg0: Any): Boolean

Definition Classes
Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
def contains[T <: Observation[T]](query: Seq[T]): Boolean

Definition Classes
WordLexicon → Lexicon
def contains[T <: Observation[T]](query: T): Boolean

Is this Token (or, more generally, Observation) a member of a phrase in the lexicon (including single-word phrases)? The query.string will be processed by the lemmatizer. For example, if query.string is "New", query.next.string is "York", and the two-word phrase "New York" is in the lexicon, this method returns true. If instead query.next.string is "shoes" (and "New shoes" is not in the lexicon), this method returns false.

Definition Classes
WordLexicon → Lexicon
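
The "New York" example above can be reproduced with a small sketch. It is one possible way to get the described behavior, not the library's algorithm: `PhraseSketch` and `memberOfPhrase` are hypothetical names, tokens are plain Strings in an indexed sequence rather than linked Observations, and lowercasing is assumed as the lemmatizer.

```scala
// Hypothetical sketch; names and approach are assumptions for illustration.
object PhraseSketch {
  // Does the token at index i belong to any phrase stored in `phrases`?
  // Phrases are assumed to be stored already lowercased.
  def memberOfPhrase(tokens: IndexedSeq[String], i: Int, phrases: Set[Seq[String]]): Boolean = {
    val lemmas = tokens.map(_.toLowerCase)
    phrases.exists { p =>
      // Try every alignment of phrase p that covers position i.
      (0 until p.length).exists { offset =>
        val start = i - offset
        start >= 0 && start + p.length <= lemmas.length &&
          lemmas.slice(start, start + p.length) == p
      }
    }
  }
}
```

With `Set(Seq("new", "york"))` as the phrase set, querying index 0 of `Vector("New", "York")` succeeds, while the same query against `Vector("New", "shoes")` fails.
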
def contains(untokenizedString: String): Boolean

Is the input String in the lexicon? The input is tokenized and lemmatized; if the tokenizer indicates that it is a multi-word phrase, it is processed by containsWords, otherwise by containsWord.

Definition Classes
Lexicon
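
The dispatch described above might look roughly like the sketch below. `DispatchSketch` is a hypothetical class, and both the whitespace tokenizer and the lowercasing lemmatizer are assumptions chosen to keep the example self-contained.

```scala
// Hypothetical sketch of the tokenize-then-dispatch logic; not the library code.
class DispatchSketch(entries: Set[Seq[String]]) {
  // Assumed tokenizer: split on runs of whitespace.
  private def tokenize(s: String): List[String] =
    s.trim.split("\\s+").filter(_.nonEmpty).toList

  // Assumed lemmatizer: lowercase.
  private def lemmatize(w: String): String = w.toLowerCase

  def containsWord(word: String): Boolean =
    entries.contains(List(lemmatize(word)))

  def containsWords(words: Seq[String]): Boolean =
    entries.contains(words.map(lemmatize).toList)

  // Multi-word input goes to containsWords, single-word input to containsWord.
  def contains(untokenizedString: String): Boolean = {
    val words = tokenize(untokenizedString)
    if (words.length > 1) containsWords(words)
    else if (words.length == 1) containsWord(words.head)
    else false
  }
}
```

Treating an empty input as "not contained" is a choice made for this sketch; the documentation above does not say how the library handles it.
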
def contains(span: TokenSpan): Boolean

Definition Classes
Lexicon
final def containsLemmatizedWord(word: String): Boolean

Is this single word in the lexicon? The input String will not be processed by tokenizer, but will be processed by the lemmatizer.

Definition Classes
WordLexicon → Lexicon
def containsLemmatizedWords(words: Seq[String]): Boolean

Is the pre-tokenized sequence of words in the lexicon? The input words are expected to already be processed by the lemmatizer.

Definition Classes
WordLexicon → Lexicon
def containsWord(word: String): Boolean

Is this single word in the lexicon? The input String will not be processed by tokenizer, but will be processed by the lemmatizer.

Definition Classes
Lexicon
def containsWords(words: Seq[String]): Boolean

Is the pre-tokenized sequence of words in the lexicon? Each of the input words will be processed by the lemmatizer.

Definition Classes
Lexicon
val contents: HashSet[String]
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def hashCode(): Int

Definition Classes
AnyRef → Any
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
val lemmatizer: Lemmatizer

The string lemmatizer that simplifies lexicon entries and queries before searching for a match. For example, a common lemmatizer is one that lowercases all strings.

Definition Classes
WordLexicon → Lexicon
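
To illustrate why the same lemmatizer is applied to both entries and queries, here is a minimal sketch. `SketchLemmatizer`, `LowercaseSketch`, and `LemmaSet` are hypothetical names introduced for this example and are not the library's types.

```scala
import scala.collection.mutable

// Hypothetical interface; the library's Lemmatizer may differ.
trait SketchLemmatizer { def lemmatize(word: String): String }

// The lowercasing lemmatizer mentioned in the description above.
object LowercaseSketch extends SketchLemmatizer {
  def lemmatize(word: String): String = word.toLowerCase
}

// A set that normalizes words on insertion and again on lookup.
class LemmaSet(lemmatizer: SketchLemmatizer) {
  private val contents = mutable.HashSet.empty[String]
  def add(word: String): Unit = contents += lemmatizer.lemmatize(word)
  def has(word: String): Boolean = contents.contains(lemmatizer.lemmatize(word))
}
```

Because both sides pass through `lemmatize`, an entry added as "Paris" is found by a query for "PARIS".
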
val name: String

An identifier for this lexicon, suitable for adding as a category to a FeatureVectorVariable[String].

Definition Classes
WordLexicon → Lexicon
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
val tokenizer: StringSegmenter

The string segmenter that breaks lexicon entries and queries into (potentially) multi-word phrases.

Definition Classes
WordLexicon → Lexicon
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )

class WordLexicon extends MutableLexicon

