PhraseLexicon

Instance Constructors

new PhraseLexicon(file: File)

Populate lexicon from file, with one entry per line, consisting of space-separated tokens.
new PhraseLexicon(name: String, tokenizer: StringSegmenter = ..., lemmatizer: Lemmatizer = ...)

Type Members

class LexiconPhraseToken extends LexiconToken
class LexiconToken extends Observation[LexiconToken]

Value Members

final def !=(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def !=(arg0: Any): Boolean

Definition Classes
Any
final def ##(): Int

Definition Classes
AnyRef → Any
def ++=(file: File, enc: String = "UTF-8"): PhraseLexicon.this.type

Add all lines from the input File to this lexicon. The file contains multiple newline-separated lexicon entries.

Definition Classes
MutableLexicon
def ++=(phrases: String): PhraseLexicon.this.type

Add all lines from the input String to this lexicon. The string contains multiple newline-separated lexicon entries.

Definition Classes
MutableLexicon
def ++=(source: Source): PhraseLexicon.this.type

Add all lines from the input Source to this lexicon. The source is assumed to contain multiple newline-separated lexicon entries.

Definition Classes
MutableLexicon
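
The bulk-loading behaviour described for the ++= overloads can be sketched without the library. This is a minimal illustration, not the library's code: SketchLexicon and LoadSketch are hypothetical stand-ins, and skipping blank lines is an assumption of the sketch rather than documented behaviour.

```scala
import scala.io.Source
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a mutable lexicon; not the library's class.
class SketchLexicon {
  val entries = ArrayBuffer.empty[String]

  // Add a single entry.
  def +=(phrase: String): Unit = { entries.append(phrase) }

  // Treat each line of the Source as one entry. Blank lines are skipped
  // (an assumption of this sketch). Returning `this` allows chaining,
  // as the `this.type` result type in the signatures suggests.
  def ++=(source: Source): this.type = {
    source.getLines().map(_.trim).filter(_.nonEmpty).foreach(line => this += line)
    this
  }

  // The String overload wraps the text in a Source and delegates.
  def ++=(phrases: String): this.type = this.++=(Source.fromString(phrases))
}

object LoadSketch {
  val lexicon = new SketchLexicon
  lexicon ++= "New York\nLos Angeles\n\nBoston"
}
```

After the call in LoadSketch, the stand-in lexicon holds three entries, one per non-blank line.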
def +=(phrase: String): Unit

Add a new lexicon entry consisting of one or more words. The Lexicon's tokenizer will be used to split the string, if possible.

Definition Classes
PhraseLexicon → MutableLexicon
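
To make the role of the tokenizer in += concrete, here is a small self-contained sketch. It assumes a whitespace tokenizer; the tokenize and add functions and the AddSketch object are hypothetical and only mimic the documented idea that an entry string is split into one or more words before it is stored.

```scala
object AddSketch {
  // Hypothetical whitespace tokenizer standing in for the lexicon's StringSegmenter.
  def tokenize(s: String): List[String] = s.trim.split("\\s+").toList.filter(_.nonEmpty)

  // Each stored entry is the token list of one phrase.
  val entries = scala.collection.mutable.Set.empty[List[String]]

  // Split the phrase into words and store the resulting token list.
  def add(phrase: String): Unit = {
    val tokens = tokenize(phrase)
    if (tokens.nonEmpty) entries.add(tokens)
  }

  add("New York City") // stored as a three-word phrase
  add("Boston")        // stored as a single-word entry
}
```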
final def ==(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def ==(arg0: Any): Boolean

Definition Classes
Any
object LexiconToken extends LexiconToken
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
def contains[T <: Observation[T]](query: T): Boolean

Is 'query' in the lexicon, accounting for lexicon phrases and the context of 'query'

Definition Classes
PhraseLexicon → Lexicon
def contains[T <: Observation[T]](query: Seq[T]): Boolean

Definition Classes
PhraseLexicon → Lexicon
def contains(untokenizedString: String): Boolean

Is the input String in the lexicon? The input is tokenized and lemmatized; if the tokenizer indicates that it is a multi-word phrase, it is processed by containsWords, otherwise by containsWord.

Definition Classes
Lexicon
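
The dispatch between the string-based lookups can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the whitespace tokenizer, the lowercasing lemmatizer, and the Set-based storage are all hypothetical. The method names mirror the ones documented on this page so that the flow is easy to follow.

```scala
object ContainsSketch {
  // Hypothetical stand-ins for the lexicon's tokenizer and lemmatizer.
  def tokenize(s: String): List[String] = s.trim.split("\\s+").toList.filter(_.nonEmpty)
  def lemmatize(w: String): String = w.toLowerCase

  // Entries are stored in lemmatized form, one token list per entry.
  val entries: Set[List[String]] = Set(List("new", "york"), List("boston"))

  // Input must already be lemmatized: true if any entry contains the word.
  def containsLemmatizedWord(word: String): Boolean = entries.exists(_.contains(word))
  // Input must already be lemmatized: true if the whole sequence is an entry.
  def containsLemmatizedWords(words: Seq[String]): Boolean = entries.contains(words.toList)

  // Input is lemmatized here, but not tokenized.
  def containsWord(word: String): Boolean = containsLemmatizedWord(lemmatize(word))
  def containsWords(words: Seq[String]): Boolean = containsLemmatizedWords(words.map(lemmatize))

  // Tokenize first, then dispatch on the number of tokens.
  def contains(untokenized: String): Boolean = {
    val words = tokenize(untokenized)
    if (words.length == 1) containsWord(words.head) else containsWords(words)
  }
}
```

In this sketch, ContainsSketch.contains("New York") goes through containsWords, while ContainsSketch.contains("BOSTON") goes through containsWord.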
def contains(span: TokenSpan): Boolean

Definition Classes
Lexicon
def containsLemmatizedWord(word: String): Boolean

Do any of the Lexicon entries contain the given word string?

Definition Classes
PhraseLexicon → Lexicon
def containsLemmatizedWords(words: Seq[String]): Boolean

Is the pre-tokenized sequence of words in the lexicon? The input words are expected to already be processed by the lemmatizer.

Definition Classes
PhraseLexicon → Lexicon
def containsSingle[T <: Observation[T]](query: T): Boolean

Is 'query' in the lexicon, ignoring context.
def containsWord(word: String): Boolean

Is this single word in the lexicon? The input String will not be processed by tokenizer, but will be processed by the lemmatizer.

Definition Classes
Lexicon
def containsWords(words: Seq[String]): Boolean

Is the pre-tokenized sequence of words in the lexicon? Each of the input words will be processed by the lemmatizer.

Definition Classes
Lexicon
val contents: HashMap[String, List[LexiconToken]]
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def hashCode(): Int

Definition Classes
AnyRef → Any
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
val lemmatizer: Lemmatizer

The string lemmatizer that simplifies lexicon entries and queries before searching for a match. For example, a common lemmatizer is one that lowercases all strings.

Definition Classes
PhraseLexicon → Lexicon
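
As a rough illustration of the lowercasing example mentioned above, the sketch below applies one lemmatizer to both the stored entries and the queries. SketchLemmatizer and its lemmatize method are hypothetical; the library's Lemmatizer type may expose a different interface.

```scala
// Hypothetical minimal interface; the library's Lemmatizer type may differ.
trait SketchLemmatizer { def lemmatize(word: String): String }

// A lemmatizer that lowercases all strings.
object LowercaseSketchLemmatizer extends SketchLemmatizer {
  def lemmatize(word: String): String = word.toLowerCase
}

object LemmatizerSketch {
  val lemmatizer: SketchLemmatizer = LowercaseSketchLemmatizer

  // Entries are lemmatized once, when they are stored.
  val stored: Set[String] = Set("Paris", "LONDON").map(lemmatizer.lemmatize)

  // Queries go through the same lemmatizer before the lookup.
  def contains(query: String): Boolean = stored.contains(lemmatizer.lemmatize(query))
}
```

Because both sides pass through the same function, a query matches regardless of the casing used when the entry was added.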
val name: String

An identifier for this lexicon, suitable for adding as a category to a FeatureVectorVariable[String].

Definition Classes
PhraseLexicon → Lexicon
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
def phrases: Seq[String]

String contains multiple newline-separated lexicon entries
def startsAt[T <: Observation[T]](query: T): Int

Return length of match, or -1 if no match.
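
The return convention of startsAt (match length, or -1) can be illustrated on plain token sequences. This sketch is not the library's method: it takes a token sequence and a start index instead of an Observation, and reporting the longest matching phrase is an assumption of the sketch, since the page does not say which match is reported when several phrases begin at the same position.

```scala
object StartsAtSketch {
  // Lexicon phrases as token lists (already lemmatized in this sketch).
  val phrases: Set[List[String]] =
    Set(List("new", "york"), List("new", "york", "city"), List("boston"))

  // Length of the longest phrase that begins at `start` in `tokens`,
  // or -1 if no phrase begins there.
  def startsAt(tokens: IndexedSeq[String], start: Int): Int = {
    val lengths = phrases.collect {
      case p if tokens.startsWith(p, start) => p.length
    }
    if (lengths.isEmpty) -1 else lengths.max
  }
}
```

For the tokens "in new york city today", a call with start index 1 finds both "new york" and "new york city" and reports the longer one, while start index 0 yields -1.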
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
val tokenizer: StringSegmenter

The string segmenter that breaks lexicon entries and queries into (potentially) multi-word phrases.

Definition Classes
PhraseLexicon → Lexicon
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )

class PhraseLexicon extends MutableLexicon

Linear Supertypes

MutableLexicon, Lexicon, AnyRef, Any