cc.factorie.app.nlp.lexicon

WordLexicon

class WordLexicon extends MutableLexicon

A Lexicon that can hold only single-word entries, but is efficient for that case. It provides methods to check whether a String or Token (or, more generally, a cc.factorie.app.chain.Observation) is in the lexicon.

Instance Constructors

  1. new WordLexicon(name: String, tokenizer: StringSegmenter = ..., lemmatizer: Lemmatizer = ...)
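
A minimal usage sketch of the constructor and the members listed on this page. It assumes FACTORIE is on the classpath and relies on the default tokenizer and lemmatizer, whose exact behavior is not stated here. The lexicon name "colors" and the object name WordLexiconExample are illustrative only.

```scala
import cc.factorie.app.nlp.lexicon.WordLexicon

object WordLexiconExample {
  // "colors" is an arbitrary name; tokenizer and lemmatizer keep their defaults.
  val colors = new WordLexicon("colors")

  def main(args: Array[String]): Unit = {
    colors += "red"            // add one single-word entry
    colors ++= "green\nblue"   // add newline-separated entries from a String
    // containsWord skips the tokenizer and applies only the lemmatizer.
    println(colors.containsWord("red"))
    // contents exposes the underlying HashSet of stored entries.
    println(colors.contents.size)
  }
}
```

Only single-word entries are added in this sketch, because this page does not say how the class treats a multi-word phrase passed to +=.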

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. def ++=(file: File, enc: String = "UTF-8"): WordLexicon.this.type

    Add all lines from the input File to this lexicon. The File contains multiple newline-separated lexicon entries.

    Definition Classes
    MutableLexicon
  5. def ++=(phrases: String): WordLexicon.this.type

    Add all lines from the input String to this lexicon. The String contains multiple newline-separated lexicon entries.

    Definition Classes
    MutableLexicon
  6. def ++=(source: Source): WordLexicon.this.type

    Add all lines from the input Source to this lexicon. The Source is assumed to contain multiple newline-separated lexicon entries.

    Definition Classes
    MutableLexicon
  7. def +=(phrase: String): Unit

    Tokenize and lemmatize the input String and add it as a single entry to the Lexicon.

    Definition Classes
    WordLexicon → MutableLexicon
  8. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  10. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  11. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  12. def contains[T <: Observation[T]](query: Seq[T]): Boolean

    Definition Classes
    WordLexicon → Lexicon
  13. def contains[T <: Observation[T]](query: T): Boolean

    Is this Token (or more generally Observation) a member of a phrase in the lexicon (including single-word phrases)? The query.string will be processed by the lemmatizer. For example, if query.string is "New", query.next.string is "York", and the two-word phrase "New York" is in the lexicon, then this method will return true. But if query.next.string is "shoes" (and "New shoes" is not in the lexicon), this method will return false.

    Definition Classes
    WordLexicon → Lexicon
  14. def contains(untokenizedString: String): Boolean

    Is the input String in the lexicon? The input is tokenized and lemmatized; if the tokenizer indicates that it is a multi-word phrase, it will be processed by containsWords, otherwise by containsWord.

    Definition Classes
    Lexicon
  15. def contains(span: TokenSpan): Boolean

    Definition Classes
    Lexicon
  16. final def containsLemmatizedWord(word: String): Boolean

    Is this single word in the lexicon? The input String will not be processed by the tokenizer, but will be processed by the lemmatizer.

    Definition Classes
    WordLexicon → Lexicon
  17. def containsLemmatizedWords(words: Seq[String]): Boolean

    Is the pre-tokenized sequence of words in the lexicon? The input words are expected to already be processed by the lemmatizer.

    Definition Classes
    WordLexicon → Lexicon
  18. def containsWord(word: String): Boolean

    Is this single word in the lexicon? The input String will not be processed by the tokenizer, but will be processed by the lemmatizer.

    Definition Classes
    Lexicon
  19. def containsWords(words: Seq[String]): Boolean

    Is the pre-tokenized sequence of words in the lexicon? Each of the input words will be processed by the lemmatizer.

    Definition Classes
    Lexicon
  20. val contents: HashSet[String]

  21. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  22. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  23. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  24. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  25. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  26. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  27. val lemmatizer: Lemmatizer

    The string lemmatizer that simplifies lexicon entries and queries before searching for a match. For example, a common lemmatizer is one that lowercases all strings.

    Definition Classes
    WordLexicon → Lexicon
  28. val name: String

    An identifier for this lexicon, suitable for adding as a category to a FeatureVectorVariable[String].

    Definition Classes
    WordLexicon → Lexicon
  29. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  30. final def notify(): Unit

    Definition Classes
    AnyRef
  31. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  32. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  33. def toString(): String

    Definition Classes
    AnyRef → Any
  34. val tokenizer: StringSegmenter

    The string segmenter that breaks lexicon entries and queries into (potentially) multi-word phrases.

    Definition Classes
    WordLexicon → Lexicon
  35. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  36. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  37. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
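
The add and lookup members described above share one pattern: entries and queries pass through the tokenizer and lemmatizer before being matched against a set of single words. The dependency-free sketch below models that pattern. ToyWordLexicon is a hypothetical stand-in, not the FACTORIE implementation; the whitespace tokenizer, the lowercasing lemmatizer, and the choice to skip multi-word phrases on add are all assumptions made for illustration.

```scala
import scala.collection.mutable

// Toy stand-in for WordLexicon; NOT the FACTORIE implementation.
// Assumed defaults: whitespace tokenizer, lowercasing lemmatizer.
final class ToyWordLexicon(
    val name: String,
    tokenizer: String => Seq[String] =
      s => s.trim.split("\\s+").filter(_.nonEmpty).toSeq,
    lemmatizer: String => String = _.toLowerCase
) {
  // Lemmatized single-word entries.
  val contents = mutable.HashSet.empty[String]

  // Tokenize and lemmatize the phrase, then store it if it is one word.
  // Skipping multi-word phrases is an assumption of this sketch.
  def +=(phrase: String): Unit = {
    val words = tokenizer(phrase)
    if (words.length == 1) contents += lemmatizer(words.head)
  }

  // Add one entry per non-empty, newline-separated line.
  def ++=(phrases: String): this.type = {
    phrases.split("\n").map(_.trim).filter(_.nonEmpty)
      .foreach(line => this += line)
    this
  }

  // Lemmatize (but do not tokenize) the word, then look it up.
  def containsWord(word: String): Boolean =
    contents.contains(lemmatizer(word))

  // A word sequence can match only if it has exactly one word.
  def containsWords(words: Seq[String]): Boolean =
    words.length == 1 && containsWord(words.head)

  // Tokenize first, then dispatch on the number of words.
  def contains(untokenized: String): Boolean = {
    val words = tokenizer(untokenized)
    if (words.length == 1) containsWord(words.head) else containsWords(words)
  }
}
```

Under these assumed defaults, after `lex ++= "Red\ngreen\nBlue"` the call `lex.contains("RED")` returns true because both the entry and the query are lowercased, while `lex.contains("dark red")` returns false because the query tokenizes to two words.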
