package segment

Type Members

  1. class BigramStatistics extends AnyRef

  2. class CUHChainChineseWordSegmenter extends ChainChineseWordSegmenter

  3. class ChainChineseWordSegmenter extends DocumentAnnotator

  4. class DehyphenatingTokenizer[T <: DocumentAnnotator] extends DocumentAnnotator

    Concatenates words that were split by hyphens in the original text, based on a user-provided dictionary or on other words in the same document.

  5. class DeterministicSentenceSegmenter extends DocumentAnnotator

    Segments a sequence of tokens into sentences.

  6. class DeterministicTokenizer extends DocumentAnnotator

    Splits a String into a sequence of Tokens.

  7. class MSRChainChineseWordSegmenter extends ChainChineseWordSegmenter

    A linear-chain CRF model for Chinese word segmentation with four companion objects, each pre-trained on a different corpus that corresponds to a different variety of written Mandarin.

  8. class OntonotesNormalizedTokenString extends PlainNormalizedTokenString

  9. class PhraseSectionList extends ArrayBuffer[Section]

    A sequence of sections which are tokenized as phrases.

  10. class PhraseTokenizer extends DocumentAnnotator

    A tokenizer that merges existing tokens when they form one of the given phrases.

  11. class PhraseTrie extends AnyRef

  12. class PlainNormalizedTokenString extends TokenString

  13. class PunktTokenizer extends DocumentAnnotator

  14. abstract class SegmentationLabelDomain extends CategoricalDomain[String] with SegmentedCorpusLabeling

  15. trait SegmentedCorpusLabeling extends AnyRef

  16. sealed trait SentenceBoundaryInference extends AnyRef

  17. class TokenNormalizer1[A <: TokenString] extends DocumentAnnotator

    Cleans up Tokens.

  18. sealed trait TokenType extends AnyRef
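
The dehyphenation idea described for DehyphenatingTokenizer can be illustrated with a minimal standalone sketch. It does not use this package's classes: the object name `DehyphenationSketch`, the method `dehyphenate`, and the plain `Seq[String]` token representation are all hypothetical, and the real annotator may apply different rules. The sketch joins a token ending in a hyphen with the following token only when the joined word appears in a known vocabulary, which here is built from the other words in the same document.

```scala
// Standalone sketch; names are hypothetical and this is not the package's API.
object DehyphenationSketch {

  /** Joins a "left-" token with the next token when the joined word is in `known`. */
  def dehyphenate(tokens: Seq[String], known: Set[String]): Seq[String] = {
    val out = scala.collection.mutable.ArrayBuffer.empty[String]
    var i = 0
    while (i < tokens.length) {
      val tok = tokens(i)
      val canJoin = tok.length > 1 && tok.endsWith("-") && i + 1 < tokens.length
      // Candidate word with the trailing hyphen removed
      val joined = if (canJoin) tok.dropRight(1) + tokens(i + 1) else ""
      if (canJoin && known.contains(joined.toLowerCase)) {
        out += joined // consume both halves
        i += 2
      } else {
        out += tok    // leave the token untouched
        i += 1
      }
    }
    out.toSeq
  }

  def main(args: Array[String]): Unit = {
    val tokens = Seq("The", "seg-", "menter", "handles", "well-", "known", "segmenter", "cases")
    // Assumption: the vocabulary is the set of unhyphenated words in the same document
    val known = tokens.filterNot(_.endsWith("-")).map(_.toLowerCase).toSet
    println(dehyphenate(tokens, known).mkString(" "))
  }
}
```

In this example "seg-" and "menter" are joined because "segmenter" occurs elsewhere in the token sequence, while "well-" and "known" stay separate because "wellknown" is not in the vocabulary.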

Value Members

  1. object A extends TokenType with Product with Serializable

  2. object AS extends TokenType with Product with Serializable

  3. object BIOSegmentationDomain extends SegmentationLabelDomain

  4. object BritishToAmerican extends HashMap[String, String]

  5. object CUHChainChineseWordSegmenter extends CUHChainChineseWordSegmenter

  6. object DefaultRules

  7. object DeterministicSentenceSegmenter extends DeterministicSentenceSegmenter

  8. object DeterministicTokenizer extends DeterministicTokenizer

  9. object JointlyAcrossDocuments extends SentenceBoundaryInference with Product with Serializable

  10. object MSRChainChineseWordSegmenter extends MSRChainChineseWordSegmenter

  11. object Non extends SentenceBoundaryInference with Product with Serializable

  12. object OntonotesTokenNormalizer extends TokenNormalizer1[OntonotesNormalizedTokenString]

  13. object PerDocument extends SentenceBoundaryInference with Product with Serializable

  14. object PhraseTokenizerModes extends Enumeration

  15. object PlainTokenNormalizer extends TokenNormalizer1[PlainNormalizedTokenString]

  16. object PunktSentenceSegmenter

  17. object PunktTokenizer extends PunktTokenizer

  18. object S extends TokenType with Product with Serializable

  19. object U extends TokenType with Product with Serializable
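
The phrase merging described for PhraseTokenizer under Type Members can be sketched in the same standalone way. This is an illustration under stated assumptions, not the package's implementation: `PhraseMergeSketch` and `mergePhrases` are hypothetical names, tokens are plain strings, and a simple set lookup with greedy longest match stands in for whatever PhraseTrie and PhraseTokenizerModes actually do.

```scala
// Standalone sketch; names are hypothetical and this is not the package's API.
object PhraseMergeSketch {

  /** Greedily merges the longest known phrase starting at each position. */
  def mergePhrases(tokens: Seq[String], phrases: Set[List[String]]): Seq[String] = {
    val maxLen = if (phrases.isEmpty) 0 else phrases.map(_.length).max
    val out = scala.collection.mutable.ArrayBuffer.empty[String]
    var i = 0
    while (i < tokens.length) {
      // Try the longest candidate first, down to two tokens
      val upper = math.min(maxLen, tokens.length - i)
      val matchLen = (upper to 2 by -1).find { n =>
        phrases.contains(tokens.slice(i, i + n).toList)
      }
      matchLen match {
        case Some(n) =>
          out += tokens.slice(i, i + n).mkString(" ") // merged phrase token
          i += n
        case None =>
          out += tokens(i)                            // no phrase starts here
          i += 1
      }
    }
    out.toSeq
  }

  def main(args: Array[String]): Unit = {
    val phrases = Set(List("New", "York"), List("New", "York", "City"))
    val tokens = Seq("She", "moved", "to", "New", "York", "City", "from", "New", "Jersey")
    println(mergePhrases(tokens, phrases).mkString(" | "))
  }
}
```

Preferring the longest match means "New York City" is merged as one token rather than stopping at "New York"; a trie would give the same result while avoiding the repeated slice lookups.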