The set of phrases to be picked.
The mode. If ADD_SEPARATELY, the new sections are added only to the attribute. If ADD_TO_SECTIONS, the new sections are added to the document. If REPLACE_SECTIONS, the existing sections in the document are replaced.
How the annotation of this DocumentAnnotator should be printed as extra information after the one-word-per-line (OWPL) output.
How the annotation of this DocumentAnnotator should be printed as extra information after the one-word-per-line (OWPL) output. If there is no document annotation, return the empty string. Used in Document.owplString.
The mode.
The mode. If ADD_SEPARATELY, the new sections are added only to the attribute. If ADD_TO_SECTIONS, the new sections are added to the document. If REPLACE_SECTIONS, the existing sections in the document are replaced.
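The three modes can be illustrated with a small standalone sketch. It is not the library's API: `Mode`, `Doc`, `apply_sections`, and the attribute key `"new_sections"` are hypothetical names, and the sketch assumes the new sections are stored in the attribute in every mode, which is one reading of "only" in the description above.

```python
from enum import Enum, auto
from typing import Dict, List


class Mode(Enum):
    ADD_SEPARATELY = auto()
    ADD_TO_SECTIONS = auto()
    REPLACE_SECTIONS = auto()


class Doc:
    """Hypothetical stand-in for a document with sections and attributes."""

    def __init__(self, sections: List[str]) -> None:
        self.sections: List[str] = list(sections)
        self.attr: Dict[str, List[str]] = {}


def apply_sections(doc: Doc, new_sections: List[str], mode: Mode) -> None:
    # Assumption: the attribute always receives the new sections.
    doc.attr["new_sections"] = list(new_sections)
    if mode is Mode.ADD_TO_SECTIONS:
        # Keep the existing sections and append the new ones.
        doc.sections.extend(new_sections)
    elif mode is Mode.REPLACE_SECTIONS:
        # Discard the existing sections in favor of the new ones.
        doc.sections = list(new_sections)
    # ADD_SEPARATELY: the document's own sections stay untouched.
```

With ADD_SEPARATELY, a consumer has to read the new sections from the attribute; with the other two modes they also show up when iterating over the document's sections.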
How the annotation of this DocumentAnnotator should be printed in one-word-per-line (OWPL) format.
How the annotation of this DocumentAnnotator should be printed in one-word-per-line (OWPL) format. If there is no per-token annotation, return null. Used in Document.owplString.
A tokenizer that merges existing tokens when they form one of the given phrases.
Uses a trie-like data structure to simulate the finite automaton for tokenization efficiently. Matching is greedy: if a shorter phrase is a prefix of a longer phrase and both match at the same position, the longer one is picked.
This version gets all attributes from the last token in the phrase.
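The greedy longest-match behavior described above can be sketched as follows. This is a minimal standalone illustration, not the library's implementation: `TrieNode`, `build_trie`, and `merge_phrases` are hypothetical names, tokens are modeled as plain dictionaries, and joining the merged text with a single space is an assumption.

```python
from typing import Dict, List


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_phrase_end = False


def build_trie(phrases: List[List[str]]) -> TrieNode:
    """Insert each phrase (a list of words) into a word-level trie."""
    root = TrieNode()
    for phrase in phrases:
        node = root
        for word in phrase:
            node = node.children.setdefault(word, TrieNode())
        node.is_phrase_end = True
    return root


def merge_phrases(tokens: List[dict], root: TrieNode) -> List[dict]:
    """Merge runs of tokens that spell a known phrase, preferring the longest."""
    merged: List[dict] = []
    i = 0
    while i < len(tokens):
        node, j, longest_end = root, i, -1
        # Walk the trie as far as the tokens allow, remembering the last
        # position at which a complete phrase ended.
        while j < len(tokens) and tokens[j]["text"] in node.children:
            node = node.children[tokens[j]["text"]]
            j += 1
            if node.is_phrase_end:
                longest_end = j
        if longest_end > i:
            span = tokens[i:longest_end]
            new_token = dict(span[-1])  # attributes come from the last token
            new_token["text"] = " ".join(t["text"] for t in span)  # assumed separator
            merged.append(new_token)
            i = longest_end
        else:
            merged.append(tokens[i])
            i += 1
    return merged


trie = build_trie([["New", "York"], ["New", "York", "City"]])
words = [("I", "PRP"), ("love", "VBP"), ("New", "NNP"), ("York", "NNP"), ("City", "NN")]
result = merge_phrases([{"text": w, "pos": p} for w, p in words], trie)
print([(t["text"], t["pos"]) for t in result])
# [('I', 'PRP'), ('love', 'VBP'), ('New York City', 'NN')]
```

Both "New York" and "New York City" match at the third token; the longer phrase wins, and the merged token carries the "NN" tag of its last token, "City".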