The splitter module receives lists of word objects (produced either by the tokenizer or by any other means in the calling application) and buffers them until a sentence boundary is detected. It then returns a list of sentence objects.
The splitter's buffer may retain some of the tokens if the given list did not end with a clear sentence boundary. The calling application can submit further token lists to be added, or request that the splitter flush the buffer.
The API for the splitter class is:
class splitter {
  public:
    /// Constructor. Receives a file with the desired options
    splitter(const std::string &);

    /// Add list of words to the buffer, and return complete sentences
    /// that can be built.
    /// The boolean states if a buffer flush has to be forced (true) or
    /// some words may remain in the buffer (false) if the splitter
    /// wants to wait to see what is coming next.
    std::list<sentence> split(const std::list<word> &, bool);
};
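The buffering contract described above can be illustrated with a small self-contained sketch. It does not use the library itself: `toy_splitter`, its `pending()` helper, and the `word` and `sentence` aliases are hypothetical stand-ins, and the boundary rule (a token equal to ".", "?" or "!") is an assumption made only for this example. The real splitter takes its rules from the options file given to the constructor and may behave differently.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>

// Hypothetical stand-ins, NOT the library's types: a word is a plain
// string and a sentence is a list of words.
using word = std::string;
using sentence = std::list<word>;

// Toy model of the buffering contract only. Assumption: a sentence
// boundary is any token equal to ".", "?" or "!".
class toy_splitter {
  public:
    // Append words to the buffer and return every sentence completed so
    // far. If flush is true, whatever remains in the buffer is returned
    // as a final sentence; otherwise it is kept for the next call.
    std::list<sentence> split(const std::list<word> &words, bool flush) {
        std::list<sentence> complete;
        for (const word &w : words) {
            buffer_.push_back(w);
            if (w == "." || w == "?" || w == "!") {
                complete.push_back(buffer_);
                buffer_.clear();
            }
        }
        if (flush && !buffer_.empty()) {
            complete.push_back(buffer_);
            buffer_.clear();
        }
        return complete;
    }

    // Number of words still waiting in the buffer (helper for this sketch).
    std::size_t pending() const { return buffer_.size(); }

  private:
    sentence buffer_;
};

// Walks through the three situations described in the text.
void demo() {
    toy_splitter sp;

    // The list does not end at a boundary: one sentence comes back,
    // and "How are" stays in the buffer.
    std::list<sentence> first = sp.split({"Hello", "world", ".", "How", "are"}, false);
    assert(first.size() == 1);
    assert(sp.pending() == 2);

    // Further tokens complete the buffered sentence.
    std::list<sentence> second = sp.split({"you", "?"}, false);
    assert(second.size() == 1 && second.front().size() == 4);

    // No boundary at the end of input: force a flush.
    std::list<sentence> last = sp.split({"Fine", "thanks"}, true);
    assert(last.size() == 1 && sp.pending() == 0);
}
```

With the real class the calling pattern would presumably be the same: pass `false` while more input is expected, and pass `true` on the final call so that no words are left behind in the buffer.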