The number detection module is language dependent: it recognizes numerical expressions (e.g., 1,220.54 or two-hundred sixty-five) and assigns them a normalized value as lemma.
The module is basically a finite-state automaton that recognizes valid numerical expressions. Since the structure of the automaton and the actions to compute the actual numerical value are different for each language, each supported language has its own implementation of the automaton.
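To make the idea of a number-recognizing automaton concrete, here is a minimal sketch for English number words between 1 and 999. It is not the FreeLing implementation: the names `Tok`, `State`, and `parse_number_words`, the set of states, and the small lexicon are all assumptions made for illustration. Each transition both checks that the next word is allowed and updates the accumulated value.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical token classes and automaton states (illustration only).
enum class Tok { Unit, Teen, Tens, Hundred };
enum class State { Start, FirstUnit, AfterHundred, AfterTens, Closed };

// Recognizes expressions such as "two-hundred sixty-five" and stores the
// numerical value in `value`. Returns false if the input is not accepted.
bool parse_number_words(const std::string &text, long &value) {
    static const std::map<std::string, std::pair<Tok, long>> lexicon = {
        {"one", {Tok::Unit, 1}}, {"two", {Tok::Unit, 2}}, {"three", {Tok::Unit, 3}},
        {"four", {Tok::Unit, 4}}, {"five", {Tok::Unit, 5}}, {"six", {Tok::Unit, 6}},
        {"seven", {Tok::Unit, 7}}, {"eight", {Tok::Unit, 8}}, {"nine", {Tok::Unit, 9}},
        {"ten", {Tok::Teen, 10}}, {"eleven", {Tok::Teen, 11}}, {"twelve", {Tok::Teen, 12}},
        {"thirteen", {Tok::Teen, 13}}, {"fourteen", {Tok::Teen, 14}},
        {"fifteen", {Tok::Teen, 15}}, {"sixteen", {Tok::Teen, 16}},
        {"seventeen", {Tok::Teen, 17}}, {"eighteen", {Tok::Teen, 18}},
        {"nineteen", {Tok::Teen, 19}},
        {"twenty", {Tok::Tens, 20}}, {"thirty", {Tok::Tens, 30}}, {"forty", {Tok::Tens, 40}},
        {"fifty", {Tok::Tens, 50}}, {"sixty", {Tok::Tens, 60}}, {"seventy", {Tok::Tens, 70}},
        {"eighty", {Tok::Tens, 80}}, {"ninety", {Tok::Tens, 90}},
        {"hundred", {Tok::Hundred, 100}}};

    // Treat hyphens as word separators, then read the words one by one.
    std::string cleaned = text;
    for (char &c : cleaned) if (c == '-') c = ' ';
    std::istringstream in(cleaned);

    State st = State::Start;
    long acc = 0;
    std::string word;
    while (in >> word) {
        auto it = lexicon.find(word);
        if (it == lexicon.end()) return false;  // unknown word: reject
        const Tok tok = it->second.first;
        const long n = it->second.second;
        switch (st) {
        case State::Start:
            if (tok == Tok::Unit) { acc = n; st = State::FirstUnit; }
            else if (tok == Tok::Teen) { acc = n; st = State::Closed; }
            else if (tok == Tok::Tens) { acc = n; st = State::AfterTens; }
            else return false;
            break;
        case State::FirstUnit:  // a unit may only be followed by "hundred"
            if (tok == Tok::Hundred) { acc *= 100; st = State::AfterHundred; }
            else return false;
            break;
        case State::AfterHundred:
            if (tok == Tok::Unit || tok == Tok::Teen) { acc += n; st = State::Closed; }
            else if (tok == Tok::Tens) { acc += n; st = State::AfterTens; }
            else return false;
            break;
        case State::AfterTens:  // "sixty" may be followed by a unit only
            if (tok == Tok::Unit) { acc += n; st = State::Closed; }
            else return false;
            break;
        case State::Closed:  // nothing may follow a complete number
            return false;
        }
    }
    if (st == State::Start) return false;  // empty input is not a number
    value = acc;
    return true;
}
```

Both the word classes and the arithmetic attached to each transition are specific to English in this sketch. That is the reason the text gives for the automaton differing from one language to another.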
For languages that do not have a specific automaton implemented, a generic module is used to recognize number-like expressions that contain numerical digits.
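As a rough illustration of what normalizing a digit-based expression can involve, the sketch below rewrites a string such as 1,220.54 into a plain form, given the decimal and thousands symbols. The function name `normalize_digits` and its lenient behavior (it does not check that thousands groups have three digits) are assumptions for this example, not FreeLing code.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (illustration only): strips thousands symbols and maps
// the decimal symbol to '.', so "1,220.54" becomes "1220.54".
// Returns false if the text contains anything other than digits and the two
// symbols, has more than one decimal symbol, or has no digits at all.
bool normalize_digits(const std::string &text, const std::string &decimal,
                      const std::string &thousands, std::string &out) {
    std::string result;
    bool seen_decimal = false;
    bool seen_digit = false;
    std::size_t i = 0;
    while (i < text.size()) {
        if (std::isdigit(static_cast<unsigned char>(text[i]))) {
            result += text[i];
            seen_digit = true;
            ++i;
        } else if (!decimal.empty() &&
                   text.compare(i, decimal.size(), decimal) == 0) {
            if (seen_decimal) return false;  // at most one decimal symbol
            seen_decimal = true;
            result += '.';
            i += decimal.size();
        } else if (!thousands.empty() && !seen_decimal &&
                   text.compare(i, thousands.size(), thousands) == 0) {
            i += thousands.size();  // grouping symbol carries no value
        } else {
            return false;  // not a number-like expression
        }
    }
    if (!seen_digit) return false;
    out = result;
    return true;
}
```

Passing the symbols as arguments means the same routine handles both 1,220.54 (decimal ".", thousands ",") and 1.220,54 (decimal ",", thousands ".").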
For the reasons described so far, there are no options and no configuration file to provide to the class when it is instantiated. The API of the class is:
    class numbers {
      public:
        /// Constructor: receives the language code, and the decimal
        /// and thousand point symbols
        numbers(const std::string &, const std::string &, const std::string &);

        /// Detect number expressions in given sentence
        void annotate(sentence &);
    };
The parameters that the constructor expects are, in order: the language code, the decimal point symbol, and the thousand point symbol.
Lluís Padró 2010-09-02