Multiword Recognition Module

This module aggregates input tokens into a single word object if they are found in a given list of multiwords.

The API for this module is:

class automat {
   public:
      /// Constructor
      automat();

      /// Detect patterns in given sentence
      void annotate(sentence &);
};

class locutions: public automat {
   public:
      /// Constructor, receives the name of the file
      ///  containing the multiwords to recognize.
      locutions(const std::string &);
};

Class automat implements a generic finite-state automaton (FSA). The locutions class derives from it and implements an FSA that recognizes the word patterns listed in the file given to the constructor.
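
To illustrate the aggregation step, here is a minimal, self-contained sketch of greedy longest-match multiword merging over plain strings. It is not the module's implementation: it uses a set lookup instead of an FSA, and the names join_tokens and merge_multiwords, the underscore-joined entry format, and the max_len parameter are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not part of the library): joins tokens[begin..end)
// into one string, separating the parts with '_'.
static std::string join_tokens(const std::vector<std::string> &tokens,
                               std::size_t begin, std::size_t end) {
    assert(begin < end && end <= tokens.size());
    std::string joined = tokens[begin];
    for (std::size_t k = begin + 1; k < end; ++k) joined += "_" + tokens[k];
    return joined;
}

// Hypothetical sketch: greedy longest-match merge. At each position the
// longest candidate span (up to max_len tokens) is tried first; if its
// joined form is in the multiword list, the span becomes a single token.
// Assumes the list stores entries in underscore-joined form.
std::vector<std::string> merge_multiwords(
        const std::vector<std::string> &tokens,
        const std::set<std::string> &multiwords,
        std::size_t max_len) {
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < tokens.size()) {
        std::size_t matched = 1;  // default: keep the token on its own
        const std::size_t longest = std::min(max_len, tokens.size() - i);
        for (std::size_t len = longest; len >= 2; --len) {
            if (multiwords.count(join_tokens(tokens, i, i + len)) > 0) {
                matched = len;
                break;
            }
        }
        out.push_back(join_tokens(tokens, i, i + matched));
        i += matched;
    }
    return out;
}
```

In this sketch, the tokens "in", "spite", "of" collapse into the single token "in_spite_of" when that entry is in the list. In the module itself, the corresponding work is done by calling annotate on a sentence, with the multiword list read from the file passed to the locutions constructor.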




Lluís Padró 2010-09-02