Abstract class implementing a finite-state automaton, used by modules that recognize multiwords (dates, numbers, quantities, ...).
#include <automat.h>
Public Member Functions
automat ()
Constructor.
virtual ~automat ()
Destructor.
void annotate (sentence &)
Detect patterns in sentence.
bool annotate (sentence &, sentence::iterator &)
Detect patterns starting at a specific word.
Protected Attributes
int initialState
State code of the initial state.
int stopState
State code of the stop state.
int trans [MAX_STATES][MAX_TOKENS]
Transition tables.
std::set< int > Final
Set of final states.
Private Member Functions
virtual int ComputeToken (int, sentence::iterator &, sentence &)=0
Pure virtual function to be provided by the child class.
virtual void ResetActions ()=0
Pure virtual function to be provided by the child class.
virtual void StateActions (int, int, int, sentence::const_iterator)=0
Pure virtual function to be provided by the child class.
virtual void SetMultiwordAnalysis (sentence::iterator, int)=0
Pure virtual function to be provided by the child class.
virtual bool ValidMultiWord (const word &)
Virtual function (returns true by default).
virtual sentence::iterator BuildMultiword (sentence &, sentence::iterator, sentence::iterator, int, bool &)
Private function to rearrange the sentence when a match is found.
Abstract class implementing a finite-state automaton, used by modules that recognize multiwords (dates, numbers, quantities, ...).
Details:
Child classes must provide a constructor that:
Child classes must provide the virtual functions:
Child classes must declare and manage any private attributes or functions they need to perform the expected computations.
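The relationship between the base class and its children can be illustrated with a minimal sketch. It does not use the real automat.h: `mini_automat`, `accepts()`, `digit_run`, the `ST_*` and `TK_*` codes, and the values chosen for MAX_STATES and MAX_TOKENS are all hypothetical, and words are reduced to plain strings. It assumes the child constructor is where initialState, stopState, trans and Final get filled, which is consistent with the "Referenced by" lists of those attributes.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the base class (not the FreeLing API).
const int MAX_STATES = 3;  // assumed value, for illustration only
const int MAX_TOKENS = 2;  // assumed value, for illustration only

class mini_automat {
public:
  virtual ~mini_automat() {}
  // Walk the table over the words; accept if the walk ends in a final state.
  bool accepts(const std::vector<std::string> &words) {
    int state = initialState;
    for (std::size_t i = 0; i < words.size() && state != stopState; ++i)
      state = trans[state][ComputeToken(state, words[i])];
    return Final.count(state) > 0;
  }
protected:
  int initialState;
  int stopState;
  int trans[MAX_STATES][MAX_TOKENS];
  std::set<int> Final;
private:
  // Provided by the child: maps a word to a token code (a column of trans).
  virtual int ComputeToken(int state, const std::string &form) = 0;
};

// Hypothetical child recognizing a run of all-digit words.
class digit_run : public mini_automat {
  enum { ST_START = 0, ST_NUM = 1, ST_STOP = 2 };
  enum { TK_OTHER = 0, TK_DIGITS = 1 };
public:
  digit_run() {
    initialState = ST_START;
    stopState = ST_STOP;
    Final.insert(ST_NUM);
    // Every transition leads to the stop state unless set otherwise below.
    for (int s = 0; s < MAX_STATES; ++s)
      for (int t = 0; t < MAX_TOKENS; ++t)
        trans[s][t] = ST_STOP;
    trans[ST_START][TK_DIGITS] = ST_NUM;
    trans[ST_NUM][TK_DIGITS] = ST_NUM;
  }
private:
  int ComputeToken(int, const std::string &form) override {
    if (form.empty()) return TK_OTHER;
    for (std::size_t i = 0; i < form.size(); ++i)
      if (!std::isdigit(static_cast<unsigned char>(form[i]))) return TK_OTHER;
    return TK_DIGITS;
  }
};
```

With this sketch, `digit_run().accepts(...)` holds for a sequence of all-digit words and fails as soon as a non-digit word drives the automaton into the stop state.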
automat::automat()
Constructor.
Create an instance of the class, initializing options member.
Since automat is an abstract class, this constructor is always called from child constructors.
virtual automat::~automat() [inline, virtual]
Destructor.
bool automat::annotate(sentence &se, sentence::iterator &i)
Detect patterns starting at a specific word.
Check the given word in the sentence as a possible pattern start.
Recognize the longest pattern.
References BuildMultiword(), ComputeToken(), Final, initialState, ResetActions(), StateActions(), stopState, TRACE, TRACE_SENTENCE, and trans.
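The longest-match behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `longest_match_end` and `demo_table` are hypothetical names, the words are assumed to be already converted to token codes, and the function returns an index instead of working on sentence iterators. The point it shows is that the walk does not stop at the first final state; it remembers the last final state reached and continues until the stop state or the end of the input.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Returns the index of the last token of the longest match that begins at
// `start`, or -1 if no final state is ever reached. (Hypothetical helper.)
int longest_match_end(const std::vector<std::vector<int> > &trans,
                      const std::set<int> &finals, int initial, int stop,
                      const std::vector<int> &tokens, int start) {
  int state = initial;
  int last_final = -1;
  for (int i = start; i < static_cast<int>(tokens.size()); ++i) {
    state = trans[state][tokens[i]];
    if (state == stop) break;                 // no pattern can continue here
    if (finals.count(state)) last_final = i;  // remember the longest match so far
  }
  return last_final;
}

// Example table: states 0 = start, 1 = number (final), 2 = quantity (final),
// 3 = stop; tokens 0 = other, 1 = digits, 2 = unit. All codes are made up.
std::vector<std::vector<int> > demo_table() {
  return {
    {3, 1, 3},
    {3, 1, 2},
    {3, 3, 3},
    {3, 3, 3},
  };
}
```

For the token sequence digits, digits, unit, other, digits, a search from position 0 passes through the final "number" state twice and ends at position 2, where the "quantity" state is reached, rather than stopping at the first final state.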
void automat::annotate(sentence &se)
Detect patterns in sentence.
Check each word in the sentence as a possible pattern start.
Recognize the longest pattern starting at the first possible start found. Repeat the process from the first word after the recognized pattern, until the sentence ends.
Reimplemented in np.
References BuildMultiword(), ComputeToken(), Final, initialState, ResetActions(), StateActions(), stopState, TRACE, TRACE_SENTENCE, and trans.
Referenced by maco::analyze(), quantities::annotate(), numbers::annotate(), dates::annotate(), quantities_en::ComputeToken(), quantities_gl::ComputeToken(), quantities_ca::ComputeToken(), and quantities_es::ComputeToken().
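The sentence-level scan described for annotate(sentence &) can be sketched as a loop over start positions. The names `matcher`, `scan` and `run_matcher` are hypothetical, and the per-position search is passed in as a callback that returns the last index of the longest match or -1. This is only a sketch of the control flow (try each word as a start, and after a match resume at the first word after it); the real function operates on sentence objects and rebuilds them.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Callback: last index of the longest match starting at the given position,
// or -1 when nothing matches there. (Hypothetical interface.)
typedef std::function<int(std::size_t)> matcher;

// Collects the [first, last] index pairs of all non-overlapping matches.
std::vector<std::pair<std::size_t, std::size_t> >
scan(std::size_t n, const matcher &longest_from) {
  std::vector<std::pair<std::size_t, std::size_t> > spans;
  std::size_t i = 0;
  while (i < n) {
    int end = longest_from(i);
    if (end >= 0 && static_cast<std::size_t>(end) >= i) {
      spans.push_back(std::make_pair(i, static_cast<std::size_t>(end)));
      i = static_cast<std::size_t>(end) + 1;  // resume after the match
    } else {
      ++i;  // no pattern starts here; try the next word
    }
  }
  return spans;
}

// Toy matcher for the example: a match is a maximal run of `true` flags.
matcher run_matcher(const std::vector<bool> &flags) {
  return [flags](std::size_t start) -> int {
    int end = -1;
    for (std::size_t i = start; i < flags.size() && flags[i]; ++i)
      end = static_cast<int>(i);
    return end;
  };
}
```

Skipping to the word after a recognized pattern means matches never overlap, which agrees with the "repeat from the first word after the recognized pattern" wording.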
sentence::iterator automat::BuildMultiword(sentence &se, sentence::iterator start, sentence::iterator end, int fs, bool &built) [private, virtual]
Private function to rearrange the sentence when a match is found.
Rearrange the sentence, grouping all words from start to end into a single multiword.
Reimplemented in np.
References ResetActions(), SetMultiwordAnalysis(), TRACE, and ValidMultiWord().
Referenced by annotate().
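The grouping step can be sketched on a plain list of strings. Everything here is an assumption made for illustration: `word_list`, `merge_span`, `accept_all` and `reject_all` are hypothetical names, the merged form joins the original forms with underscores, and a rejected multiword leaves the list untouched and returns the start position. The real function works on sentence and word objects and also calls SetMultiwordAnalysis() and ResetActions(), which this sketch omits.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>

typedef std::list<std::string> word_list;

// Merges the words in [start, end] (both inclusive) into one element if the
// validation callback accepts the merged form. `built` reports whether the
// merge happened; the returned iterator points at the merged element, or at
// `start` when nothing was built.
word_list::iterator merge_span(word_list &se, word_list::iterator start,
                               word_list::iterator end,
                               bool (*valid)(const std::string &),
                               bool &built) {
  word_list::iterator past = std::next(end);
  std::string form;
  for (word_list::iterator i = start; i != past; ++i) {
    if (i != start) form += "_";
    form += *i;
  }
  built = valid(form);  // last-minute check before touching the list
  if (!built) return start;
  word_list::iterator pos = se.erase(start, past);
  return se.insert(pos, form);
}

bool accept_all(const std::string &) { return true; }
bool reject_all(const std::string &) { return false; }
```

Returning an iterator together with the `built` flag lets the caller continue scanning from a valid position whether or not the multiword was created.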
virtual int automat::ComputeToken(int, sentence::iterator &, sentence &) [private, pure virtual]
pure virtual function to be provided by the child class.
Computes token code for current word in current state.
Implemented in dates_default, dates_es, dates_ca, dates_en, locutions, np, numbers_default, numbers_es, numbers_ca, numbers_gl, numbers_it, numbers_en, quantities_default, quantities_es, quantities_ca, quantities_gl, and quantities_en.
Referenced by annotate().
virtual void automat::ResetActions() [private, pure virtual]
pure virtual function to be provided by the child class.
Resets automaton internal variables when a new search is started.
Implemented in dates_default, dates_es, dates_ca, dates_en, locutions, np, numbers_default, numbers_es, numbers_ca, numbers_gl, numbers_it, numbers_en, quantities_default, quantities_es, quantities_ca, quantities_gl, and quantities_en.
Referenced by annotate(), and BuildMultiword().
virtual void automat::SetMultiwordAnalysis(sentence::iterator, int) [private, pure virtual]
pure virtual function to be provided by the child class.
Sets analysis for pattern identified as a multiword.
Implemented in dates_default, dates_es, dates_ca, dates_en, locutions, np, numbers_default, numbers_es, numbers_ca, numbers_gl, numbers_it, numbers_en, quantities_default, quantities_es, quantities_ca, quantities_gl, and quantities_en.
Referenced by BuildMultiword().
virtual void automat::StateActions(int, int, int, sentence::const_iterator) [private, pure virtual]
pure virtual function to be provided by the child class.
Performs appropriate internal actions, given origin and destination states, token code and word.
Implemented in dates_default, dates_es, dates_ca, dates_en, locutions, np, numbers_default, numbers_es, numbers_ca, numbers_gl, numbers_it, numbers_en, quantities_default, quantities_es, quantities_ca, quantities_gl, and quantities_en.
Referenced by annotate().
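How ResetActions() and StateActions() might cooperate can be sketched with a toy number recognizer. This is a hypothetical example, not code from any child class: `number_actions`, its `value` and `words_seen` members and the `TK_*` token codes are made up, and the word is passed as a string rather than a sentence::const_iterator. The idea shown is that the child keeps its own state, updates it on every transition, and clears it whenever a new search starts.

```cpp
#include <cassert>
#include <string>

// Hypothetical per-search state kept by a child class.
struct number_actions {
  enum { TK_DIGITS = 0, TK_THOUSAND = 1 };  // made-up token codes

  long value;      // numeric value accumulated so far
  int words_seen;  // number of words consumed in the current search

  number_actions() { ResetActions(); }

  // Called when a new search starts: forget everything accumulated.
  void ResetActions() {
    value = 0;
    words_seen = 0;
  }

  // Called on each transition; the states are unused in this toy example.
  void StateActions(int origin, int destination, int token,
                    const std::string &form) {
    (void)origin;
    (void)destination;
    if (token == TK_DIGITS)
      value = std::stol(form);  // e.g. "3"
    else if (token == TK_THOUSAND)
      value *= 1000;            // e.g. "thousand"
    ++words_seen;
  }
};
```

After processing "3" and then "thousand", the accumulated value in this sketch is 3000; a value like this is the kind of information a child could later use when SetMultiwordAnalysis() is called.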
bool automat::ValidMultiWord(const word &w) [private, virtual]
Virtual function (returns true by default).
Allows the child class to perform a last-minute check before actually building the multiword.
Child classes can redefine this function to perform the desired checks.
Reimplemented in locutions, and np.
Referenced by BuildMultiword().
std::set<int> automat::Final [protected]
Set of final states.
Referenced by annotate(), dates_ca::dates_ca(), dates_default::dates_default(), dates_en::dates_en(), dates_es::dates_es(), locutions::locutions(), np::np(), numbers_ca::numbers_ca(), numbers_default::numbers_default(), numbers_en::numbers_en(), numbers_es::numbers_es(), numbers_gl::numbers_gl(), numbers_it::numbers_it(), quantities_ca::quantities_ca(), quantities_default::quantities_default(), quantities_en::quantities_en(), quantities_es::quantities_es(), and quantities_gl::quantities_gl().
int automat::initialState [protected]
State code of the initial state.
Referenced by annotate(), dates_ca::dates_ca(), dates_default::dates_default(), dates_en::dates_en(), dates_es::dates_es(), locutions::locutions(), np::np(), numbers_ca::numbers_ca(), numbers_default::numbers_default(), numbers_en::numbers_en(), numbers_es::numbers_es(), numbers_gl::numbers_gl(), numbers_it::numbers_it(), quantities_ca::quantities_ca(), quantities_default::quantities_default(), quantities_en::quantities_en(), quantities_es::quantities_es(), and quantities_gl::quantities_gl().
int automat::stopState [protected]
State code of the stop state.
Referenced by annotate(), dates_ca::dates_ca(), dates_default::dates_default(), dates_en::dates_en(), dates_es::dates_es(), locutions::locutions(), np::np(), numbers_ca::numbers_ca(), numbers_default::numbers_default(), numbers_en::numbers_en(), numbers_es::numbers_es(), numbers_gl::numbers_gl(), numbers_it::numbers_it(), quantities_ca::quantities_ca(), quantities_default::quantities_default(), quantities_en::quantities_en(), quantities_es::quantities_es(), and quantities_gl::quantities_gl().
int automat::trans[MAX_STATES][MAX_TOKENS] [protected]
Transition tables.
Referenced by annotate(), numbers_it::ComputeToken(), dates_ca::dates_ca(), dates_default::dates_default(), dates_en::dates_en(), dates_es::dates_es(), locutions::locutions(), np::np(), numbers_ca::numbers_ca(), numbers_default::numbers_default(), numbers_en::numbers_en(), numbers_es::numbers_es(), numbers_gl::numbers_gl(), numbers_it::numbers_it(), quantities_ca::quantities_ca(), quantities_default::quantities_default(), quantities_en::quantities_en(), quantities_es::quantities_es(), and quantities_gl::quantities_gl().