Class affixes implements suffixation and prefixation rules and dictionary search for affixed word forms.
#include <suffixes.h>
Public Member Functions | |
affixes (const std::string &, const std::string &) | |
Constructor. | |
void | look_for_affixes (word &, dictionary &) |
look up possible roots of a suffixed/prefixed form | |
Private Member Functions | |
void | look_for_affixes_in_list (int, std::multimap< std::string, sufrule > &, word &, dictionary &) const |
find all applicable affix rules for a word | |
void | look_for_combined_affixes (std::multimap< std::string, sufrule > &, std::multimap< std::string, sufrule > &, word &, dictionary &) const |
find all applicable prefix+suffix rule combinations for a word | |
std::set< std::string > | GenerateRoots (int, const sufrule &, const std::string &) const |
generate roots according to rules. | |
void | SearchRootsList (std::set< std::string > &, const std::string &, sufrule &, word &, dictionary &) const |
find roots in dictionary and apply matching rules | |
void | ApplyRule (const std::string &, const std::list< analysis > &, const std::string &, sufrule &, word &, dictionary &) const |
actually apply an affix rule | |
void | CheckRetokenizable (const sufrule &, const std::string &, const std::string &, const std::string &, dictionary &, std::list< word > &) const |
auxiliary method to deal with retokenization | |
Private Attributes | |
accents | accen |
Language-specific accent handler. | |
std::multimap< std::string, sufrule > | affix [2] |
all suffixation/prefixation rules | |
std::multimap< std::string, sufrule > | affix_always [2] |
suffixation/prefixation rules applied unconditionally | |
std::set< unsigned int > | ExistingLength [2] |
index of existing suffix/prefix lengths. | |
unsigned int | Longest [2] |
Length of longest suffix/prefix. |
Class affixes implements suffixation and prefixation rules and dictionary search for affixed word forms.
affixes::affixes (const std::string &Lang, const std::string &sufFile)
Constructor.
Create an analyzer for suffixed and prefixed words.
References sufrule::acc, affix, affix_always, sufrule::always, sufrule::enc, ERROR_CRASH, ExistingLength, sufrule::lema, Longest, sufrule::nomore, sufrule::output, PREF, sufrule::retok, SUF, sufrule::term, and TRACE.
void affixes::ApplyRule (const std::string &, const std::list< analysis > &, const std::string &, sufrule &, word &, dictionary &) const [private]
actually apply an affix rule
Referenced by look_for_combined_affixes().
void affixes::CheckRetokenizable (const sufrule &suf, const std::string &form, const std::string &lem, const std::string &tag, dictionary &dic, std::list< word > &rtk) const [private]
auxiliary method to deal with retokenization
Check whether the suffix carries retokenization information, and create alternative word list if necessary.
References sufrule::retok, dictionary::search_form(), and TRACE.
std::set< std::string > affixes::GenerateRoots (int kind, const sufrule &suf, const std::string &rt) const [private]
generate roots according to rules.
Generate all possible forms expanding root rt with all possible terminations according to the given suffix rule.
References PREF, SUF, sufrule::term, and TRACE.
Referenced by look_for_affixes_in_list(), and look_for_combined_affixes().
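The root expansion described for GenerateRoots can be illustrated with a small standalone sketch. This is not the library's implementation: the function name `generate_roots`, the enum `AffixKind`, the assumption that the rule's terminations arrive as one `|`-separated string, and the use of `*` for the empty termination are all hypothetical choices made for the example.

```cpp
#include <set>
#include <string>

// Hypothetical stand-ins for the SUF/PREF constants referenced by the class.
enum AffixKind { SUF_KIND = 0, PREF_KIND = 1 };

// Sketch of root generation.
// Assumptions (not confirmed by the reference page): `terms` is a
// '|'-separated list of terminations, and "*" means "no termination".
std::set<std::string> generate_roots(AffixKind kind,
                                     const std::string &terms,
                                     const std::string &rt) {
  std::set<std::string> candidates;
  std::string::size_type start = 0;
  while (true) {
    // Cut out the next termination, up to the next '|' or the end.
    std::string::size_type bar = terms.find('|', start);
    std::string t = (bar == std::string::npos)
                        ? terms.substr(start)
                        : terms.substr(start, bar - start);
    if (t == "*") t.clear();
    // Suffix rules extend the root on the right, prefix rules on the left.
    candidates.insert(kind == SUF_KIND ? rt + t : t + rt);
    if (bar == std::string::npos) break;
    start = bar + 1;
  }
  return candidates;
}
```

Under these assumptions, calling `generate_roots(SUF_KIND, "ar|er|*", "cant")` yields the candidate set `cant`, `cantar`, `canter`, each of which would then be looked up in the dictionary.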
void affixes::look_for_affixes (word &w, dictionary &dic)
look up possible roots of a suffixed/prefixed form
Words that already have an analysis are only subject to the affix rules marked "always". Words not yet recognized are subject to all affix rules.
References affix, affix_always, look_for_affixes_in_list(), look_for_combined_affixes(), PREF, SUF, and TRACE.
Referenced by dictionary::annotate_word().
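The choice between the full rule table and the "always" table can be sketched as follows. The types `Rule`, `Word`, and `AffixTables` and the function `select_rules` are hypothetical stand-ins that only mirror the shape of the `affix` and `affix_always` attributes; the real `sufrule` and `word` classes carry much more state.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for sufrule and word.
struct Rule { std::string output_tag; };
struct Word {
  std::string form;
  std::vector<std::string> analyses;  // empty means "not yet recognized"
};

using RuleTable = std::multimap<std::string, Rule>;

// Hypothetical stand-ins for the SUF/PREF array indices.
enum AffixKind { SUF_KIND = 0, PREF_KIND = 1 };

// Mirrors the two private attributes: one table per affix kind.
struct AffixTables {
  RuleTable affix[2];         // all rules
  RuleTable affix_always[2];  // rules applied unconditionally
};

// Already analyzed words only see the "always" rules;
// unrecognized words see every rule of the requested kind.
const RuleTable &select_rules(const AffixTables &t, AffixKind kind,
                              const Word &w) {
  return w.analyses.empty() ? t.affix[kind] : t.affix_always[kind];
}
```

Keeping the "always" rules in a separate table, as the class attributes suggest, means an analyzed word never has to scan rules that cannot apply to it.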
void affixes::look_for_affixes_in_list (int kind, std::multimap< std::string, sufrule > &suff, word &w, dictionary &dic) const [private]
find all applicable affix rules for a word
References accen, ExistingLength, accents::fix_accentuation(), GenerateRoots(), Longest, PREF, SearchRootsList(), SUF, and TRACE.
Referenced by look_for_affixes().
void affixes::look_for_combined_affixes (std::multimap< std::string, sufrule > &suff, std::multimap< std::string, sufrule > &pref, word &w, dictionary &dic) const [private]
find all applicable prefix+suffix rule combinations for a word
References accen, ApplyRule(), ExistingLength, accents::fix_accentuation(), GenerateRoots(), Longest, PREF, SearchRootsList(), SUF, and TRACE.
Referenced by look_for_affixes().
void affixes::SearchRootsList (std::set< std::string > &, const std::string &, sufrule &, word &, dictionary &) const [private]
find roots in dictionary and apply matching rules
Referenced by look_for_affixes_in_list(), and look_for_combined_affixes().
accents affixes::accen [private] |
Language-specific accent handler.
Referenced by look_for_affixes_in_list(), and look_for_combined_affixes().
std::multimap<std::string,sufrule> affixes::affix[2] [private] |
all suffixation/prefixation rules
Referenced by affixes(), and look_for_affixes().
std::multimap<std::string,sufrule> affixes::affix_always[2] [private] |
suffixation/prefixation rules applied unconditionally
Referenced by affixes(), and look_for_affixes().
std::set<unsigned int> affixes::ExistingLength[2] [private] |
index of existing suffix/prefix lengths.
Referenced by affixes(), look_for_affixes_in_list(), and look_for_combined_affixes().
unsigned int affixes::Longest[2] [private] |
Length of longest suffix/prefix.
Referenced by affixes(), look_for_affixes_in_list(), and look_for_combined_affixes().
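One plausible use of a length index such as ExistingLength and Longest is to avoid probing affix lengths that no rule has. The sketch below shows that idea for suffixes only. `LengthIndex`, `build_index`, and `candidate_suffixes` are hypothetical names, lengths are counted in bytes rather than characters, and the requirement that at least one character of root remains is an assumption of the example.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical per-kind index mirroring ExistingLength and Longest.
struct LengthIndex {
  std::set<unsigned int> existing;  // lengths for which some rule exists
  unsigned int longest = 0;         // length of the longest affix
};

// Build the index from the keys of a rule table (the affix strings).
template <typename Rule>
LengthIndex build_index(const std::multimap<std::string, Rule> &rules) {
  LengthIndex idx;
  for (const auto &entry : rules) {
    unsigned int len = static_cast<unsigned int>(entry.first.size());
    idx.existing.insert(len);
    idx.longest = std::max(idx.longest, len);
  }
  return idx;
}

// Return the endings of `form` worth looking up in the rule table:
// only lengths present in the index, and (assumption) always leaving
// at least one character for the root.
std::vector<std::string> candidate_suffixes(const std::string &form,
                                            const LengthIndex &idx) {
  std::vector<std::string> out;
  for (unsigned int len : idx.existing) {  // std::set iterates ascending
    if (len >= form.size()) break;
    out.push_back(form.substr(form.size() - len));
  }
  return out;
}
```

With rules keyed by `s`, `es`, and `mente`, the index holds lengths 1, 2, and 5, so a five-letter form such as `casas` is probed only for its endings of length 1 and 2.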