Linguistic Data Classes
The different processing modules work on objects containing
linguistic data (such as a word, a PoS tag, a sentence...).
Your application must be aware of those classes in order to
give each processing module the right data
and to interpret the module results correctly.
The Linguistic Data classes are defined in the libfries library. Refer
to that library's documentation for details on the classes.
The linguistic classes are:
- analysis: A tuple <lemma, PoS tag, probability, sense list>.
- word: A word form with a list of possible analyses.
- sentence: A list of words known to be a complete
sentence. A sentence may have an associated parse_tree object and an associated dep_tree object.
- parse_tree: An n-ary tree where each node contains
either a non-terminal label or, if the node is a leaf, a pointer
to the appropriate word object in the sentence the tree
belongs to.
- dep_tree: An n-ary tree where each node contains a
reference to a node in a parse_tree. The structure of the dep_tree
establishes syntactic dependency relationships between sentence constituents.
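The relationships among these classes can be sketched as follows. This is only an illustrative model of the descriptions above, written in Python for brevity; it is not the libfries API. All class, field, and function names (Analysis, Word, ParseNode, DepNode, Sentence, leaves, build_example) are hypothetical, and the tags and probabilities in the example are made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Analysis:
    """Hypothetical model of the tuple <lemma, PoS tag, probability, sense list>."""
    lemma: str
    pos_tag: str
    probability: float
    senses: List[str] = field(default_factory=list)


@dataclass
class Word:
    """A word form with its list of possible analyses."""
    form: str
    analyses: List[Analysis] = field(default_factory=list)


@dataclass
class ParseNode:
    """Inner nodes carry a non-terminal label; leaves refer to a Word."""
    label: Optional[str] = None
    word: Optional[Word] = None
    children: List["ParseNode"] = field(default_factory=list)


@dataclass
class DepNode:
    """Each dependency node refers to a node of the parse tree."""
    constituent: ParseNode
    children: List["DepNode"] = field(default_factory=list)


@dataclass
class Sentence:
    """A list of words, optionally with a parse tree and a dependency tree."""
    words: List[Word] = field(default_factory=list)
    parse_tree: Optional[ParseNode] = None
    dep_tree: Optional[DepNode] = None


def leaves(node: ParseNode) -> List[Word]:
    """Collect the Word objects at the leaves, left to right."""
    if node.word is not None:
        return [node.word]
    found: List[Word] = []
    for child in node.children:
        found.extend(leaves(child))
    return found


def build_example() -> Sentence:
    """Build a toy sentence; tags and probabilities are illustrative only."""
    the = Word("The", [Analysis("the", "DT", 1.0)])
    cat = Word("cat", [Analysis("cat", "NN", 1.0)])
    sleeps = Word("sleeps", [Analysis("sleep", "VBZ", 0.9),
                             Analysis("sleep", "NNS", 0.1)])
    # Parse tree leaves point at the same Word objects the sentence holds.
    np = ParseNode(label="NP", children=[ParseNode(word=the), ParseNode(word=cat)])
    vp = ParseNode(label="VP", children=[ParseNode(word=sleeps)])
    root = ParseNode(label="S", children=[np, vp])
    # Dependency tree: the verb phrase governs the noun phrase.
    dep = DepNode(constituent=vp, children=[DepNode(constituent=np)])
    return Sentence(words=[the, cat, sleeps], parse_tree=root, dep_tree=dep)
```

In this sketch the parse tree leaves and the sentence share the same word objects, and the dependency tree refers to parse tree nodes rather than copying them, which mirrors the pointer and reference wording in the list above.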
Lluís Padró
2010-09-02