<GRLAB> contains two kinds of lines.
The first kind are the lines defining UNIQUE labels, which have the format:
UNIQUE label1 label2 label3 ...
You can specify many UNIQUE lines, each with one or more labels. The effect is the same as having all of them on a single line, and the order is not relevant.
Labels in UNIQUE lists will be assigned only once per head. That is, if a head has a daughter with a dependency already labeled as label1, rules assigning this label will be ignored for all other daughters of the same head (e.g. if a verb has a subject label for one of its dependencies, no other dependency will get that label, even if it meets the conditions to do so).
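The once-per-head behaviour can be sketched in a few lines of Python. This is only an illustration, not the FreeLing implementation: the function assign_labels, the daughter names, and the labels other and adjunct are hypothetical, and the sketch assumes that the first applicable proposal for a daughter wins.

```python
def assign_labels(candidates, unique_labels):
    """Assign labels to the daughters of ONE head.

    candidates: (daughter, label) proposals in rule order (hypothetical input).
    unique_labels: labels listed in UNIQUE lines.
    Assumption: the first applicable proposal for a daughter is kept.
    """
    assigned = {}
    used = set()
    for daughter, label in candidates:
        if daughter in assigned:
            continue  # this daughter already received a label
        if label in unique_labels and label in used:
            continue  # UNIQUE label already given to another daughter
        assigned[daughter] = label
        used.add(label)
    return assigned


unique = {"subj", "dobj"}
proposals = [
    ("np1", "subj"),
    ("np2", "subj"),     # ignored: subj is UNIQUE and already used
    ("np2", "other"),
    ("pp1", "adjunct"),
    ("pp2", "adjunct"),  # allowed: adjunct is not UNIQUE
]
result = assign_labels(proposals, unique)
```

Here np2 also meets the conditions for subj, but the label has already been given to np1, so np2 falls through to the next proposal.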
The second kind of lines state the rules to label the dependencies extracted from the full parse tree built with the rules in the previous section.
Each line contains a rule, with the format:
ancestor-label dependence-label condition1 condition2 ...
where:
ancestor-label is the label of the node which is head of the dependency.
dependence-label is the label to be assigned to the dependency.
condition1 condition2 ... is a list of conditions that the dependency has to match to satisfy the rule.
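As a rough illustration of the two line formats, the following Python sketch splits the lines of the section into a set of UNIQUE labels and a list of rules. The function name parse_grlab_lines is hypothetical, and the sketch assumes that fields are separated by whitespace and that each condition is written without internal spaces, as in the example rules of this section.

```python
def parse_grlab_lines(lines):
    """Split section lines into UNIQUE labels and labeling rules.

    Assumption: fields are whitespace-separated and every condition
    is a single token such as d.label=np*.
    """
    unique = set()
    rules = []
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == "UNIQUE":
            # Several UNIQUE lines behave like a single merged line.
            unique.update(fields[1:])
        else:
            # Raises ValueError if the line has fewer than two fields.
            ancestor, dep_label, *conditions = fields
            rules.append((ancestor, dep_label, conditions))
    return unique, rules


unique, rules = parse_grlab_lines([
    "UNIQUE subj dobj",
    "UNIQUE iobj",
    "verb-phr subj d.label=np* d.side=left",
])
```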
Each condition has one of the forms:
node.attribute = value
node.attribute != value
where node is a string describing a node on which the attribute has to be checked. The value is a string to be matched, or a set of strings (separated by "|"). The strings can be right-wildcarded (e.g. np* is allowed, but not n*p). For the pos attribute, value can be any valid regular expression.
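The matching of a value can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the actual matcher: the helper names parse_condition and value_matches are hypothetical, conditions are assumed to be single tokens without spaces, and the regular expression for pos is assumed to have to match the whole tag.

```python
import re


def parse_condition(token):
    """Split a token such as 'd:sn.tonto=Edible' into its parts."""
    idx = token.index("=")
    negated = idx > 0 and token[idx - 1] == "!"
    left = token[:idx - 1] if negated else token[:idx]
    value = token[idx + 1:]
    # The attribute is whatever follows the last dot of the left side.
    node, attribute = left.rsplit(".", 1)
    return node, attribute, negated, value


def value_matches(attribute, actual, value):
    """Check one attribute value against the value of a condition."""
    if attribute == "pos":
        # Assumption: the regular expression must match the whole tag.
        return re.fullmatch(value, actual) is not None
    for alternative in value.split("|"):
        if alternative.endswith("*"):
            # Right wildcard: only the prefix has to match.
            if actual.startswith(alternative[:-1]):
                return True
        elif actual == alternative:
            return True
    return False
```

For a condition using !=, the result of value_matches would simply be negated by the caller.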
The node expresses a path to locate the node to be checked. The path must start with p (parent node) or d (descendant node), and may be followed by a colon-separated list of labels. For instance, p:sn:n refers to the first node labeled n found under a node labeled sn which is under the dependency parent p.
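A path such as p:sn:n could be resolved along the following lines. The Node class and the resolve_path function are hypothetical and exist only for this sketch; in particular, it assumes that "under" means a direct child, which the description above does not state.

```python
class Node:
    """Toy tree node used only for this illustration."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)


def resolve_path(spec, parent, daughter):
    """Return the node addressed by a path like 'p:sn:n', or None."""
    start, *labels = spec.split(":")
    if start not in ("p", "d"):
        raise ValueError("path must start with p or d")
    node = parent if start == "p" else daughter
    for label in labels:
        # Assumption: take the first direct child carrying the label.
        node = next((c for c in node.children if c.label == label), None)
        if node is None:
            return None  # the path does not exist in this tree
    return node


n = Node("n")
sn = Node("sn", [Node("adj"), n])
p = Node("verb-phr", [sn])
d = sn
```

With this toy tree, resolve_path("p:sn:n", p, d) returns the node n, while resolve_path("d:pp", p, d) returns None because d has no child labeled pp.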
The node may also be As (All siblings) or Es (Exists sibling), which will check the list of all children of the ancestor (p), excluding the focus daughter (d). As and Es may be followed by a path, just like p and d. For instance, Es:sn:n will check for a sibling with that path, and As:sn:n will check that all siblings have that path.
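The difference between Es and As amounts to an existential versus a universal check over the siblings, as the following Python sketch shows. The function check_siblings and the dictionary-based nodes are hypothetical. Note that, in this sketch, As is trivially satisfied when the focus daughter has no siblings; whether the real labeler behaves the same way is an assumption.

```python
def check_siblings(quantifier, children, focus, predicate):
    """Apply predicate to every child of the head except the focus daughter."""
    siblings = [c for c in children if c is not focus]
    if quantifier == "Es":
        return any(predicate(s) for s in siblings)  # at least one sibling
    if quantifier == "As":
        return all(predicate(s) for s in siblings)  # every sibling
    raise ValueError("quantifier must be Es or As")


children = [{"label": "pp-to"}, {"label": "np-obj"}, {"label": "adv"}]
focus = children[0]


def is_np(node):
    return node["label"].startswith("np")


has_np_sibling = check_siblings("Es", children, focus, is_np)
all_siblings_np = check_siblings("As", children, focus, is_np)
```

In this example one sibling is a noun phrase and the other is not, so the Es check succeeds and the As check fails.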
Possible attributes to be used:
label: chunk label (or PoS tag) of the node.
side: (left or right) position of the specified node with respect to the other. Only valid for p and d.
lemma: lemma of the node head word.
pos: PoS tag of the node head word.
class: word class (see below) of the lemma of the node head word.
tonto: EWN Top Ontology properties of the node head word.
semfile: WN semantic file of the node head word.
synon: synonym lemmas of the node head word (according to WN).
asynon: synonym lemmas of the node head word ancestors (according to WN).
Note that since no disambiguation is required, the attributes dealing with semantic properties will be satisfied if any of the word senses matches the condition.
For instance, the rule:
verb-phr subj d.label=np* d.side=left
states that if a verb-phr node has a daughter to its left, with a label starting with np, this dependency is to be labeled as subj.
Similarly, the rule:
verb-phr obj d.label=np* d:sn.tonto=Edible p.lemma=eat|gulp
states that if a verb-phr node has eat or gulp as lemma, and a descendant with a label starting with np and containing a daughter labeled sn that has the Edible property in the EWN Top Ontology, this dependency is to be labeled as obj.
Another example:
verb-phr iobj d.label=pp* d.lemma=to|for Es.label=np*
states that if a verb-phr has a descendant with a label starting with pp (prepositional phrase) and lemma to or for, and there is another child of the same parent which is a noun phrase (np*), this dependency is to be labeled as iobj.
Yet another:
verb-phr dobj d.label=pp* d.lemma=to|for As.label!=np*
states that if a verb-phr has a descendant with a label starting with pp (prepositional phrase) and lemma to or for, and none of the other children of the same parent is a noun phrase (np*), this dependency is to be labeled as dobj.
Lluís Padró 2010-09-02