L-rule
Revision as of 21:09, 20 August 2013
An L-rule (linear rule) is a specific type of transformation rule used for applying transformations over ordered sequences of isolated nodes.
When to use L-rules
L-rules are used for:
- reordering nodes in a list: a b c > a c b
- replacing nodes in a list: a b c > a x c
- adding nodes in a list: a b c > a x b c
- deleting nodes in a list: a b c > a c
When not to use L-rules
L-rules are not used in transformations over structures other than lists (i.e., trees and graphs). In these cases, we use S-rules (syntactic rules).
Syntax
The general syntax for L-rules is the following:
<CONDITION> := <ACTION>;
Where:
- <CONDITION> is a single node or a sequence of nodes over which actions will take place; and
- <ACTION> is the action to be performed over each node or sequence of nodes of the CONDITION.
Examples
RULE | BEFORE > AFTER | DESCRIPTION |
---|---|---|
("a")("b")("c"):=("d")("e")("f"); | abc > def | "a" will be replaced by "d"; "b" by "e"; and "c" by "f" |
("a")("b")("c"):=("d")( )( ); | abc > dbc | "a" will be replaced by "d"; "b" and "c" will be preserved |
("a")("b")("c"):=("d")("")(""); | abc > d | "a" will be replaced by "d"; "b" and "c" will be replaced by "" (i.e., blank) |
("a")("b")("c"):=("d",%01)(%02); | abc > db | "a" will be replaced by "d"; "b" will be preserved; "c" will be deleted |
("a")("b")("c"):=("d",%01); | abc > d | "a" will be replaced by "d"; "b" and "c" will be deleted |
("a")("b")("c"):=(%03)(%02)(%01); | abc > cba | "a", "b" and "c" will be preserved, but reordered |
("a")("b")("c"):=("d",%01)(%03); | abc > dc | "a" will be replaced by "d"; "b" will be deleted; "c" will be preserved |
("a")("b")("c"):=("d",%01)("g")(%02)(%03); | abc > dgbc | "a" will be replaced by "d"; "b" and "c" will be preserved; and a new node "g" will be created between "a" and "b" |
("a",ART)(BLK)("/[aeiou].*/"):=("an")( )( ); | a adjective > an adjective | replace the article (ART) "a" by "an" before a blank space (BLK) and a node starting with "a", "e", "i", "o" or "u"; preserve the second node (BLK) and the third node without any change |
("a",PRE)(BLK)("a",ART):=("à",PRE,ART,CTC); | a a > à | replace the preposition (PRE) "a" + blank (BLK) + article (ART) "a" by "à"; add the features PRE (preposition), ART (article) and CTC (contraction) to the node "à" |
("de",PRE)(BLK)("le",ART):=("du",PRE,ART,CTC); | de le > du | replace the preposition (PRE) "de" + blank (BLK) + article (ART) "le" by "du"; add the features PRE, ART and CTC to the node "du" |
("a",VER)(BLK)("il",PPR):=( )("-t-",-BLK)( ); | a il > a-t-il | replace the blank space (BLK) between the verb (VER) "a" and the pronoun (PPR) "il" by "-t-"; remove the feature BLK from the second form; preserve the first and the third form without any change |
("de",PRE)(BLK)("/[aeiou].*/"):=("d'",%01)(%03); | de avoir > d'avoir | replace the preposition (PRE) "de" + blank space (BLK) + a node starting with "a", "e", "i", "o" or "u" by "d'"; delete the second form (BLK); and preserve the third form (%03) without any change |
Transformations
Properties
- L-rules are recursive: rules will apply while conditions are true
- The rule "(BLK):=("-");" will transform "a b c d e" into "a-b-c-d-e" (and not only into "a-b c d e")
- The rule "(X):=(+Y);" will never stop (i.e., it contains an infinite loop): the feature Y will keep being added indefinitely (X,Y,Y,Y,Y,Y,Y,Y,...)
- The symbol ^ is used for negation and may be used to prevent infinite loops
- (X,^Y):=(+Y); (= add the feature Y to a node containing the feature X that does not contain the feature Y yet)
- (^".")(STAIL):=(%01)(".")(%02); (Add a period before the end of the sentence if there is not a period yet)
- Rules are conservative. No feature is changed or deleted unless explicitly indicated through "-".
- In the rule ("x",FEA):=("y"); the string "x" is replaced by the string "y", but the feature FEA is not altered (i.e., the final state will be ("y",FEA)).
- The rule "("a",ART)(BLK)("/a[bcd]e/"):=("an")( )( );" does not affect the status of the second and the third nodes. On the other hand, the rule "("a",VER)(BLK)("il",PPR):=( )("-t-",-BLK)( );" alters the status of the second node by deleting the feature BLK.
- In the ACTION field, changes may be expressed by the right side of A-rules inside each form. The default is replacement.
- The rule "("a",ART)(BLK)("/[aeiou].*/"):=("an")( )( );" could also be expressed as "("a",ART)(BLK)("/[aeiou].*/"):=(0>"n")( )( );", i.e., the change from "a" to "an" could be expressed either by "an" or 0>"n".
- Rules apply only if all conditions are true.
- The rule "("a")(BLK)("/[aeiou].*/"):=("an")( )( );" will apply only in case of "a" before a blank and a node starting with "a", "e", "i", "o" or "u".
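The recursive behaviour and the role of ^ described above can be illustrated with a minimal Python sketch. It is not the actual rule engine: nodes are modelled as plain strings or as feature lists, and the names apply_until_stable, blank_to_hyphen and add_y, as well as the max_steps cut-off, are assumptions made only for this illustration.

```python
def apply_until_stable(state, rule, max_steps=50):
    """Re-apply `rule` until it reports that its condition no longer matches.

    `rule` returns the new state, or None when nothing matches.
    `max_steps` is a hypothetical safety cut-off, not part of the formalism.
    """
    for _ in range(max_steps):
        result = rule(state)
        if result is None:
            return state
        state = result
    raise RuntimeError("rule still applies after %d steps" % max_steps)


def blank_to_hyphen(nodes):
    """One application of (BLK):=("-"); over a list of strings (" " stands for BLK)."""
    for i, text in enumerate(nodes):
        if text == " ":
            return nodes[:i] + ["-"] + nodes[i + 1:]
    return None


def add_y(features, guarded):
    """One application of (X):=(+Y); or, when guarded, of (X,^Y):=(+Y);"""
    if "X" in features and not (guarded and "Y" in features):
        return features + ["Y"]
    return None


# Every blank is replaced, not only the first one.
print("".join(apply_until_stable(list("a b c d e"), blank_to_hyphen)))

# The guarded rule stops after adding Y once.
print(apply_until_stable(["X"], lambda f: add_y(f, guarded=True)))
```

Calling apply_until_stable with the unguarded version of add_y never reaches a stable state and ends in the RuntimeError, which mirrors the infinite loop of "(X):=(+Y);".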
Indexes
Indexes are used to associate nodes in the left side of the rule (CONDITION) to nodes in the right side of the rule (ACTION):
- (%a)(%b)(%c):=(%b); (delete the first and the third nodes, and keep the second)
- (%a)(%b)(%c):=(%c)(%b)(%a); (reverse the order)
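How indexes link the two sides of a rule can be sketched in Python as follows. This is a simplified model under stated assumptions: nodes are plain strings, indexes are positions counted from 1, and the function apply_action with its (index, text) pairs is a hypothetical representation chosen for this example, not part of the L-rule formalism.

```python
def apply_action(matched, action):
    """Build the output nodes from the matched nodes and an indexed action.

    matched: strings of the nodes matched by the CONDITION, in order.
    action:  list of (index, text) pairs, one per node of the ACTION.
             index None -> a new node is created with `text`;
             text None  -> the indexed node is kept unchanged.
    Matched nodes whose index never appears in the action are deleted.
    """
    output = []
    for index, text in action:
        if index is None:
            output.append(text)
        else:
            original = matched[index - 1]
            output.append(original if text is None else text)
    return output


nodes = ["a", "b", "c"]

# (%a)(%b)(%c):=(%b);  keep only the second node
print(apply_action(nodes, [(2, None)]))

# (%a)(%b)(%c):=(%c)(%b)(%a);  reverse the order
print(apply_action(nodes, [(3, None), (2, None), (1, None)]))

# ("a")("b")("c"):=("d",%01)(%03);  replace the first, delete the second
print(apply_action(nodes, [(1, "d"), (3, None)]))
```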
Indexation is done automatically by the machine, as follows:
- if the number of nodes is the same in the left and in the right side, NODES ARE CO-INDEXED
- ("a")("b")("c"):=("d")("e")("f"); is the same as ("a",%01)("b",%02)("c",%03):=("d",%01)("e",%02)("f",%03); (i.e., "a" will be replaced by "d", "b" by "e", and "c" by "f")
- if the number of nodes is not the same in both sides, NODES ARE NOT CO-INDEXED
- ("a")("b")("c"):=("d")("e"); is the same as ("a",%01)("b",%02)("c",%03):=("d",%04)("e",%05); (i.e., "a", "b" and "c" will be deleted, and "d" and "e" will be created)
In order to avoid ambiguities, it is highly recommended that indexes be replaced by user-defined labels made of any sequence of alphabetic characters and underscores:
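The automatic indexation described above can be summarised with a short Python sketch. The function name auto_index is an assumption made for this illustration; it only reproduces the numbering shown in the two examples and is not taken from the actual implementation.

```python
def auto_index(n_left, n_right):
    """Return the automatic indexes for the left and right sides of a rule.

    Same number of nodes on both sides: nodes are co-indexed by position.
    Different number of nodes: the right side receives fresh indexes, so
    every left node is deleted and every right node is created.
    """
    left = [f"%{i:02d}" for i in range(1, n_left + 1)]
    if n_left == n_right:
        right = list(left)
    else:
        right = [f"%{i:02d}" for i in range(n_left + 1, n_left + n_right + 1)]
    return left, right


# ("a")("b")("c"):=("d")("e")("f");  three nodes on each side
print(auto_index(3, 3))

# ("a")("b")("c"):=("d")("e");  three nodes against two
print(auto_index(3, 2))
```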
- (A,%a)(B,%b):=(C,%a)(D,%b);
Numeric characters cannot be used in user-defined indexes, so a rule such as the following should be avoided:
(A,%03)(B,%05):=(C,%03)(D,%05);
Indexes may also be used to transfer attribute values expressed in the format ATTRIBUTE=VALUE:
- (A,%a,ATT1=VAL1)(B,%b):=()(B,ATT1=%a); (the value "VAL1" of "ATT1" of %a is copied to the node %b)
Common mistakes
- "Mr":="Mister"; is wrong. Conditions and actions must always come between parentheses: ("Mr"):=("Mister");
- (Mr):=(Mister); is wrong. Constants must come between quotes (inside the parentheses): ("Mr"):=("Mister");
- ("Mr"):=("Mister") is wrong. Rules must end in a semicolon: ("Mr"):=("Mister");
- ("I am"):=("I'm"); is wrong. Each separate word form must be isolated between parentheses and described as a different condition: ("I")(BLK)("am"):=("I'm");
- ("a",ART)(BLK)(VOW):=("an"); turns "a adjective" into "an": the blank and the following form are deleted because they are not present on the right side.
- ("de",PRE)(BLK)(VOW):=("d'")(VOW); turns "de avoir" into "d' ": co-indexation is based on ordering and not on features. The third form is deleted because it is not present on the right side; the second form, which is BLK, receives the feature VOW.
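Some of these mistakes are purely formal and can be caught mechanically. The Python sketch below is a hypothetical checker written for this page (check_rule is not an existing tool); it covers only the missing semicolon, the missing parentheses and quoted strings that contain more than one word form, and it assumes that quoted strings contain no escaped quotes.

```python
import re


def check_rule(rule):
    """Return a list of formal problems found in an L-rule string."""
    problems = []
    text = rule.strip()
    if not text.endswith(";"):
        problems.append("rule must end in a semicolon")
    body = text.rstrip(";")
    if ":=" not in body:
        problems.append('missing ":=" between condition and action')
        return problems
    condition, action = (side.strip() for side in body.split(":=", 1))
    for name, side in (("condition", condition), ("action", action)):
        if not (side.startswith("(") and side.endswith(")")):
            problems.append(name + " must come between parentheses")
    # Quoted constants are extracted pairwise, so the space in "( )" is ignored.
    for constant in re.findall(r'"([^"]*)"', body):
        if re.search(r"\s", constant):
            problems.append("each word form must be a separate node")
            break
    return problems


print(check_rule('("Mr"):=("Mister");'))   # no problems reported
print(check_rule('"Mr":="Mister";'))       # parentheses are missing
print(check_rule('("I am"):=("I\'m");'))   # two word forms in one node
```

Unquoted constants such as (Mr) are deliberately not checked, because they cannot be told apart from feature names without the tagset.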
Formal syntax
L-rules comply with the following formal syntax:
<L-RULE> ::= ( "("<CONDITION>")" )+ ":=" ( "("<ACTION>")" )+ ";"
<CONDITION> ::= """<STRING>""" ("," <TAGLIST> )* | "["<STRING>"]" ("," <TAGLIST> )* | <TAGLIST>
<ACTION> ::= (<INDEX>)? ( <AFFIXATION> ("," <AFFIXATION>)* )* ( <ATT_CHANGE> ("," <ATT_CHANGE>)* )*
<AFFIXATION> ::= <PREFIXATION> | <SUFFIXATION> | <INFIXATION> | <REPLACEMENT> (cf. A-rule)
<ATT_CHANGE> ::= { "+" | "-" } <TAG>
<TAGLIST> ::= <INDEX> | (<INDEX> ",")? <TAG> ("," <TAG>)*
<INDEX> ::= "%"[01..99]
<TAG> ::= {one of the tags defined in the UNDLF Tagset}
<STRING> ::= [a-Z]+
<INTEGER> ::= [0-9]+
where
<a> = a is a non-terminal symbol
"a" = a is a constant
a | b = a or b
{ a | b } = either a or b
(a)? = a can occur 0 or 1 time
(a)* = a can be repeated 0 or more times
(a)+ = a can be repeated 1 or more times