L-rule

From UNL Wiki

Revision as of 14:56, 21 March 2010

A Ph-rule (phonetic rule) is the formalism used for generating spelling changes in the UNLarium framework.

When to use Ph-rules

Ph-rules are used for generating sound changes that produce spelling changes (such as "a">"an", "I am">"I'm", etc.). They are also used to generate spelling conventions, such as the use of capital letters and punctuation marks.

When not to use Ph-rules

Ph-rules are not to be used for sound changes that do not affect spelling.

Syntax

The general syntax for Ph-rules is the following:

(CONDITION) := (ACTION);

Where:

CONDITION is a single form or a sequence of forms over which actions will take place; and
ACTION is the action to be performed over each form or sequence of forms of the CONDITION.

CONDITION and ACTION may be expressed as:

a character or string of characters, between quotes: ("a");
a tag or list of tags, extracted from the UNDL Foundation tagset: (VOW);
a combination of characters and tags: ("a",PRE);

Examples:

("Mr."):=("Mister"); (replace "Mr." by "Mister")
("doctor"):=("dr."); (replace "doctor" by "dr.")

Conditions and actions must always come between parentheses:

("Mr."):=("Mister");
"Mr.":="Mister"; (WRONG: the condition and the action are not between parentheses)

Context-sensitiveness

Ph-rules are normally sensitive to the context and apply over a set of conditions rather than over isolated word forms. In this case, each separate word form must be isolated between parentheses and described as a different condition.

("I am"):=("I'm"); (WRONG: "I" and "am" are separate word forms)
("I")(BLK)("am"):=("I'm");

Types of Ph-rules

There are basically three types of Ph-rules:

REPLACEMENT, when the number of parentheses in the CONDITION field is equal to the number of parentheses in the ACTION field;
ADDITION, when the number of parentheses in the CONDITION field is lower than the number of parentheses in the ACTION field;
DELETION, when the number of parentheses in the CONDITION field is greater than the number of parentheses in the ACTION field.

Parentheses are automatically co-indexed between the CONDITION and the ACTION field, so that the first pair of parentheses of the CONDITION field corresponds to the first pair of parentheses of the ACTION field, and so on. This means that parentheses must be repeated on the right side of a Ph-rule if they are not expected to be deleted. In order to control the process of adding, deleting and reordering parentheses, they must be referred to by the index "%N", where N is the order of appearance on the left side.

Examples:

- ("a")("b")("c"):=("d")("e")("f"); ("a" will be replaced by "d"; "b" by "e"; and "c" by "f")
- ("a")("b")("c"):=("d")()(); ("a" will be replaced by "d"; "b" and "c" will be preserved)
- ("a")("b")("c"):=("d")("")(""); ("a" will be replaced by "d"; "b" and "c" will be replaced by "")
- ("a")("b")("c"):=("d")(); ("a" will be replaced by "d"; "b" will be preserved; "c" will be deleted)
- ("a")("b")("c"):=("d"); ("a" will be replaced by "d"; "b" and "c" will be deleted)
- ("a")("b")("c"):=(%03)(%02)(%01); ("a", "b" and "c" will be preserved, but reordered: ("c")("b")("a"))
- ("a")("b")("c"):=("d")(%03); ("a" will be replaced by "d"; "b" will be deleted; "c" will be preserved)
- ("a")("b")("c"):=("d")("g")()(); ("a" will be replaced by "d"; "b" will be replaced by "g"; "c" will be preserved; and a new form will be generated after it)
- ("a")("b")("c"):=("d")("g")(%02)(%03); ("a" will be replaced by "d"; "g" will be generated after it; and then "b" and "c", which will be preserved)
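
The co-indexing behaviour listed above can be mimicked in a few lines of Python. This is only an illustrative sketch, not the UNLarium implementation: the function name apply_action and the list-based representation (None standing for an empty pair of parentheses) are assumptions made for the example.

```python
import re

def apply_action(forms, action):
    """Apply the ACTION side of a Ph-rule to the forms matched by the CONDITION.

    forms  -- strings matched by the condition, one per pair of parentheses
    action -- one entry per pair of parentheses on the right side:
              None     stands for "()"  (keep the co-indexed form)
              "%N"     copies the N-th matched form
              other    strings are used literally
    """
    result = []
    for position, slot in enumerate(action):
        if slot is None:
            # "()" keeps the co-indexed form; past the end of the condition
            # it is assumed to create an empty new form.
            result.append(forms[position] if position < len(forms) else "")
        elif re.fullmatch(r"%\d+", slot):
            result.append(forms[int(slot[1:]) - 1])
        else:
            result.append(slot)
    # Matched forms with no counterpart in the action are dropped (deletion).
    return result

matched = ["a", "b", "c"]
print(apply_action(matched, ["d", None, None]))      # ['d', 'b', 'c']
print(apply_action(matched, ["%03", "%02", "%01"]))  # ['c', 'b', 'a']
print(apply_action(matched, ["d", "%03"]))           # ['d', 'c']
```

A shorter action list deletes the trailing forms, a longer one adds forms, and an equal-length one replaces them, which corresponds to the three rule types described above.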

Examples

CASE	RULE	BEHAVIOUR	BEFORE	AFTER
Dissimilation	("a",ART)(BLK)(VOW):=("an")()();	replace the article "a" by "an" before a blank space and a vowel	a adjective	an adjective
Crasis	("a",PRE)(BLK)("a",ART):=("à",ART,CTC);	replace the preposition "a" in front of blank and "a" by "à"; add the features ART (article) and CTC (contraction); and delete the blank and the second "a"	a a	à = (PRE,ART,CTC)
Contraction	("de",PRE)(BLK)("le",ART):=("du",ART,CTC);	replace the preposition "de" in front of blank and "le" by "du"; add the features ART and CTC; and delete the blank and "le"	de le	du = (PRE,ART,CTC)
Epenthesis	("a",VER)(BLK)("il",PPR):=()("-t-",-BLK)();	replace the blank space between the verb "a" and the pronoun "il" by "-t-"	a il	a-t-il
Elision	("de",PRE)(BLK)(VOW):=("d'")(%3);	replace the preposition "de" before a blank space and a vowel by "d'" and delete the blank space	de avoir	d'avoir
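
As a rough illustration of how a context-sensitive rule such as the "a" > "an" case in the table is matched against a sequence of forms, the following Python sketch slides a three-form window over a phrase. The (text, tags) representation, the function names, and the reading of VOW as "a form that starts with a vowel letter" are assumptions made for the example only.

```python
VOWELS = set("aeiouAEIOU")

def a_to_an(forms):
    """Mimic ("a",ART)(BLK)(VOW):=("an")()(); over a list of (text, tags) pairs."""
    out = list(forms)
    for i in range(len(out) - 2):
        (first, first_tags), (second, _), (third, _) = out[i], out[i + 1], out[i + 2]
        if (first == "a" and "ART" in first_tags   # ("a",ART)
                and second == " "                  # (BLK)
                and third[:1] in VOWELS):          # (VOW), assumed: starts with a vowel
            out[i] = ("an", first_tags)            # only the first form is replaced
    return out

def text_of(forms):
    """Join the forms back into a plain string."""
    return "".join(text for text, _ in forms)

phrase = [("a", {"ART"}), (" ", {"BLK"}), ("adjective", {"ADJ"})]
print(text_of(a_to_an(phrase)))  # an adjective
```

The blank and the following form are left untouched, as the two empty pairs of parentheses on the right side of the rule require.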

Observations

Rules will only be applied if all conditions are true: X:="y"<"z"; ("zabc" changes to "yabc", but "abc" remains "abc" since there is no "z" to be replaced)
String fields are necessarily continuous: X:="aaa"<"xyz"; ("xyzbbb" changes to "aaabbb", but "bxbybz" remains "bxbybz" since there is no continuous string "xyz" to be replaced)
Each action is applied only once (i.e., rules are not exhaustive): PLR:=0>"s"; ("X" becomes "Xs", and not "Xssssss...")
The replacement rule applies only once to the same string: X:="a":"b"; ("aaa" becomes "baa" and not "bbb")
In prefixation and suffixation rules, the part to be deleted may be represented by the number of characters (without quotes):

PLR:="X"<"";	=	PLR:="X"<0;	(ABC becomes XABC)
PLR:="X"<"A";	=	PLR:="X"<1;	(ABC becomes XBC)
PLR:="XY"<"AB";	=	PLR:="XY"<2;	(ABC becomes XYC)
PLR:="">"X";	=	PLR:=0>"X";	(ABC becomes ABCX)
PLR:="C">"X";	=	PLR:=1>"X";	(ABC becomes ABX)
PLR:="BC">"XY";	=	PLR:=2>"XY";	(ABC becomes AXY)
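
The equivalence between a quoted string and a character count can be checked with a small Python sketch. The function names prefixation and suffixation are hypothetical; the code only mirrors the behaviour described in the observations above and is not taken from any UNL tool.

```python
def prefixation(word, added, deleted=0):
    """ "ADDED" < DELETED : remove DELETED from the start, then prepend ADDED.

    DELETED is either a string (which must be present) or a number of characters.
    """
    if isinstance(deleted, int):
        return added + word[deleted:]
    if not word.startswith(deleted):
        return word  # the rule does not apply
    return added + word[len(deleted):]

def suffixation(word, added, deleted=0):
    """DELETED > "ADDED" : remove DELETED from the end, then append ADDED."""
    if isinstance(deleted, int):
        return word[:len(word) - deleted] + added
    if not word.endswith(deleted):
        return word  # the rule does not apply
    return word[:len(word) - len(deleted)] + added

print(prefixation("ABC", "XY", "AB"), prefixation("ABC", "XY", 2))  # XYC XYC
print(suffixation("ABC", "XY", "BC"), suffixation("ABC", "XY", 2))  # AXY AXY
```

Note that the string variant doubles as a condition (nothing happens when the string is absent), whereas the numeric variant in this sketch always applies.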

In infixation rules, the position of the addition may be made with reference to the end of string by using "-".

RULE	BEHAVIOR	BEFORE	AFTER
X:=[1]>"y";	if X add "y" to the right of the first character	abc	aybc
X:=[-1]>"y";	if X add "y" to the right of the last character	abc	abyc
X:="y"<[2];	if X add "y" to the left of the second character	abcde	aybcde
X:="y"<[-2];	if X add "y" to the left of the second-to-last character	abcde	abcyde
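
Position-based infixation can be sketched in Python with string slicing. The function names are hypothetical, and the sketch assumes that a negative position in the "<" form counts characters from the end of the string (so -2 is the second-to-last character); negative positions with ">" are deliberately left out.

```python
def infix_left_of(word, added, position):
    """ "ADDED" < [N] : insert ADDED to the left of the N-th character.

    A negative N is assumed to count from the end of the string.
    """
    index = position - 1 if position > 0 else len(word) + position
    return word[:index] + added + word[index:]

def infix_right_of(word, added, position):
    """[N] > "ADDED" : insert ADDED to the right of the N-th character (N > 0)."""
    if position <= 0:
        raise ValueError("this sketch only handles positive positions")
    return word[:position] + added + word[position:]

print(infix_right_of("abc", "y", 1))    # aybc
print(infix_left_of("abcde", "y", -2))  # abcyde
```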

In replacement rules, the part to be deleted may be omitted if the whole string is to be replaced:

PLR:="ABC":"XYZ";

=

PLR:="XYZ";

(ABC becomes XYZ)

In replacement rules, the part to be deleted may be represented by an interval of characters in the format [beginning-end]:

PLR:="B":"X";

=

PLR:=[2-2]:"X";

(ABC becomes AXC)
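
The two ways of naming the replaced part, by string or by interval, can be compared in a short Python sketch. The function names are hypothetical; the interval is taken to be 1-based and inclusive, which is what the [2-2] example above suggests.

```python
def replace_first(word, old, new):
    """ "OLD":"NEW" : replace only the first occurrence of OLD."""
    return word.replace(old, new, 1)

def replace_interval(word, beginning, end, new):
    """[beginning-end]:"NEW" : replace the characters in a 1-based, inclusive interval."""
    return word[:beginning - 1] + new + word[end:]

print(replace_first("ABC", "B", "X"))       # AXC
print(replace_interval("ABC", 2, 2, "X"))   # AXC
print(replace_first("aaa", "a", "b"))       # baa
```

The last call shows the non-exhaustive behaviour: only the first "a" is replaced.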

The symbol "^" is used for negation ("^MCL" means "not MCL"): NOU&^MCL:="x":"y"; (If NOU and not MCL then replace "x" by "y")
"<<" and ">>" add blank spaces [1]: X:="a"<<0; ("bc" becomes "a bc" and not "abc")

Common mistakes

nou:="y"<"z"; (WRONG: Tags are case sensitive)
NNN:="y"<"z"; (WRONG: NNN is not defined in the tagset)
NOUFEM:="y"<"z"; (WRONG: Tags must be separated by "&")
NOU,FEM:="y"<"z"; (WRONG: Tags must be separated by "&")
NOU & FEM:="y"<"z"; (WRONG: There can be no blank spaces between tags)
X:=1<1; (WRONG: The left side must always be a string in a prefixation rule)
X:=1>1; (WRONG: The right side must always be a string in a suffixation rule)
X:=1; (WRONG: Replacement rules do not allow for numbers)
X:=1:1; (WRONG: Replacement rules do not allow for numbers)
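
Several of the condition-side mistakes above can be caught mechanically. The Python sketch below is a hypothetical checker, not part of any UNL tool: TAGSET is a made-up subset standing in for the real tagset, and tags are assumed to consist of upper-case letters and digits only.

```python
import re

# Hypothetical subset of the tagset, for illustration only.
TAGSET = {"NOU", "FEM", "MCL", "PLR", "ORD"}

def check_condition(condition):
    """Return a list of problems found in the CONDITION part of an a-rule."""
    problems = []
    if " " in condition:
        problems.append("blank spaces are not allowed between tags")
    for tag in condition.replace(" ", "").split("&"):
        name = tag.lstrip("^")  # "^" marks negation
        if not re.fullmatch(r"[A-Z0-9]+", name):
            problems.append(f"malformed tag: {name!r}")
        elif name not in TAGSET:
            problems.append(f"unknown tag: {name!r}")
    return problems

for condition in ["NOU&^MCL", "nou", "NNN", "NOUFEM", "NOU,FEM", "NOU & FEM"]:
    print(condition, check_condition(condition))
```

Only the first condition in the loop comes back with an empty list of problems.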

Complex a-rules

Complex a-rules are formed from the combination of simple a-rules:

circumfixation (prefixation + suffixation), to add a prefix and a suffix at the same time
prefixation + infixation, to add a prefix and an infix at the same time
infixation + suffixation, to add an infix and a suffix at the same time
prefixation + infixation + suffixation, to add a prefix, an infix and a suffix at the same time

Syntax

Complex a-rules are formed by concatenating simple a-rules with ",":

circumfixation

CONDITION := "ADDED" < DELETED , DELETED > "ADDED";

prefixation + infixation

CONDITION := "ADDED" < DELETED , DELETED > "ADDED";

infixation + suffixation

CONDITION := DELETED > "ADDED" , "DELETED" > "ADDED";

etc.

Examples

Complex a-rules
RULE	BEHAVIOR	BEFORE	AFTER
X:="x"<0, 0>"y";	if X add "x" to the beginning and "y" to the end of the string	A	xAy
X:="x"<0, "A":"y";	if X add "x" to the beginning and replace "A" by "y"	ABC	xyBC
X:="A":"y", 0>"x";	if X replace "A" by "y" and add "x" to the end of the string	ABC	yBCx
X:="x"<0, "A":"y", 0>"z";	if X add "x" to the beginning, replace "A" by "y" and add "z" to the end of the string	ABC	xyBCz

Observations

Complex a-rules are also used to integrate different simple a-rules:

ORD:="1">"1st";
ORD:="2">"2nd";
ORD:="3">"3rd";

=

ORD:="1">"1st", "2">"2nd", "3">"3rd";

Actions are applied from left to right (i.e., order is important):

PLR:="s">"ses", "y">"ies"; (kiss > kisses, city > cities)
PLR:="y">"ies", "s">"ses"; (kiss > kisses, city > cities > citieses)
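
The effect of action order can be reproduced with a minimal Python sketch in which each action of a complex rule is tried once, in sequence, on the output of the previous one. The function name and the (deleted, added) pair representation are assumptions made for the example.

```python
def apply_suffix_actions(word, actions):
    """Apply a sequence of DELETED > "ADDED" actions from left to right.

    Each action is tried exactly once, on the result of the previous action.
    """
    for deleted, added in actions:
        if word.endswith(deleted):
            word = word[:len(word) - len(deleted)] + added
    return word

s_first = [("s", "ses"), ("y", "ies")]
y_first = [("y", "ies"), ("s", "ses")]

print(apply_suffix_actions("city", s_first))  # cities
print(apply_suffix_actions("city", y_first))  # citieses
```

With the "y" action first, its output "cities" ends in "s" and is therefore picked up by the second action, which is the over-application shown above.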

Formal syntax

A-rules comply with the following syntax:

<A-RULE>           ::= <CONDITION> ":=" <ACTION> ("," <ACTION>)* ";"
<CONDITION>        ::= <ATAG> ("&" ("^")? <ATAG>)*
<ATAG>             ::= {one of the tags defined in the UNDLF Tagset}
<ACTION>           ::= <PREFIXATION> | <SUFFIXATION> | <INFIXATION> | <REPLACEMENT>
<PREFIXATION>      ::= <ADDED> {"<" | "<<"} (<DELETED>)?
<SUFFIXATION>      ::= (<DELETED>)? {">" | ">>"} <ADDED>
<INFIXATION>       ::= "[" <DELETED> "]" ">" <ADDED> | <ADDED> "<" "[" <DELETED> "]"
<REPLACEMENT>      ::= ( <STRING> ":" )? <ADDED> | "[" <INTEGER> "-" <INTEGER> "]" ":" <ADDED>
<ADDED>            ::= <STRING>
<DELETED>          ::= <STRING> | <INTEGER>
<STRING>           ::= """ [a..Z]+ """
<INTEGER>          ::= [0..9]+

where

<a> = a is a non-terminal symbol
"a" = a is a constant
a | b = a or b
{ a | b } = either a or b
(a)? = a can occur 0 or 1 time
(a)* = a can be repeated 0 or more times
(a)+ = a can be repeated 1 or more times
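
To make the grammar more concrete, the Python sketch below classifies each action of an a-rule with regular expressions. It is a loose approximation rather than a faithful parser: it assumes straight double quotes, accepts any characters (including none) inside a string, allows a negative position inside square brackets as in the infixation examples, and splits actions on commas, so a comma inside a quoted string would break it. All names are hypothetical.

```python
import re

STRING = r'"[^"]*"'
INT = r'\d+'

# Patterns are tried in order; the first full match wins.
PATTERNS = [
    ("infixation", rf'\[-?{INT}\]\s*>\s*{STRING}|{STRING}\s*<\s*\[-?{INT}\]'),
    ("prefixation", rf'{STRING}\s*<<?\s*(?:{STRING}|{INT})?'),
    ("suffixation", rf'(?:{STRING}|{INT})?\s*>>?\s*{STRING}'),
    ("replacement", rf'(?:{STRING}\s*:\s*)?{STRING}|\[{INT}-{INT}\]\s*:\s*{STRING}'),
]

def classify(action):
    """Return the type of a single action, or None if it fits no pattern."""
    for name, pattern in PATTERNS:
        if re.fullmatch(pattern, action.strip()):
            return name
    return None

def parse_rule(rule):
    """Split 'CONDITION := ACTION, ACTION;' into (condition, [action types])."""
    rule = rule.strip()
    if not rule.endswith(";"):
        raise ValueError("a rule must end with ';'")
    condition, separator, actions = rule[:-1].partition(":=")
    if not separator:
        raise ValueError("a rule must contain ':='")
    return condition.strip(), [classify(a) for a in actions.split(",")]

print(parse_rule('X:="x"<0, "A":"y", 0>"z";'))
print(parse_rule('X:=1<1;'))
```

An action that comes back as None, as in the second call, corresponds to the kind of error listed under common mistakes (a number where a string is required).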

Notes

[1] This feature is not supported by UNLdev and is automatically replaced, in the UNLarium, by a blank space.
