CPWR
Revision as of 18:59, 8 December 2011
Composition rules (CPWR) are used to generate compounds out of the base form.
When to use composition rules
Composition rules must be created when and only when the base form is different from the lemma.
This happens only with the following kinds of multiword expressions:
- when inflections are formed by infixation (as opposed to simple suffixation or prefixation); or
- when the multiword expression is discontinuous.
For instance:
The English multiword expression "call for" has the following inflections: "call for", "calls for", "called for", "calling for", etc. These inflections are formed by infixation, in the sense that they apply in the middle of the expression (between "call" and "for"). If we simply associate this expression with the inflectional paradigm of "call", we get the following results: "call for", "call fors", "call fored", "call foring", etc. To prevent this problem, and to avoid the unnecessary proliferation of rules in the grammar, we split the multiword expression into two segments: the base form (BF), i.e., the term to which the inflections are directly applied; and the composition rule (CPWR), which is the rule used to rebuild the lemma out of the base form.
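The contrast described above can be sketched in a few lines of Python. This is only an illustration: plain string concatenation stands in for a real inflectional paradigm, the function names are hypothetical, and the added term is assumed to follow the base form directly.

```python
def inflect_whole_expression(expression: str, suffix: str) -> str:
    """Attach the suffix to the end of the whole expression (the unwanted result)."""
    return expression + suffix


def inflect_base_form(base_form: str, added: str, suffix: str) -> str:
    """Inflect only the base form, then re-attach the added term after it."""
    return f"{base_form}{suffix} {added}"


# Compare both strategies for the suffixes mentioned in the text.
for suffix in ("", "s", "ed", "ing"):
    wrong = inflect_whole_expression("call for", suffix)
    right = inflect_base_form("call", "for", suffix)
    print(f"{wrong!r:15} vs {right!r}")
```

With the base form "call" and the added term "for", the suffix lands on the verb rather than on the preposition, which is the behaviour the BF/CPWR split is meant to guarantee.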
When not to use composition rules
Composition rules must not be used in the following circumstances:
- when the word is not a multiword expression; or
- when the inflections of the multiword expression are formed by prefixation or suffixation (such as in "call center" > "call centers").
How to create a composition rule
The syntax for composition rules is the following:
<SYNTACTIC ROLE>(<ADDED>,<FEATURES>);
Where:
- <SYNTACTIC ROLE> is the syntactic role (VA, VC, VS, VH, etc.) of the term to be added to the base form;
- <ADDED> is the term to be added to the base form to form the compound. It must be written between [brackets] if it is a lemma (i.e., if it is an entry in the dictionary), or between "quotes" if it is a string (i.e., if it is not an entry in the dictionary);
- <FEATURES> are the features of the term to be added to the base form. The following features are mandatory:
  - the lexical category (A,J,N,V,C,P,D) of the term to be added;
  - the inflectional properties (paradigm and/or inflectional rules) of the term to be added;
  - the distribution (i.e., the position) of the term to be added, if not default;
  - the adjacency of the term to be added, if not default.
Examples of composition rules
- CPWR=MTW(VH([in],P,M0);) (add the lemma [in], which is a preposition (P) and invariant (M0), as part of the head of the verbal phrase (VH), as in "give">"give in")
- CPWR=MTW(VA("into account",A,M0);) (add the string "into account", which is an adverb (A) and invariant (M0), as an adjunct to the head of the verbal phrase (VA), as in "take">"take into account")
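To make the syntax concrete, the following Python sketch splits a rule such as the two examples above into its parts. It is a minimal illustration, not part of any UNL tool: the names `CompositionRule` and `parse_cpwr` are hypothetical, and the pattern assumes that features are a flat comma-separated list without nested parentheses.

```python
import re
from typing import List, NamedTuple, Tuple


class CompositionRule(NamedTuple):
    role: str                  # syntactic role, e.g. "VH" or "VA"
    added: str                 # term added to the base form
    is_lemma: bool             # True for [brackets], False for "quotes"
    features: Tuple[str, ...]  # e.g. ("P", "M0")


# One rule: ROLE([lemma] or "string", FEATURE, FEATURE, ...);
RULE = re.compile(
    r'(?P<role>[A-Z]+)\(\s*'
    r'(?:\[(?P<lemma>[^\]]+)\]|"(?P<string>[^"]+)")'
    r'\s*,\s*(?P<features>[^()]*)\)\s*;'
)


def parse_cpwr(text: str) -> List[CompositionRule]:
    """Parse a string such as 'CPWR=MTW(VH([in],P,M0);)' into its rules."""
    wrapper = re.fullmatch(r'\s*CPWR\s*=\s*MTW\((?P<body>.*)\)\s*', text)
    if wrapper is None:
        raise ValueError(f"not a CPWR=MTW(...) expression: {text!r}")
    rules = []
    for match in RULE.finditer(wrapper.group("body")):
        lemma = match.group("lemma")
        features = tuple(
            f.strip() for f in match.group("features").split(",") if f.strip()
        )
        rules.append(CompositionRule(
            role=match.group("role"),
            added=lemma if lemma is not None else match.group("string"),
            is_lemma=lemma is not None,
            features=features,
        ))
    if not rules:
        raise ValueError(f"no composition rule found in: {text!r}")
    return rules


for example in ('CPWR=MTW(VH([in],P,M0);)', 'CPWR=MTW(VA("into account",A,M0);)'):
    print(parse_cpwr(example))
```

Keeping `is_lemma` separate from `added` preserves the distinction between a dictionary entry ([brackets]) and a plain string ("quotes").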
Composition rules in the dictionary
In the UNLarium framework, composition rules may be expressed in two different formats:
- As complex structures, such as
[[sub-NLW][sub-NLW]...[sub-NLW]] {ID} "UW" (ATTR, ..., #01(ATTR, ...), #02(ATTR, ...), ...) <FLG, FRE, PRI>; COMMENTS
- For further information on complex structures inside the dictionary, refer to Dictionary Specs#Complex structures as NLW*
- As simple structures, such as
[NLW] {ID} "UW" (ATTR, ..., BF=<BASE FORM>, CPWR=MTW(<COMPOSITION RULE>)) <FLG, FRE, PRI>; COMMENTS
- Where <COMPOSITION RULE> is the rule or set of rules used to form the lemma out of the base form, and <BASE FORM> is the base form
Examples of dictionary entries containing composition rules
- [[bring] [back]] {12343} "202078294" (pos=VER, #01(IFX(ET0:=4>"ought")), #02(pos=PRE)) <eng, 0, 0>;
- [bring back] {12343} "202078294" (pos=VER, BF=bring, CPWR=MTW(VA([back],A,M0);)) <eng, 0, 0>;
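As a rough check on the simple-structure entry above, the Python sketch below pulls out the headword, the BF value and the term added by the composition rule, then rebuilds the lemma from the last two. The helper names are hypothetical, only a single composition rule is handled, and the added term is assumed to follow the base form directly (default distribution and adjacency are not modelled).

```python
import re

ENTRY = ('[bring back] {12343} "202078294" '
         '(pos=VER, BF=bring, CPWR=MTW(VA([back],A,M0);)) <eng, 0, 0>;')


def read_simple_entry(entry: str) -> dict:
    """Extract the lemma, base form and added term from a simple-structure entry."""
    headword = re.match(r'\s*\[([^\[\]]+)\]', entry)
    base_form = re.search(r'\bBF=([^,()]+)', entry)
    # The added term is either a [lemma] or a "string" right after the role.
    added = re.search(r'CPWR=MTW\([A-Z]+\((?:\[([^\]]+)\]|"([^"]+)")', entry)
    if not (headword and base_form and added):
        raise ValueError("entry is not a simple structure with BF and CPWR")
    return {
        "lemma": headword.group(1),
        "base_form": base_form.group(1).strip(),
        "added": added.group(1) or added.group(2),
    }


def rebuild_lemma(base_form: str, added: str) -> str:
    """Rebuild the lemma by placing the added term after the base form."""
    return f"{base_form} {added}"


fields = read_simple_entry(ENTRY)
print(fields)
print(rebuild_lemma(fields["base_form"], fields["added"]) == fields["lemma"])
```

Comparing the rebuilt lemma with the headword is a cheap consistency test for entries of this shape: if the two differ, either the BF or the composition rule does not match the NLW.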