C-rule

From UNL Wiki

Revision as of 09:35, 26 March 2010

Compounding or composition is the word-formation process of creating compounds by combining or putting together lexemes.

Expressing compounds in the UNLarium

In the UNLarium framework, compounds are treated as ordinary simple words, except for discontinuous multi-word expressions and cases of infixation (such as "give in" or "take into account"). In these cases, the lemma differs from the base form, and the compound-formation process is expected to be defined through S-rules of the following form:

+<SYNTACTIC ROLE>(<ADDED>);

Where:
<SYNTACTIC ROLE> is the syntactic role (VA, VC, VS, VH, etc.) of the term to be added to the base form; and
<ADDED> is the term to be added to the base form to form the compound. It can be a string between "quotes" or a lemma between [brackets].
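The rule format above can be read mechanically. The following Python sketch is only an illustration and not part of the UNLarium: the function name `parse_rule` is hypothetical, role names are assumed to be written in uppercase letters, and only the basic form (one added term per role, without extra features) is handled.

```python
import re

# One term of a rule: ROLE(ADDED). Assumption: role names are uppercase letters.
TERM = re.compile(r'([A-Z]+)\(([^()]*)\)')

def parse_rule(rule):
    """Split a rule such as '+VH([in]);' into (role, added, kind) tuples."""
    body = rule.strip()
    if not (body.startswith("+") and body.endswith(";")):
        raise ValueError("expected a rule of the form +ROLE(ADDED);")
    terms = []
    for role, arg in TERM.findall(body[1:-1]):
        arg = arg.strip()
        if len(arg) >= 2 and arg.startswith('"') and arg.endswith('"'):
            kind = "string"   # a string between "quotes"
        elif len(arg) >= 2 and arg.startswith("[") and arg.endswith("]"):
            kind = "lemma"    # a lemma between [brackets]
        else:
            raise ValueError("added term must be quoted or bracketed: " + arg)
        terms.append((role, arg[1:-1], kind))
    return terms

print(parse_rule('+VH([in]);'))
print(parse_rule('+VA("into account");'))
```

Each tuple records the syntactic role, the added term without its delimiters, and whether the term was given as a string or as a lemma.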

Examples

Lemma | Base Form | Compound | Description
give in | give | +VH([in]) | the lemma "in" is to be added to the base form as part of the head of the verb (VH)
take into account | take | +VA("into account") | the string "into account" is to be added to the base form as an adjunct to the verb (VA)
throw <person> to the lions | throw | +VA("to the lions") | the string "to the lions" is to be added to the base form as an adjunct to the verb (VA)

Observations

Phrasal verbs
Particles of phrasal verbs must be represented as part of the head, if non separable, or as adjuncts, if separable:
  • give in = +VH("in"); ("give in something" but not "give something in")
  • give back = +VA("back"); ("give back something" or "give something back")
General syntactic roles (NP, PP, XP) must not be defined in composition rules but inside the subcategorization frame:
  • throw <person> to the lions = +VA("to the lions"); (and not +VA("to the lions")VC(NP); the lemma should be associated with the transitive frame instead)
"Quotes" or [brackets]?
In the compound-formation process, the UNLarium distinguishes between strings (to be represented between "") and lemmas (to be represented between [ ]). The difference between strings and lemmas has to do with the dictionary status. Lemmas, but not strings, are expected to be defined as dictionary entries:
  • +VA("into account"); (add the string "into account" as a verbal adjunct: take > take into account)
  • +VC([love]); (add the lemma "love" as a verbal complement: make > make love)
In the examples above, "into account" is unlikely to exist as a dictionary entry of its own, whereas "love" probably already does. The distinction matters because the terms of a compound may be modified ("make much love", for instance).
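The distinction between strings and lemmas can be illustrated with a short Python sketch. It is not UNLarium code: the function names and the toy dictionary are hypothetical, and the only rule taken from the text is that lemmas, but not strings, are expected to have a dictionary entry.

```python
def classify(term):
    """Return (kind, text) for a term written as "string" or [lemma]."""
    if len(term) >= 2 and term[0] == '"' and term[-1] == '"':
        return ("string", term[1:-1])
    if len(term) >= 2 and term[0] == "[" and term[-1] == "]":
        return ("lemma", term[1:-1])
    raise ValueError("term must be between quotes or brackets: " + term)

def missing_lemmas(terms, dictionary):
    """List the lemmas that have no entry in the (hypothetical) dictionary."""
    missing = []
    for term in terms:
        kind, text = classify(term)
        # Strings are copied as they are; only lemmas need a dictionary entry.
        if kind == "lemma" and text not in dictionary:
            missing.append(text)
    return missing

toy_dictionary = {"make", "take", "love"}  # assumption: a toy set of entries
print(missing_lemmas(['"into account"', "[love]"], toy_dictionary))
print(missing_lemmas(["[bacon]"], toy_dictionary))
```

A check of this kind would flag a bracketed term that still has to be added to the dictionary, while quoted strings pass through unchecked.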
Complex compounds
Compounds must include as many terms as different syntactic roles:
  • give up the ghost = +VH([up])VC("the ghost"); (and not +VH("up the ghost") or +VC("up the ghost"))
Order is to be represented by the distribution features (">", ">>", "<", "<<", ...), if not default:
  • +VC([love]); (order need not be specified, because in English complements come on the right side by default: make > make love)
  • +NS([the]); (order need not be specified, because in English specifiers come on the left side by default: Netherlands > the Netherlands)
  • +NA(>>,"available"); (order must be specified, because in English nominal adjuncts come on the left side by default: table > new table)
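The decision illustrated by these three examples (specify order only when it departs from the default) can be sketched in Python. The default sides below are taken from the examples only; the names `DEFAULT_SIDE` and `needs_order_feature` are hypothetical, and the sketch does not interpret the individual distribution features.

```python
# Default side of the added term relative to the base form, as stated in the
# English examples: complements on the right, specifiers and nominal adjuncts
# on the left. Other roles are deliberately left out.
DEFAULT_SIDE = {"VC": "right", "NS": "left", "NA": "left"}

def needs_order_feature(role, side):
    """True if a term with this role on this side departs from the default."""
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    return DEFAULT_SIDE[role] != side

print(needs_order_feature("VC", "right"))  # make love
print(needs_order_feature("NS", "left"))   # the Netherlands
print(needs_order_feature("NA", "right"))  # adjunct placed after the noun
```

Only the last call reports that an order feature is required, which matches the rule with "available".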
Adjacency is to be represented by the adjacency features (AJ0,AJ1,AJ2,...), if not default:
  • +VC([love]); (adjacency need not be specified, because in English complements come after the head by default: make > make love)
  • +VH([up])VC("the ghost"); (adjacency need not be specified, because in English head particles come before complements by default: give > give up the ghost)
  • +VA([home],AJ1)VC("the bacon",AJ2); (adjacency must be specified, because in English the complement is normally generated before the adjunct, which would yield "bring the bacon home" rather than "bring home the bacon")
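The effect of the adjacency features on these verbal examples can be sketched as follows. This is a simplified illustration under stated assumptions, not the UNLarium algorithm: AJn is assumed to mean the n-th position after the head, the default order (head particle, then complement, then adjunct) is taken from the explanations above, and the function name `linearize` is hypothetical.

```python
# Default order of terms after a verbal head, as described in the examples:
# head particles (VH) before complements (VC) before adjuncts (VA).
DEFAULT_RANK = {"VH": 0, "VC": 1, "VA": 2}

def linearize(base, terms):
    """Place terms after the base form.

    terms is a list of (role, text, aj), where aj is an integer taken from
    an AJn feature, or None when no adjacency feature is given.
    """
    if all(aj is not None for _, _, aj in terms):
        # Assumption: a lower AJ index means closer to the head.
        ordered = sorted(terms, key=lambda t: t[2])
    else:
        ordered = sorted(terms, key=lambda t: DEFAULT_RANK[t[0]])
    return " ".join([base] + [text for _, text, _ in ordered])

print(linearize("give", [("VH", "up", None), ("VC", "the ghost", None)]))
print(linearize("bring", [("VA", "home", None), ("VC", "the bacon", None)]))
print(linearize("bring", [("VA", "home", 1), ("VC", "the bacon", 2)]))
```

Without adjacency features the second call falls back to the default order, and only the explicit AJ1 and AJ2 in the third call put the adjunct before the complement.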
Software