C-rule
From UNL Wiki
Revision as of 13:35, 23 March 2010
Compounding, or composition, is the word-formation process of creating compounds by combining lexemes.
Expressing compounds in the UNLarium
In the UNLarium framework, compounds are treated as ordinary simple words, except for discontinuous multi-word expressions and multi-word expressions with infixation (such as "give in" or "take into account"). In these cases, the lemma differs from the base form (BF), and the compound-formation process is expected to be defined through specific rules:
- coffee house (multi-word expression without infixation: "coffee house" > "coffee houses"): BF=lemma="coffee house"
- give in (multi-word expression with infixation: "give in" > "gave in"): BF="give" ≠ lemma="give in"
- behind one's back (discontinuous multi-word expression without infixation: "behind my back", "behind his back", etc.): BF="behind" ≠ lemma="behind <person>'s back"
- take into account (discontinuous multi-word LRU with infixation: "take it into account", "took that into account"): BF="take" ≠ lemma="take into account"
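The four cases above reduce to one test: a compound rule is expected whenever the lemma differs from the base form. The following Python sketch illustrates that test only. The names `Entry` and `needs_compound_rule` are hypothetical and are not part of the UNLarium.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """A toy dictionary entry: the lemma and its base form (BF)."""
    lemma: str
    base_form: str

    def needs_compound_rule(self) -> bool:
        # Assumption drawn from the text above: a rule is expected
        # only when the lemma is different from the base form.
        return self.lemma != self.base_form


# The four examples listed above.
entries = [
    Entry("coffee house", "coffee house"),
    Entry("give in", "give"),
    Entry("behind <person>'s back", "behind"),
    Entry("take into account", "take"),
]

for entry in entries:
    status = "rule expected" if entry.needs_compound_rule() else "no rule"
    print(f"{entry.lemma}: {status}")
```

Only "coffee house" is reported as needing no rule, because its base form and lemma coincide.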
Examples
Lemma | BF | Compound | Description |
---|---|---|---|
give in | give | VH([in]) | "in" is to be added to the base form as part of the head of the verb (VH) |
take into account | take | VA("into account") | "into account" is to be added to the base form as an adjunct to the verb (VA) |
throw <person> to the lions | throw | VA("to the lions"), VC(NP) | "to the lions" is to be added to the base form as an adjunct to the verb (VA) and a noun phrase (NP) is to be added as a complement to the verb (VC) |
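The notation in the Compound column can be read mechanically. The Python sketch below splits such an expression into its operations, assuming only the three argument shapes that appear in the table: a lemma in square brackets, a string in double quotes, and a bare category symbol such as NP. The names `Operation` and `parse_compound` are hypothetical. This is not the UNLarium's own parser and it does not cover the full formalism.

```python
import re
from typing import List, NamedTuple


class Operation(NamedTuple):
    role: str   # e.g. "VH", "VA", "VC"
    kind: str   # "lemma", "string", or "category"
    value: str  # e.g. "in", "into account", "NP"


# One ROLE(argument) call: [x] is a lemma, "x" is a string,
# and a bare upper-case symbol is a syntactic category.
_OP = re.compile(r'\s*([A-Z]+)\(\s*(?:\[([^\]]+)\]|"([^"]*)"|([A-Z]+))\s*\)\s*')


def parse_compound(rule: str) -> List[Operation]:
    """Parse a comma-separated compound expression into operations."""
    text = rule.strip().rstrip(";")
    ops = []
    pos = 0
    while pos < len(text):
        match = _OP.match(text, pos)
        if match is None:
            raise ValueError(f"unrecognised operation at: {text[pos:]!r}")
        role, lemma, string, category = match.groups()
        if lemma is not None:
            ops.append(Operation(role, "lemma", lemma))
        elif string is not None:
            ops.append(Operation(role, "string", string))
        else:
            ops.append(Operation(role, "category", category))
        pos = match.end()
        if pos < len(text):
            if text[pos] != ",":
                raise ValueError(f"expected ',' at: {text[pos:]!r}")
            pos += 1  # skip the separator between operations
    return ops


for op in parse_compound('VA("to the lions"), VC(NP)'):
    print(op.role, op.kind, op.value)
```

The scanner advances match by match rather than splitting on commas, so a comma inside a quoted string would not break the parse.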
Observation
- Phrasal verbs
- Particles of phrasal verbs must be represented as part of the head if they are non-separable, and as adjuncts otherwise:
- give in = VH([in]); ("give in something" but not "give something in")
- give back = VA([back]); ("give back something" or "give something back")
- Strings and lemmas
- In the compound-formation process, the UNLarium distinguishes between strings (written between double quotes, " ") and lemmas (written between square brackets, [ ]). The two differ in dictionary status: lemmas, but not strings, are expected to be defined as dictionary entries:
- VA("into account"); (add the string "into account" as a verbal adjunct, take > take into account)
- VC([love]); (add the lemma "love" as a verbal complement, such as in make > make love)
In the examples above, "into account" is unlikely to exist as a single dictionary entry, whereas "love" is probably already there.
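The practical consequence of this distinction can be sketched as a consistency check: bracketed lemmas should resolve to dictionary entries, while quoted strings are taken literally and need no entry. In the Python sketch below, the function `missing_lemmas` and the contents of `toy_dictionary` are hypothetical illustrations, not part of the UNLarium.

```python
def missing_lemmas(arguments, dictionary):
    """Return the lemma arguments ([...]) that have no dictionary entry.

    String arguments ("...") are skipped, since strings are not
    expected to be defined as dictionary entries.
    """
    missing = []
    for arg in arguments:
        if arg.startswith("[") and arg.endswith("]"):
            lemma = arg[1:-1]
            if lemma not in dictionary:
                missing.append(lemma)
        elif arg.startswith('"') and arg.endswith('"'):
            continue  # a string: no dictionary lookup required
        else:
            raise ValueError(f"neither a string nor a lemma: {arg!r}")
    return missing


# Hypothetical dictionary contents, for illustration only.
toy_dictionary = {"make", "love", "take", "give", "in"}

print(missing_lemmas(['"into account"', "[love]"], toy_dictionary))  # prints []
print(missing_lemmas(["[back]"], toy_dictionary))  # prints ['back']
```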
Syntax
Compounds may be explicitly expressed by S-rules, a formalism for describing the syntactic structure of phrases.