Multiword expression
Multiword expressions (MTW) are lexical structures made up of a sequence of two or more lexemes. Their parts can be concatenated ("darkroom", "skinhead"), joined by hyphens ("blue-green", "African-American") or separated by blank spaces ("round table", "part of speech"). Multiword expressions can be continuous ("get over") or discontinuous ("get <something> together"). They correspond to compounds ("fireman", "hardware"), phrases ("in spite of", "take into account"), idioms ("kick the bucket", "play cat and mouse"), fragments of sentences ("and so on", "whatever the case") or whole sentences ("Every evil is followed by some good", "No flies enter a mouth that is shut").
Multiword expressions may also include acronyms (such as "UNESCO"), multiple-word contractions (such as "don't") and blends (such as "sitcom") that are still analysable, unlike "radar" and "motel", which are represented as simple words. Classical compounds ("agriculture", "photograph") and their derivations ("agricultural", "photographically") are treated as simple words if they do not include more than one free morpheme. Phrasal verbs ("give in", "come across") are treated as multiword expressions.
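The written forms described above (concatenated, hyphenated, spaced; continuous or discontinuous) can be illustrated with a minimal Python sketch. It is not part of the UNLarium: the function name `describe_surface_form`, the returned labels, and the convention of marking an open slot with angle brackets are all assumptions made for illustration. The check is purely orthographic, so it cannot tell a concatenated compound such as "darkroom" from a simple word; that decision still requires lexical knowledge.

```python
import re

# Hypothetical slot notation: an open position written as "<something>".
SLOT = re.compile(r"<[^<>]+>")


def describe_surface_form(expression: str) -> dict:
    """Describe how the parts of an expression are joined in writing.

    Only the spelling is inspected; no dictionary lookup is performed.
    """
    has_slot = bool(SLOT.search(expression))
    if " " in expression:
        joining = "spaces"
    elif "-" in expression:
        joining = "hyphens"
    else:
        joining = "concatenated"
    return {
        "joining": joining,
        "continuity": "discontinuous" if has_slot else "continuous",
    }


if __name__ == "__main__":
    for item in ("darkroom", "blue-green", "round table", "get <something> together"):
        print(item, describe_surface_form(item))
```

An expression that mixes separators is reported by its first matching test, so a form containing both spaces and hyphens is labelled "spaces".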
How to treat multiword expressions in the UNLarium
- Lemma
- The lemma of the multiword expression is the multiword expression itself.
- Base form
- Composition rules
- Inflectional paradigm
- Subcategorization frame