Grammar
In the UNL framework, a grammar is a set of rules used to generate natural language out of UNL, or UNL out of natural language.
Direction
In the UNL framework, we distinguish between analysis and generation grammars:
- The UNL-NL (Generation) Grammar is used to generate natural language out of UNL
- The NL-UNL (Analysis) Grammar is used to generate UNL out of natural language
Types
Main article: Grammar Specs
In the UNL framework, we distinguish between transformation and disambiguation grammars:
- Transformation Grammar, or T-Grammar, is the set of T-rules, which are used to transform structures[1]
- Disambiguation Grammar, or D-Grammar, is the set of D-rules, which are used to improve the performance of the T-rules
The UNL-NL Grammar and the NL-UNL Grammar consist of two different types of rules:
- T-rules[2], or transformation rules, are used to modify structures. T-rules are further divided into A-rules, L-rules and S-rules
- D-rules, or disambiguation rules, are used to assign priorities
Type | Rule | Description | Example |
---|---|---|---|
D-rule | (ART)(ART)=0; | It's not possible to have an article after another article | |
A-rule | PLR:=0>"s"; | In case of plural (PLR), add "s" to the end of the word | table > tables, boy > boys |
L-rule | ("I")(BLK)("am"):=("I'm"); | In case of "I" followed by a blank space and "am", replace the whole sequence by "I'm" | I am > I'm |
S-rule | MTW:=VA("into account"); | In order to form the multiword expression, add "into account" as an adjunct to the verb (VA). | take > take into account |
Syntax
D-rules are defined by the general syntax:
<CONDITION> = <PRIORITY>;
While T-rules are defined as:
<CONDITION> := <ACTION>;
Both types of rule always end in a semicolon (";"). Special symbols and notation apply in each case. For further information, see D-rules, A-rules, L-rules or S-rules.
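The two general patterns above can be told apart mechanically. The following Python sketch is only an illustration and is not the parser used by the UNLdev: the function name `split_rule` and the returned tuple are assumptions made for this example, and it ignores the special symbols and notation that the full specifications define.

```python
def split_rule(rule: str):
    """Split a rule string into (kind, condition, right-hand side).

    Hypothetical helper: it only looks at the general shape
    <CONDITION> = <PRIORITY>;  or  <CONDITION> := <ACTION>;
    """
    text = rule.strip()
    if not text.endswith(";"):
        raise ValueError("a rule must end in a semicolon")
    body = text[:-1]
    # Check ":=" first, because a T-rule also contains the "=" character.
    if ":=" in body:
        condition, action = body.split(":=", 1)
        return ("T-rule", condition.strip(), action.strip())
    if "=" in body:
        condition, priority = body.split("=", 1)
        return ("D-rule", condition.strip(), priority.strip())
    raise ValueError("no '=' or ':=' separator found")


print(split_rule('(ART)(ART)=0;'))
print(split_rule('PLR:=0>"s";'))
```

Testing for ":=" before "=" matters here, since every T-rule would otherwise be misread as a D-rule.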
When to use D-rules
D-rules are used to assign priorities. They do not make any changes themselves, but only induce or prohibit transformations.
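As a rough illustration of a D-rule that prohibits a sequence, the Python sketch below models (ART)(ART)=0; as a table lookup. It rests on several assumptions: the dictionary `D_RULES`, the function `is_blocked` and the tag "N" are hypothetical names chosen for this example, and only the priority 0 (meaning "not possible", as in the table above) is interpreted.

```python
# Hypothetical encoding of D-rules: a pair of adjacent categories -> priority.
D_RULES = {("ART", "ART"): 0}


def is_blocked(categories, d_rules=D_RULES):
    """Return True if any adjacent pair of categories has priority 0."""
    for pair in zip(categories, categories[1:]):
        if d_rules.get(pair) == 0:
            return True
    return False


# "N" is a placeholder tag used only for this example.
print(is_blocked(["ART", "ART", "N"]))
print(is_blocked(["ART", "N"]))
```

Note that the function only inspects the sequence and never changes it, which mirrors the role of D-rules described above.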
When to use T-rules
T-rules are used for changes, and vary according to the scope of the changes:
- A-rules are used when the transformations apply over isolated forms to generate inflections of the base form. They are used only when the transformations may be expressed by prefixation, infixation or suffixation. In any case, the transformation must affect only the structure of the word; the structure of the phrase is preserved. In that sense, A-rules must never be used when a new word is introduced in the syntactic structure (as in the formation of compounds).
- L-rules are used when the transformations affect a linear sequence of isolated forms. The transformations are rather at the surface level and do not affect the deep structure of the phrase.
- S-rules are used when the transformations affect the structure of the phrase, as in the generation of compounds (including compound tenses and periphrastic constructions). They are also used to describe syntactic behaviour such as word order, agreement and government.
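The difference in scope between A-rules and L-rules can be sketched in Python using the two examples from the table above. This only mimics the effect of those two rules and is not how the UNL engines apply them: the function names, the use of a feature set, and the use of " " to stand for BLK are assumptions made for this illustration.

```python
def apply_plural_suffix(word: str, features: set) -> str:
    """Mimic PLR:=0>"s"; : append "s" when the word carries the PLR feature.

    Only the word itself changes, which is the scope of an A-rule.
    """
    return word + "s" if "PLR" in features else word


def contract_i_am(tokens: list) -> list:
    """Mimic ("I")(BLK)("am"):=("I'm"); over a linear list of tokens.

    A single space token stands for BLK. The rule rewrites a sequence of
    adjacent forms, which is the scope of an L-rule.
    """
    out = []
    i = 0
    while i < len(tokens):
        if tokens[i:i + 3] == ["I", " ", "am"]:
            out.append("I'm")
            i += 3  # skip the three tokens that were replaced
        else:
            out.append(tokens[i])
            i += 1
    return out


print(apply_plural_suffix("table", {"PLR"}))
print(contract_i_am(["I", " ", "am", " ", "here"]))
```

S-rules are left out of the sketch because they operate on the structure of the phrase rather than on a word or a flat sequence, and would need a tree representation.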
Notes
- [1] To convert a list structure into a tree structure, a tree structure into a list structure, a tree structure into a network structure, and so on.
- [2] In order for T-rules to be processed in the UNLdev, they should comply with the syntax defined in the Grammar Specs. For simplicity, the rules presented here may omit some of the features required by the UNLdev, which are, however, automatically provided by the UNLarium.