Grammar
Grammar is the set of logical or structural rules that govern the composition of sentences, phrases and words.
Types of Grammar
In the UNLdev, we distinguish two types of grammar:
- The UNL-NL (Generation) Grammar is used to generate natural language out of UNL
- The NL-UNL (Analysis) Grammar is used to generate UNL out of natural language
Both the generation and the analysis grammars are divided into two sub-grammars.
These grammars are normally provided as plain-text documents with UTF-8 encoding in order to be compiled by the UNDL Foundation's tools, such as IAN, EUGENE and SEAN.
Types of Rules
Main article: Grammar Specs
In the UNL framework, there are two basic types of rules:
- Transformation rules
- Used to generate natural language sentences out of UNL graphs and vice-versa.
- Disambiguation rules
- Used to improve the performance of transformation rules by constraining their applicability.
The Transformation Rules follow the very general formalism:
α:=β;
where the left side α is a condition statement, and the right side β is an action to be performed over α.
The Disambiguation Rules, which were directly inspired by the UNL Centre's former co-occurrence dictionary and knowledge base, follow a slightly different formalism:
α=P;
where the left side α is a statement and the right side P is an integer from 0 to 255 that indicates the probability of occurrence of α.
The UNL-NL Grammar and the NL-UNL Grammar consist of two different types of rules:
- T-rules[1], or transformation rules, are used to modify structures. T-rules are further divided into A-rules, L-rules and S-rules.
- D-rules, or disambiguation rules, are used to assign priorities.
Type | Rule | Description | Example |
---|---|---|---|
D-rule | (ART)(ART)=0; | It's not possible to have an article after another article | |
A-rule | PLR:=0>"s"; | In case of plural (PLR), add "s" to the end of the word | table > tables, boy > boys |
L-rule | ("I")(BLK)("am"):=("I'm"); | In case of "I" followed by a blank space and "am", replace the whole sequence by "I'm" | I am > I'm |
S-rule | MTW:=VA("into account"); | In order to form the multiword expression, add "into account" as an adjunct to the verb (VA). | take > take into account |
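
The effect of the A-rule and the L-rule in the table can be imitated in ordinary Python. This is only a rough sketch of what the two example rules do to their input, not the way the UNL tools process rules: the function names, the use of a feature set, and the token list in which a blank is written as " " are all assumptions made for illustration.

```python
# Illustrative stand-ins for two rules from the table above.
# Function names and data shapes are hypothetical, not part of any UNL tool.

def apply_plural_a_rule(word, features):
    """Mimic PLR:=0>"s"; append "s" when the word carries the PLR feature."""
    if "PLR" in features:
        return word + "s"
    return word


def apply_contraction_l_rule(tokens):
    """Mimic ("I")(BLK)("am"):=("I'm"); over a token list where " " is the blank."""
    result = []
    i = 0
    while i < len(tokens):
        # The rule matches a linear sequence of three isolated forms.
        if tokens[i:i + 3] == ["I", " ", "am"]:
            result.append("I'm")
            i += 3
        else:
            result.append(tokens[i])
            i += 1
    return result


print(apply_plural_a_rule("table", {"PLR"}))  # tables
print("".join(apply_contraction_l_rule(["I", " ", "am", " ", "here"])))  # I'm here
```

Note that the first function changes only the inside of a single word, while the second rewrites a sequence of adjacent forms, which mirrors the difference in scope between A-rules and L-rules.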
Syntax
D-rules are defined by the general syntax:
<CONDITION> = <PRIORITY>;
While T-rules are defined as:
<CONDITION> := <ACTION>;
Both types of rules always end in a semicolon (";"). Special symbols and notation apply in each case. For further information, see D-rules, A-rules, L-rules or S-rules.
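As a rough illustration of the two general syntaxes, the following Python sketch tells a D-rule from a T-rule by looking for the first "=" or ":=" outside double quotes. The function name `classify_rule` is hypothetical, and the sketch assumes that conditions never contain an unquoted "=" of their own, which the full notation in the Grammar Specs may not guarantee. The 0 to 255 range check relies on the description of disambiguation rules given earlier on this page.

```python
def classify_rule(rule):
    """Return (kind, condition, right_side) for a rule string.

    kind is "T-rule" for <CONDITION> := <ACTION>;
    and "D-rule" for <CONDITION> = <PRIORITY>;
    """
    text = rule.strip()
    if not text.endswith(";"):
        raise ValueError("a rule must end in a semicolon")
    body = text[:-1]
    in_quotes = False
    for i, ch in enumerate(body):
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes and body.startswith(":=", i):
            return "T-rule", body[:i].strip(), body[i + 2:].strip()
        elif not in_quotes and ch == "=":
            priority = int(body[i + 1:].strip())
            if not 0 <= priority <= 255:
                raise ValueError("priority must be between 0 and 255")
            return "D-rule", body[:i].strip(), priority
    raise ValueError("no ':=' or '=' operator found")


print(classify_rule('(ART)(ART)=0;'))  # ('D-rule', '(ART)(ART)', 0)
print(classify_rule('PLR:=0>"s";'))    # ('T-rule', 'PLR', '0>"s"')
```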
When to use D-rules
D-rules must be used to assign priorities. They do not provoke any changes, but only induce or prohibit transformations.
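The prohibiting role of a D-rule can be pictured with a small Python sketch. It assumes that a priority of 0 means the sequence is not allowed, as the description of (ART)(ART)=0; in the table suggests. The representation of a D-rule as a pair of tag sequence and priority, the function name `is_prohibited`, and the tag NOU are all hypothetical choices made for this example.

```python
# Hypothetical representation: each D-rule is (tag sequence, priority).
D_RULES = [(("ART", "ART"), 0)]  # stands for (ART)(ART)=0;


def is_prohibited(tags, d_rules=D_RULES):
    """True if any zero-priority pattern occurs as a contiguous run in tags."""
    tags = tuple(tags)
    for pattern, priority in d_rules:
        if priority != 0:
            continue  # only zero-priority rules prohibit in this sketch
        n = len(pattern)
        if any(tags[i:i + n] == pattern for i in range(len(tags) - n + 1)):
            return True
    return False


print(is_prohibited(["ART", "ART", "NOU"]))  # True
print(is_prohibited(["ART", "NOU"]))         # False
```

The function changes nothing in its input; it only reports whether a candidate sequence should be ruled out, which is the sense in which D-rules constrain transformations rather than perform them.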
When to use T-rules
T-rules are used for changes, and vary according to the scope of the changes:
- A-rules are used when the transformations apply over isolated forms to generate inflections of the base form. They are used only when the transformations may be expressed by prefixation, infixation or suffixation. In any case, the transformation must affect only the structure of the word; the structure of the phrase is preserved. In that sense, A-rules must never be used when a new word is introduced in the syntactic structure (as in the formation of compounds).
- L-rules are used when the transformations affect a linear sequence of isolated forms. The transformations are rather at the surface level and do not affect the deep structure of the phrase.
- S-rules are used when the transformations affect the structure of the phrase, as in the generation of compounds (including compound tenses and periphrastic constructions). They are also used to describe syntactic behaviour such as word order, agreement and government.
Notes
- In order for T-rules to be processed in the UNLdev, they should comply with the syntax defined in the Grammar Specs. For the sake of simplicity, the rules presented here may omit some of the features required by the UNLdev, which are, however, automatically provided by the UNLarium.