Grammar

From UNL Wiki

Revision as of 18:22, 17 September 2012

In the UNL framework, a grammar is a set of rules used to generate UNL out of natural language, and natural language out of UNL.

Direction

In the UNL framework, we distinguish between analysis and generation grammars:

  • The UNL-NL (Generation) Grammar is used to generate natural language out of UNL
  • The NL-UNL (Analysis) Grammar is used to generate UNL out of natural language

Types

In the UNL framework, we distinguish between transformation and disambiguation grammars:

  • Transformation Grammar, or T-Grammar, is used to transform structures[1]
  • Disambiguation Grammar, or D-Grammar, is used to improve the performance of the T-Grammar


The UNL-NL Grammar and the NL-UNL Grammar consist of two different types of rules:

  • T-rules[2], or transformation rules, are used to modify structures. T-rules are further divided into:
    • A-rules (affixation rules), which apply over isolated word forms (for example, to generate possible inflections);
    • L-rules (linear rules), which apply over lists of word forms (for example, to provide transformations in the surface structure);
    • S-rules (syntactic rules), which apply over trees (for example, to modify the syntactic configuration).
  • D-rules, or disambiguation rules, are used to assign priorities.

Examples of Grammar Rules

| Type | Rule | Description | Example |
|---|---|---|---|
| D-rule | (ART)(ART)=0; | An article cannot follow another article. | |
| A-rule | PLR:=0>"s"; | In case of plural (PLR), add "s" to the end of the word. | table > tables, boy > boys |
| L-rule | ("I")(BLK)("am"):=("I'm"); | When "I" is followed by a blank space and "am", replace the whole sequence with "I'm". | I am > I'm |
| S-rule | MTW:=VA("into account"); | To form the multiword expression, add "into account" as an adjunct to the verb (VA). | take > take into account |
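
As a rough illustration only, the effect of the A-rule and L-rule examples above can be mimicked in a few lines of Python. This is not how the UNL tools process rules; the function names are hypothetical, and a single space string is assumed to stand in for the blank (BLK).

```python
from typing import List


def apply_plural_a_rule(word: str) -> str:
    """Toy equivalent of the A-rule PLR:=0>"s"; (append "s" to the base form)."""
    return word + "s"


def apply_contraction_l_rule(tokens: List[str]) -> List[str]:
    """Toy equivalent of the L-rule ("I")(BLK)("am"):=("I'm");

    The input is a list of word forms in which " " represents a blank (BLK).
    Every occurrence of the sequence "I", " ", "am" is replaced by "I'm".
    """
    result: List[str] = []
    i = 0
    while i < len(tokens):
        if tokens[i:i + 3] == ["I", " ", "am"]:
            result.append("I'm")
            i += 3  # skip the three matched items
        else:
            result.append(tokens[i])
            i += 1
    return result
```

The A-rule touches a single word form, whereas the L-rule needs to see a sequence of forms; this difference in scope is what separates the two rule types.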

Syntax

D-rules are defined by the general syntax:

<CONDITION> = <PRIORITY>;

While T-rules are defined as:

<CONDITION> := <ACTION>;

Both rule types always end with a semicolon (";"). Special symbols and notation apply in each case. For further information, see D-rules, A-rules, L-rules or S-rules.
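
The two syntaxes can be told apart by their operator. The following Python sketch shows the idea, assuming a deliberately naive reading of the rule text: `classify_rule` is a hypothetical helper, not part of any UNL tool, and it does not handle "=" characters that might occur inside quoted strings.

```python
from typing import Tuple


def classify_rule(rule: str) -> Tuple[str, str, str]:
    """Split a rule into (rule type, condition, action or priority).

    Naive assumption: ":=" marks a T-rule and a plain "=" marks a D-rule.
    """
    text = rule.strip()
    if not text.endswith(";"):
        raise ValueError("a rule must end in a semicolon")
    body = text[:-1]  # drop the final semicolon
    if ":=" in body:
        condition, action = body.split(":=", 1)
        return ("T-rule", condition.strip(), action.strip())
    if "=" in body:
        condition, priority = body.rsplit("=", 1)
        return ("D-rule", condition.strip(), priority.strip())
    raise ValueError("neither ':=' nor '=' found in rule")
```

For instance, `classify_rule('(ART)(ART)=0;')` yields a D-rule with condition `(ART)(ART)` and priority `0`, while `classify_rule('PLR:=0>"s";')` yields a T-rule.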

When to use D-rules

D-rules are used to assign priorities. They do not make any changes themselves; they only induce or prohibit transformations.
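
A minimal Python sketch of the prohibiting case, based on the (ART)(ART)=0; example: a sequence of categories is rejected when it contains a pattern whose priority is 0. The function name and the list-of-tags representation are assumptions made for illustration, and how non-zero priorities are weighed is not modelled here.

```python
from typing import List


def is_blocked(tags: List[str], pattern: List[str], priority: int) -> bool:
    """Return True if `pattern` occurs contiguously in `tags` with priority 0.

    Only the prohibiting case (priority 0) is modelled in this sketch.
    """
    if priority != 0:
        return False
    width = len(pattern)
    return any(
        tags[i:i + width] == pattern
        for i in range(len(tags) - width + 1)
    )
```

Note that the check only reports whether a sequence is allowed; it never alters the sequence, which mirrors the fact that D-rules do not change structures.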

When to use T-rules

T-rules are used for changes, and vary according to the scope of the changes:

  • A-rules are used when the transformations apply to isolated forms in order to generate inflections of the base form. They are used only when the transformation can be expressed by prefixation, infixation or suffixation. In every case, the transformation must affect only the structure of the word; the structure of the phrase is preserved. Consequently, A-rules must never be used when a new word is introduced into the syntactic structure (as in the formation of compounds).
  • L-rules are used when the transformations affect a linear sequence of isolated forms. These transformations operate at the surface level and do not affect the deep structure of the phrase.
  • S-rules are used when the transformations affect the structure of the phrase, as in the generation of compounds (including compound tenses and periphrastic constructions). They are also used to describe syntactic behaviour such as word order, agreement and government.

Notes

  1. To convert a list structure into a tree structure, a tree structure into a list structure, a tree structure into a network structure, and so on.
  2. In order for T-rules to be processed in the UNLdev, they should comply with the syntax defined in the Grammar Specs. For simplicity, the rules presented here may omit some of the features required by the UNLdev, which are, however, automatically provided by the UNLarium.