N-rule: Difference between revisions

Revision as of 15:29, 31 May 2013

Normalization rules (n-rules) prepare natural language input for automatic processing. They constitute the preprocessing module, which operates on the input as a string and runs prior to tokenization. The set of n-rules forms the Normalization Grammar, or N-Grammar.

Syntax

Normalization rules follow the general formalism

α:=β;

where the left side α is a condition statement, and the right side β is an action to be performed over α.
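As a minimal illustration of the α:=β; shape, the following Python sketch splits a rule string into its condition and action parts. The function name `split_rule` is hypothetical and is not part of any UNL tool; the sketch also assumes that `:=` does not occur inside the condition itself.

```python
def split_rule(rule: str) -> tuple[str, str]:
    """Split an n-rule of the form 'condition:=action;' into its two sides.

    Hypothetical helper for illustration only. It assumes the first ':='
    in the string is the separator between condition and action.
    """
    body = rule.strip()
    if not body.endswith(";"):
        raise ValueError("an n-rule must end with ';'")
    # Drop the trailing ';' and split at the first ':='.
    condition, separator, action = body[:-1].partition(":=")
    if not separator:
        raise ValueError("an n-rule must contain ':='")
    return condition.strip(), action.strip()


condition, action = split_rule('("don\'t"):=("do not");')
print(condition)  # ("don't")
print(action)     # ("do not")
```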

Type of Normalization Rules

Roles of Normalization Rules

Normalization rules have two roles:

to normalize the input text (to replace abbreviations by their extended forms, to expand contractions, etc.)
to segment the natural language text into sentences, i.e., to create the tags <SHEAD> (beginning of a sentence), <STAIL> (end of a sentence), <CHEAD> (beginning of a scope) and <CTAIL> (end of a scope) inside the input text. These tags serve as sentence and clause boundaries and define the units of processing for the Transformation and Disambiguation grammars.
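The segmentation role can be pictured with a rough Python sketch that writes the four tags into a string. This is only an approximation under stated assumptions: the page does not say how the engine represents the tags internally, the function names are hypothetical, a sentence is taken to be any run of characters ending in ".", and a scope is taken to be a parenthesized span with <CTAIL> placed after the closing parenthesis.

```python
import re


def tag_scopes(text: str) -> str:
    """Mark parenthesized spans with <CHEAD> ... <CTAIL> (assumed placement)."""
    return text.replace("(", "<CHEAD>(").replace(")", ")<CTAIL>")


def tag_sentences(text: str) -> str:
    """Wrap each run of characters ending in '.' in <SHEAD> ... <STAIL>.

    Simplification: trailing text without a final '.' is dropped.
    """
    sentences = re.findall(r"[^.]+\.", text)
    return "".join(f"<SHEAD>{s.strip()}<STAIL>" for s in sentences)


print(tag_sentences("I see. You go."))
# <SHEAD>I see.<STAIL><SHEAD>You go.<STAIL>
print(tag_sentences(tag_scopes("He left (quietly).")))
# <SHEAD>He left <CHEAD>(quietly)<CTAIL>.<STAIL>
```

Each tagged span would then be one unit of processing for the later grammars.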

Examples of Normalization rules

Segmentation
- ("/.*\./",%x):=(%x)(+STAIL,%y); (creates an STAIL node after any sequence of characters followed by ".", i.e., after a match of /.*\./)
- ("/\(/",%x):=(+CHEAD,%y)(%x); (creates a CHEAD node before an opening parenthesis, i.e., before a match of /\(/)
Normalization
- ("an "):=("a "); ("an apple" > "a apple")
- ("don't"):=("do not"); ("I don't see" > "I do not see")
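The two normalization examples behave like ordered string replacements. The Python sketch below mimics that behaviour with plain substring replacement; the names `RULES` and `normalize` are hypothetical, and whether the actual engine matches arbitrary substrings or respects word boundaries is not stated here, so treat the matching strategy as an assumption.

```python
# Each pair mirrors one rule above: (condition string, replacement string).
RULES = [
    ("an ", "a "),
    ("don't", "do not"),
]


def normalize(text: str, rules=RULES) -> str:
    """Apply each replacement in order over the whole input string."""
    for old, new in rules:
        # str.replace substitutes every occurrence, including inside words.
        text = text.replace(old, new)
    return text


print(normalize("an apple"))     # a apple
print(normalize("I don't see"))  # I do not see
```

Because the sketch matches raw substrings, a rule such as ("an ") would also fire inside "man ", which is one reason real rules usually need more specific conditions.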

