N-rule

From UNL Wiki

Revision as of 21:18, 31 May 2013 by Martins (Talk | contribs)


Normalization rules (n-rules) prepare the natural language input for automatic processing. They constitute the preprocessing module, which operates on the input as a raw string and runs before tokenization. The set of n-rules forms the Normalization Grammar, or N-Grammar.

Syntax

Normalization rules follow a very general formalism:

α:=β;

where the left side α is a condition statement, and the right side β is the action to be performed on whatever matches α.

Roles of Normalization Rules

Normalization rules have two roles:

- to normalize the input text (to replace abbreviations with their extended forms, to expand contractions, etc.);
- to segment the natural language text into sentences, i.e., to insert the tags <SHEAD> (beginning of a sentence), <STAIL> (end of a sentence), <CHEAD> (beginning of a scope) and <CTAIL> (end of a scope) into the input text. These tags serve as sentence and clause boundaries and define the units of processing for the Transformation and Disambiguation grammars.

Type of Normalization Rules

Normalization rules are string replacement rules: they replace existing strings with new strings. Because they apply before tokenization and before any dictionary search, no attribute other than the string itself is available to them. The string to be replaced may be referred to by a constant (between "double quotes") or by a regular expression (between /forward slashes/).
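The difference between a constant and a regular-expression source can be illustrated with a minimal Python sketch. The helper `compile_source` is hypothetical and not part of any UNL tool, and Python's `re` dialect is only assumed to be close to the one the N-Grammar accepts.

```python
import re

def compile_source(source: str) -> re.Pattern:
    """Turn the source side of an n-rule into a compiled pattern.

    Assumption of this sketch: a source wrapped in /forward slashes/ is a
    regular expression; anything else is a constant and is matched literally.
    """
    if len(source) >= 2 and source.startswith("/") and source.endswith("/"):
        return re.compile(source[1:-1])
    return re.compile(re.escape(source))

# Constant: only the literal string "a.c" matches.
print(compile_source("a.c").sub("X", "a.c abc"))    # X abc
# Regular expression: "." matches any character, so "abc" matches too.
print(compile_source("/a.c/").sub("X", "a.c abc"))  # X X
```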

LL rules
ACTION	RULE	DESCRIPTION	EXAMPLE
REPLACE	("source string"):=("target string");	All the instances of the source string will be replaced by the target string	("x"):=("y"); axbxcxd will become aybycyd
APPEND (RIGHT)	("source string",%x):=(%x)("target string",%y);	The target string will be appended to the right of all instances of the source string.	("x",%x):=(%x)("y",%y); axbxcxd will become axybxycxyd
APPEND (LEFT)	("source string",%x):=("target string",%y)(%x);	The target string will be appended to the left of all instances of the source string.	("x",%x):=("y",%y)(%x); axbxcxd will become ayxbyxcyxd
DELETE	("source string"):=;	All the instances of the source string will be deleted.	("x"):=; axbxcxd will become abcd

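Indexes (%x, %y, etc.) are used in appending rules to define the direction of the insertion: the position of the new node relative to the index of the source string (%x) determines whether the target string is added to the left or to the right.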
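The four actions in the table can be emulated on plain strings. The following Python sketch is only an illustration of the intended input/output behaviour for constant sources; the function names are hypothetical and it does not reproduce how the UNL engine itself parses or applies rules.

```python
def replace(text: str, source: str, target: str) -> str:
    # ("source"):=("target");
    return text.replace(source, target)

def append_right(text: str, source: str, target: str) -> str:
    # ("source",%x):=(%x)("target",%y);  keep the source, add the target after it
    return text.replace(source, source + target)

def append_left(text: str, source: str, target: str) -> str:
    # ("source",%x):=("target",%y)(%x);  keep the source, add the target before it
    return text.replace(source, target + source)

def delete(text: str, source: str) -> str:
    # ("source"):=;
    return text.replace(source, "")

text = "axbxcxd"
print(replace(text, "x", "y"))       # aybycyd
print(append_right(text, "x", "y"))  # axybxycxyd
print(append_left(text, "x", "y"))   # ayxbyxcyxd
print(delete(text, "x"))             # abcd
```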

Examples of Normalization rules

Segmentation
- ("/.*\./",%x):=(%x)(+STAIL,%y); (creates an STAIL node after any sequence of characters followed by ".", i.e., /.*\./)
- ("/\(/",%x):=(+CHEAD,%y)(%x); (creates a CHEAD node before an opening parenthesis, i.e., /\(/)
Normalization
- ("an "):=("a "); ("an apple" > "a apple")
- ("don't"):=("do not"); ("I don't see" > "I do not see")
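As a rough illustration, the examples above can be chained into a small ordered rule list in Python. Several things here are assumptions of the sketch rather than documented behaviour: the tags are written as literal `<STAIL>` and `<CHEAD>` text, rules are applied one after another in list order, and the sentence pattern is narrowed to `[^.]*\.` because Python's greedy `.*\.` would match up to the last period of the whole input.

```python
import re

# Ordered (pattern, replacement) pairs; each pair loosely mirrors one rule above.
RULES = [
    # ("don't"):=("do not");
    (re.compile(re.escape("don't")), "do not"),
    # Append <STAIL> to the right of every span that ends in "."
    (re.compile(r"[^.]*\."), r"\g<0><STAIL>"),
    # Append <CHEAD> to the left of every "("
    (re.compile(r"\("), r"<CHEAD>\g<0>"),
]

def normalize(text: str) -> str:
    """Apply every rule in order; each rule sees the output of the previous one."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text

print(normalize("I don't see (yet). You do."))
# I do not see <CHEAD>(yet).<STAIL> You do.<STAIL>
```

Because the rules run sequentially, their order matters: a contraction rule placed after the segmentation rule would still work here, but a rule whose target string contains "." or "(" would be affected by the rules that follow it.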
