Normalization

From UNL Wiki

Martins (Talk | contribs)

Revision as of 17:00, 16 July 2014

Normalization is the process of rewriting the input document into a regular form so that it can be processed more easily. It is carried out by N-rules and includes:

replacing abbreviations by their corresponding extended forms
replacing short forms by their corresponding long forms
replacing periphrases by their corresponding direct forms
replacing contractions by their components
defining processing units

Replacement

Replacement is carried out by N-rules written as follows:

({SHEAD|" "})("don’t")({STAIL|" "}):=()("do not")();
({SHEAD|" "})("art.")({STAIL|" "}):=()("article")();
({SHEAD|" "})("aux")({STAIL|" "}):=()("à les")();

Where:

SHEAD = beginning of the sentence
STAIL = end of the sentence
({SHEAD|" "}) indicates left context (i.e., either SHEAD or blank space)
({STAIL|" "}) indicates right context (i.e., either STAIL or blank space)
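
The effect of these rules can be approximated outside the UNL framework. The following is a minimal Python sketch, not the N-rule engine itself: the names RULES and normalize are hypothetical, the context checks are emulated with regular-expression lookarounds, and the abbreviation is assumed to be matched as "art." with the following blank space treated as right context.

```python
import re

# Hypothetical table mirroring the three N-rules above:
# (surface form, replacement). Not part of any UNL tool.
RULES = [
    ("don’t", "do not"),   # contraction -> components
    ("art.", "article"),   # abbreviation -> extended form (assumed without trailing space)
    ("aux", "à les"),      # French contraction -> components
]


def normalize(sentence: str) -> str:
    """Apply each replacement only where the surface form is delimited,
    on both sides, by a sentence boundary or a blank space."""
    for surface, replacement in RULES:
        # (?<![^ ]) : left context is the start of the string (SHEAD) or a space
        # (?![^ ])  : right context is the end of the string (STAIL) or a space
        pattern = r"(?<![^ ])" + re.escape(surface) + r"(?![^ ])"
        # A function is used as the replacement so the text is inserted literally.
        sentence = re.sub(pattern, lambda m, r=replacement: r, sentence)
    return sentence


if __name__ == "__main__":
    print(normalize("I don’t know"))
    print(normalize("see art. 5"))
    print(normalize("aux enfants"))
    print(normalize("faux pas"))
```

Because the contexts are only checked and not consumed, the surrounding spaces are left untouched, which corresponds to the empty () nodes on the right-hand side of the rules. A string such as "faux pas" is not changed, since "aux" is not preceded by SHEAD or a blank space.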
