UNL-NL Memory
Latest revision as of 12:12, 7 April 2014
The UNL<->NL Memory is a set of mappings between a given natural language and UNL. It may be unidirectional (UNL-NL Memory or NL-UNL Memory) or bidirectional (UNL<->NL Memory). It is used to improve and normalize the results of UNLization and NLization, as it contains segments that have previously been UNLized or NLized.
The UNL<->NL Memory may be provided in two different formats:
- Extended, in TMX; or
- Simplified, as a set of network disambiguation rules
Extended format
UNL<->NL Memory entries in extended format must comply with the Translation Memory eXchange (TMX) Specs (http://www.gala-global.org/lisa-oscar-standards), as follows:
   <tu>
       <tuv xml:lang="en"><seg>a good deal</seg></tuv>
       <tuv xml:lang="unl"><seg>400059171</seg></tuv>
   </tu>
    
Where:
<tu> is the beginning of the translation unit
</tu> is the end of the translation unit
<tuv> is the beginning of the translation unit variant
</tuv> is the end of the translation unit variant
<seg> is the beginning of the translation segment
</seg> is the end of the translation segment
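As an illustration of how such an entry might be consumed, the following is a minimal Python sketch that reads one translation unit and returns its variants keyed by language. It assumes well-formed XML in which every <tuv> is closed with </tuv>; the function name read_translation_unit and the constant names are hypothetical and are not part of the UNL or TMX specifications.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the built-in "xml:" prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_translation_unit(tu_xml):
    """Return a {language: segment text} dict for one <tu> element.

    Hypothetical helper for illustration only.
    """
    tu = ET.fromstring(tu_xml)
    variants = {}
    for tuv in tu.findall("tuv"):
        seg = tuv.find("seg")
        if seg is not None:
            # itertext() also collects text nested inside inline markup.
            variants[tuv.get(XML_LANG)] = "".join(seg.itertext())
    return variants


# The entry shown above, with each <tuv> properly closed.
SAMPLE_TU = """
<tu>
    <tuv xml:lang="en"><seg>a good deal</seg></tuv>
    <tuv xml:lang="unl"><seg>400059171</seg></tuv>
</tu>
"""

entry = read_translation_unit(SAMPLE_TU)
print(entry)
```

A complete TMX file wraps its translation units in a header and body; this sketch deliberately handles a single <tu> fragment only.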
Simplified format
UNL<->NL Memory entries in simplified format must be represented as a set of network disambiguation rules, as follows:
equ(SOURCE;TARGET)=DC;
Where:
equ is the UNL relation for "equivalent";
SOURCE is the source segment;
TARGET is the target segment; 
DC is the degree of certainty (i.e., the likelihood of the relation between the SOURCE and the TARGET).
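The rule pattern above can be sketched in Python as a small formatter and parser. This is an illustration under several assumptions that the page does not state: segments are written literally and contain neither ";" nor ")", and DC is kept as plain text because its numeric range is not given here. The names MemoryRule, format_rule and parse_rule are hypothetical, and the value 255 is an arbitrary placeholder.

```python
import re
from typing import NamedTuple


class MemoryRule(NamedTuple):
    """One simplified-format entry (hypothetical container)."""
    source: str
    target: str
    dc: str  # kept as text; the numeric range of DC is not assumed


# Assumes SOURCE has no ";" and TARGET has no ")" (an assumption, see above).
RULE_PATTERN = re.compile(
    r"^equ\((?P<source>[^;]*);(?P<target>[^)]*)\)=(?P<dc>[^;]+);$"
)


def format_rule(rule):
    """Serialize a rule as equ(SOURCE;TARGET)=DC;"""
    return f"equ({rule.source};{rule.target})={rule.dc};"


def parse_rule(line):
    """Parse one equ(SOURCE;TARGET)=DC; line into a MemoryRule."""
    match = RULE_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"not a simplified memory rule: {line!r}")
    return MemoryRule(**match.groupdict())


# Placeholder values reusing the segments from the extended-format example.
rule = MemoryRule(source="a good deal", target="400059171", dc="255")
text = format_rule(rule)
print(text)
print(parse_rule(text))
```

Keeping DC as a string avoids committing to a particular scale; a caller that knows the intended range can convert and validate it separately.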