Universal Words

From UNL Wiki
Revision as of 13:53, 21 April 2009 by Admin (Talk | contribs)

Universal Words, or simply UWs, are the words of UNL. They correspond to the nodes of a UNL graph, which are interlinked by relations or modified by attributes. They are labels for relatively stable units of knowledge (concepts) that can be associated with the open lexical categories of natural language (noun, verb, adjective and adverb). The syntax of UWs is defined by the UNL Specs, whereas the set of UWs is relatively open and is listed in the UNL Dictionary. Additionally, UWs are organized in a hierarchy (the UNL Ontology), defined in the UNL Knowledge Base and explained in the UNL Encyclopedia, which are the lexical databases for UNL.

Syntax

UWs can be either simple (atomic) or complex (made out of other UWs). Complex UWs are represented as hypernodes, i.e., subhypergraphs, and follow the syntax for UNL Sentences. A simple UW is a character string, usually made up of English words, that can be split into two parts: a root and a suffix. The root can be a word, an expression, a phrase or even an entire sentence, and should be interpreted as a label for a set of concepts. The suffix delimits the root so that the concept indicated by the UW is clear and unambiguous: it restricts the interpretation of the root to a subset of that set or to a specific concept included in it.

The syntax for UWs is defined as follows:

<UW>        ::= <root> [<suffix>]
<root>      ::= <character>...
<suffix>    ::= "(" <suffix> [ "," <suffix> ]... ")"
            ::= <relation> { ">" , "<" } <UW>
<relation>  ::= {"agt", "and", "aoj", ...}
<character> ::= {"A", ..., "Z", "a", ..., "z", "0", ..., "9", "_", " ", "#", "!", "$", "%", "=", "^", "~", "|", "@", "+", "-", "<", ">", "?"}

where:
< > variable
" " terminal symbol
::= ... is defined as ...
[ ] optional element
{ } alternative element
... to be repeated one or more times
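
As an illustration of the grammar above, the following Python sketch splits a simple UW into its root and suffix. It is a minimal reading of the rules, not a reference implementation: the names parse_uw, UW and Constraint are hypothetical, the parenthesized suffix is assumed to hold a comma-separated list of <relation> { ">" , "<" } <UW> items, and only the three relation labels spelled out in the grammar are known by default (the rest of the list is elided there). The sample string "eat(agt>person,aoj<thing)" is made up for the example and is not taken from the UNL Dictionary.

```python
import string
from dataclasses import dataclass, field

# Characters allowed in a root, copied from the <character> rule.
ROOT_CHARS = set(string.ascii_letters + string.digits + "_ #!$%=^~|@+-<>?")

# Assumption: only the labels named in the <relation> rule; the full list is
# elided in the grammar, so callers should pass the complete set if they have it.
KNOWN_RELATIONS = ("agt", "and", "aoj")


@dataclass
class Constraint:
    relation: str   # e.g. "agt"
    direction: str  # ">" or "<"
    target: "UW"    # the UW on the right-hand side


@dataclass
class UW:
    root: str
    suffix: list = field(default_factory=list)  # list of Constraint


def parse_uw(text, relations=KNOWN_RELATIONS):
    """Parse a whole string as one simple UW, or raise ValueError."""
    uw, pos = _parse_uw(text, 0, relations)
    if pos != len(text):
        raise ValueError(f"unexpected {text[pos]!r} at position {pos}")
    return uw


def _parse_uw(text, pos, relations):
    # <UW> ::= <root> [<suffix>]
    start = pos
    while pos < len(text) and text[pos] in ROOT_CHARS:
        pos += 1
    if pos == start:
        raise ValueError(f"empty root at position {start}")
    uw = UW(text[start:pos])
    if pos < len(text) and text[pos] == "(":
        pos += 1
        while True:
            constraint, pos = _parse_constraint(text, pos, relations)
            uw.suffix.append(constraint)
            if pos < len(text) and text[pos] == ",":
                pos += 1
                continue
            break
        if pos >= len(text) or text[pos] != ")":
            raise ValueError(f"expected ')' at position {pos}")
        pos += 1
    return uw, pos


def _parse_constraint(text, pos, relations):
    # <relation> { ">" , "<" } <UW>
    for rel in relations:
        end = pos + len(rel)
        if text.startswith(rel, pos) and text[end:end + 1] in ("<", ">"):
            target, new_pos = _parse_uw(text, end + 1, relations)
            return Constraint(rel, text[end], target), new_pos
    raise ValueError(f"expected a relation at position {pos}")


if __name__ == "__main__":
    print(parse_uw("eat(agt>person,aoj<thing)"))
```

Because "<" and ">" are also legal root characters, the sketch reads the root greedily and only looks for a relation label at the start of each suffix item. A different tokenization strategy may be needed for roots that themselves contain those symbols.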

Semantics

Like natural language words, UWs represent concepts (or sets of concepts). These concepts, although they may look very similar from culture to culture, are generally said to be culture-dependent, in that each culture has its own particular way of perceiving and categorizing the world. In principle, the set of UWs, which makes up the UNL Dictionary, is supposed to be as comprehensive as the set of individual concepts depicted by the different cultures, no matter how specific they are. In that sense, UWs are not to be considered semantic primitives, nor should they represent only common concepts. They must include culture-dependent information and every relevant variation among similar concepts. Furthermore, the UNL Dictionary is an open set, subject to continual growth through new UWs, as UNL is supposed to keep incorporating new cultures and cultural changes.

Software