Universal Words
In a UNL graph, the nodes (referred to as "Universal Words", or simply "UWs") play the role of isolated concepts in human cognition. They correspond to relatively stable units of knowledge that can be associated with the open lexical categories of natural language (noun, verb, adjective and adverb). UWs can be either simple (atomic) or complex (made out of other UWs). In the latter case, they are represented as a hypernode, i.e., as a subhypergraph. UWs have also been claimed to be universal, in the sense that they can be expressed in any natural language, either as a single word or as an entire description. The syntax of UWs is defined by the UNL Specs, but the set of UWs is relatively open.
Syntax
According to the UNL Specs, a UW is a character string, usually made up of English words, that can be split into two parts: a root and a suffix. The root can be a word, an expression, a phrase or even an entire sentence. It should be interpreted as a label for a set of concepts: all the concepts that the root may correspond to in its original language. The root hence indicates an entire range of possible references. The suffix delimits a concept within that range, so that the UW indicates it clearly and unambiguously. It restricts the interpretation of the root to a subset of that range or to a specific concept within it.
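The relation between root and suffix can be pictured as filtering a set. The following is only a minimal sketch: the inventory `CONCEPTS`, its glosses and the constraint strings are hypothetical toy data invented for illustration, not entries from the UNL Dictionary, and `candidates` is an assumed helper name.

```python
# Hypothetical toy inventory (not real UNL Dictionary data): each concept is
# tagged with the root that labels it and the constraints that hold for it.
CONCEPTS = [
    {"root": "book", "constraints": {"icl>document"}, "gloss": "a written or printed work"},
    {"root": "book", "constraints": {"icl>do", "obj>thing"}, "gloss": "to reserve something"},
    {"root": "table", "constraints": {"icl>furniture"}, "gloss": "a piece of furniture"},
]


def candidates(root, suffix=()):
    """Return the glosses of all concepts labelled by `root` that satisfy
    every constraint listed in `suffix`."""
    wanted = set(suffix)
    return [
        c["gloss"]
        for c in CONCEPTS
        if c["root"] == root and wanted <= c["constraints"]
    ]


# The bare root covers the whole range of concepts it can label.
print(candidates("book"))
# A suffix narrows that range to a subset or to a single concept.
print(candidates("book", ["icl>document"]))
```

In this toy setting, the bare root `book` stays ambiguous between two concepts, while adding one constraint is enough to single out one of them.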
The syntax for UWs is defined as follows:
<UW>        ::= <root> [<suffix>]
<root>      ::= <character>…
<suffix>    ::= "(" <suffix> [ "," <suffix> ]… ")"
            ::= <relation> { ">" , "<" } <UW>
<relation>  ::= { "agt", "and", "aoj", ... }
<character> ::= { "A", ..., "Z", "a", ..., "z", "0", ..., "9", "_", " ", "#", "!", "$", "%", "=", "^", "~", "|", "@", "+", "-", "<", ">", "?" }
where:
< > variable
" " terminal symbol
::= ... is defined as ...
[ ] optional element
{ } alternative element
... the preceding element is to be repeated one or more times
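The grammar above can be turned into a small recursive-descent parser. What follows is a sketch under stated assumptions, not a reference implementation: it reads a suffix as a parenthesized, comma-separated list of `<relation> {">", "<"} <UW>` constraints, it assumes that relation labels consist of lowercase letters only (the grammar lists them merely as "agt", "and", "aoj", ...), and all names (`UW`, `Constraint`, `parse_uw`, `UWSyntaxError`) are hypothetical.

```python
import re
from dataclasses import dataclass, field

# Characters allowed in a root, copied from the <character> rule.
ROOT = re.compile(r"[A-Za-z0-9_ #!$%=^~|@+\-<>?]+")
# Assumption: relation labels are runs of lowercase letters ("agt", "and", ...).
RELATION = re.compile(r"[a-z]+")


class UWSyntaxError(ValueError):
    """Raised when a string does not match the simplified UW grammar."""


@dataclass
class Constraint:
    relation: str
    direction: str  # ">" or "<"
    target: "UW"


@dataclass
class UW:
    root: str
    constraints: list = field(default_factory=list)


def parse_uw(text):
    """Parse a complete UW string and return a UW object."""
    uw, pos = _parse_uw(text, 0)
    if pos != len(text):
        raise UWSyntaxError(f"unexpected character at position {pos}")
    return uw


def _parse_uw(text, pos):
    # <UW> ::= <root> [<suffix>]
    match = ROOT.match(text, pos)
    if not match:
        raise UWSyntaxError(f"expected a root at position {pos}")
    root, pos = match.group(), match.end()
    constraints = []
    if pos < len(text) and text[pos] == "(":
        constraints, pos = _parse_suffix(text, pos + 1)
    return UW(root, constraints), pos


def _parse_suffix(text, pos):
    # Called just after "(": reads constraint ["," constraint]... ")"
    constraints = []
    while True:
        constraint, pos = _parse_constraint(text, pos)
        constraints.append(constraint)
        if pos < len(text) and text[pos] == ",":
            pos += 1
            continue
        break
    if pos >= len(text) or text[pos] != ")":
        raise UWSyntaxError(f"expected ')' at position {pos}")
    return constraints, pos + 1


def _parse_constraint(text, pos):
    # <relation> {">", "<"} <UW>
    match = RELATION.match(text, pos)
    if not match or match.end() >= len(text) or text[match.end()] not in "<>":
        raise UWSyntaxError(f"expected '<relation>>' or '<relation><' at position {pos}")
    direction = text[match.end()]
    target, pos = _parse_uw(text, match.end() + 1)
    return Constraint(match.group(), direction, target), pos
```

For instance, `parse_uw("book(icl>document)")` yields a `UW` whose root is `book` and whose single constraint has the relation `icl`, the direction `>` and a target UW with the root `document`. Because the `<character>` rule itself contains "<" and ">", this sketch keeps any further "<" or ">" inside a target as part of that target's root.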
Semantics
Like natural language words, UWs represent concepts (or sets of concepts). These concepts, although they may look very similar from culture to culture, are generally said to be culture-dependent, in that each culture leads to a very particular way of perceiving and categorizing the world. It would hardly be possible to find exactly the same set of concepts in two different languages, no matter how close they may be. In principle, the set of UWs, which is the UNL Dictionary, is supposed to be as comprehensive as the set of individual concepts depicted by different cultures, no matter how specific these concepts may be. Every concept existing in any language must correspond to a UW. In that sense, UWs are not to be considered semantic primitives, nor should they represent only common concepts. They must include culture-dependent information and every relevant variation among similar concepts. Specific concepts are not expected to be approximated (or reduced) to general concepts, and should be defined as UWs in their own right. Furthermore, the UNL Dictionary constitutes an open set, subject to permanent growth through new UWs, as UNL is supposed to keep incorporating new cultures and cultural changes.