Lexica

From UNLwiki
Revision as of 15:19, 21 September 2012 by imported>Martins

The UNL System contains three different types of lexical databases: dictionaries, knowledge bases and example bases.

Background

The lexicon of UNL is formed by the set of permanent UW's, which are expected to represent concepts lexicalized in at least one language. UW's, however, are simply uniform concept identifiers, i.e., arbitrary addresses or names that do not, by themselves, convey any information. The meaning of a UW is defined in four different lexical databases, which organize the structure of concepts in two basic levels:

  • Semantic features (monadic predicates), such as semantic class, lexical category, abstractness, polarity, cardinality, etc., are expected to describe the most generic distinctive units of meaning of each concept. They are closely related to the notions of "classeme" (Pottier, 1965), and of "semantic markers" or "classifiers" (Katz & Fodor, 1963). They are used to classify UW's into generic semantic categories that can be used inside the grammar. Semantic features are represented inside the UNL Dictionary.
  • Semantic frames (dyadic predicates) represent a collection of facts that specifies or distinguishes (i.e., "defines") each concept. They represent interactions between concepts that can be either "necessary" or "typical". The set of necessary (essential) interactions constitutes the UNL Knowledge Base; the set of typical (essential and accidental) interactions constitutes the UNL Memory, which includes the UNL Knowledge Base. The difference between "necessary" and "typical" interactions is a matter of logic: an interaction between two concepts X and Y is considered to be "essential" if Y is a logical consequence of X, i.e., if X entails Y; and it is considered to be "typical" if it is simply recurring.

These three dictionaries are normally made through the UNLarium in different steps and constitute the basic resource for UNLization and NLization.
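The two levels described above can be illustrated with a minimal sketch. It is not the actual UNL data format: the class names, the "supported_by" relation label, the simplified UW labels and the sample entries are all hypothetical and chosen only for illustration. Semantic features are modelled as attributes of a single concept (monadic predicates) and semantic frames as relations between two concepts (dyadic predicates), each flagged as necessary or merely typical.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A UW with its semantic features (monadic predicates)."""
    uw: str
    features: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Frame:
    """A semantic frame (dyadic predicate): relation(source, target)."""
    relation: str
    source: str
    target: str
    necessary: bool  # True: essential interaction; False: merely typical


# Hypothetical sample data; plain words stand in for real UW's.
table = Concept("table", {"category": "noun", "abstractness": "concrete"})
frames = [
    Frame("icl", "table", "furniture", necessary=True),
    Frame("supported_by", "table", "leg", necessary=False),
]


def knowledge_base(all_frames):
    """Only the necessary interactions (the UNL Knowledge Base level)."""
    return [f for f in all_frames if f.necessary]


def memory(all_frames):
    """Necessary and accidental interactions (the UNL Memory level)."""
    return list(all_frames)
```

In this sketch the knowledge base is by construction a subset of the memory, which mirrors the statement that the UNL Memory includes the UNL Knowledge Base.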

Knowledge Bases

Main article: UNL Knowledge Base

The UNL Dictionary is simply a flat list of UW's and their corresponding classifiers (such as lexical category, semantic class, abstractness, cardinality, etc.). The UNL Dictionary does not contain any distinguisher, i.e., any information that can be used to differentiate a given UW from the others that belong to the same class. This information is provided in the UNL Knowledge Base, or UNLKB, which is a semantic network made of relations that are necessary to define UW's.

The UNL Knowledge Base is expected to represent the intension (the meaning) of UW's.

The UNL Knowledge Base contains the UNL Ontology, which is a part of the UNLKB where UW's are interconnected by the ontological relations of UNL, i.e., "is-a-kind-of" ("icl") and "is-an-instance-of" ("iof").
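As a rough illustration of how the ontological relations connect UW's, the sketch below stores a few "icl" and "iof" links as triples and collects everything reachable upward from a given node. The triple layout, the function name and the sample entries are assumptions made for this example; the labels are simplified placeholders, not actual UW's.

```python
# Hypothetical ontology links: (relation, child, parent).
EDGES = [
    ("icl", "dog", "mammal"),     # a dog is a kind of mammal
    ("icl", "mammal", "animal"),  # a mammal is a kind of animal
    ("iof", "Lassie", "dog"),     # Lassie is an instance of dog
]


def ancestors(uw, edges=EDGES):
    """Return every node reachable from `uw` by following icl/iof links upward."""
    parents = {}
    for relation, child, parent in edges:
        if relation in ("icl", "iof"):
            parents.setdefault(child, []).append(parent)

    seen, stack = [], [uw]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, []):
            if parent not in seen:  # guard against revisiting shared ancestors
                seen.append(parent)
                stack.append(parent)
    return seen
```

With the sample links, `ancestors("Lassie")` walks from the instance to its class and then up the "icl" chain.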

Example Bases

In the UNL System, there are two different types of example bases:

  • The UNL Memory is a network of UW's that extends and complements the UNLKB. The difference is that the UNLKB, which is dictionary-based, contains only necessary relations between UW's, whereas the UNL Memory, which is corpus-based, records any relation between UW's along with its frequency of occurrence. For instance, the idea that a "table" is "supported by one or more vertical legs" is not represented in the UNLKB because it is not considered necessary (there are tables that are not supported by legs). This information, like the information that tables are normally round or square, that they are made of hard materials, etc., is represented in the UNL Memory, which is expected to represent not only common-sense knowledge about UW's, but all the possible instances of a given UW.
  • The UNL-NL Memory is a list of frequent mappings between UNL and a given natural language. It is the UNLization (translation) memory. Unlike the UNL-NLdic, which involves only lexical mappings, the UNL-NL Memory involves any UNLization units, which may include several lexical units.
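The corpus-based character of the UNL Memory, where relations are stored together with their frequency of occurrence, can be sketched as a simple counter over observed relations. This is only an illustration under stated assumptions: the relation labels ("supported_by", "shape"), the observations and the `min_count` threshold are hypothetical and do not come from any UNL resource.

```python
from collections import Counter

# Hypothetical relations observed in a corpus: (relation, source, target).
observations = [
    ("supported_by", "table", "leg"),
    ("supported_by", "table", "leg"),
    ("shape", "table", "round"),
    ("shape", "table", "square"),
    ("supported_by", "table", "leg"),
]


def build_memory(obs):
    """Count how often each relation triple was observed."""
    return Counter(obs)


def typical(mem, source, min_count=2):
    """List (relation, target, count) for `source`, most frequent first."""
    return [
        (relation, target, count)
        for (relation, src, target), count in mem.most_common()
        if src == source and count >= min_count
    ]


unl_memory = build_memory(observations)
```

Here "typical" is approximated by a frequency threshold, which is one possible reading of "simply recurring"; a necessary relation in the UNLKB would not need any count at all.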
