UNL Dictionary

Revision as of 19:04, 27 June 2013

The UNL Dictionary, or UD, is the inventory of permanent Universal Words (UWs). It is a flat, alphabetically ordered list of UWs with their corresponding semantic features and no further organization or structure; that organization is expected to be provided by the other UNL lexical databases, namely the UNL Knowledge Base and the UNL Memory.

Subdivisions

The UNL Dictionary is divided into three nested repositories:

  • The UNL Core Dictionary contains only permanent simple UWs that are (presumably) shared by all languages
  • The UNL Abridged Dictionary contains all permanent UWs (simple, compound or complex) that are shared by at least two different language families
  • The UNL Unabridged Dictionary contains all permanent UWs (simple, compound or complex) that are lexicalized in at least one language
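
The nesting of the three repositories can be sketched in Python. This is only an illustration of the criteria listed above: the `UWProfile` class, its fields, and the `innermost_repository` function are hypothetical names and do not appear in the UNL specifications.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UWProfile:
    """Hypothetical summary of how widely a permanent UW is attested."""
    is_simple: bool          # simple UW, as opposed to compound or complex
    languages: int           # number of languages that lexicalize the UW
    language_families: int   # number of distinct language families among them
    shared_by_all: bool      # presumed to be shared by all languages


def innermost_repository(p: UWProfile) -> Optional[str]:
    """Return the smallest of the three nested repositories that holds the UW."""
    # Each test builds on the previous one, so Core entries are always
    # Abridged entries, and Abridged entries are always Unabridged entries.
    in_unabridged = p.languages >= 1
    in_abridged = in_unabridged and p.language_families >= 2
    in_core = in_abridged and p.is_simple and p.shared_by_all
    if in_core:
        return "Core"
    if in_abridged:
        return "Abridged"
    if in_unabridged:
        return "Unabridged"
    return None  # not lexicalized anywhere, so not part of the UD
```

Because the repositories are nested, a UW reported as "Core" here also belongs to the Abridged and Unabridged dictionaries.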

Structure

    `UWID` int(11) NOT NULL AUTO_INCREMENT COMMENT 'UWID',
    `UW` varchar(255) COLLATE utf8_unicode_ci NOT NULL COMMENT 'Universal Word',
    `HEADWORD` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'Headword',
    `SYNSET` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
    `ROOT` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
    `SEMSTR` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
    `HYPERNYM` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
    `DEFINITION` text COLLATE utf8_unicode_ci COMMENT 'Definition',
    `EXAMPLE` text COLLATE utf8_unicode_ci COMMENT 'Example',
    `LEX` char(1) COLLATE utf8_unicode_ci NOT NULL COMMENT 'Lexical Category',
    `ABN` varchar(4) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'Abstractness',
    `ALY` varchar(4) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'Alienability',
    `ANI` varchar(4) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'Animacy',
    `CAR` varchar(4) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'Cardinality',
    `GEN` char(3) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'Gender',
    `POL` char(3) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'Polarity',
    `SEM` char(3) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'Semantic Category',
    `SFR` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'Semantic Frame',
    `SLANGUAGE` char(2) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'Source Language',
    `PROJECT` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'Project',
    `AUTHOR` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'Author',
    `EDITOR` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'Editor',
    `REVISOR` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'Revisor',
    `CREATED` date DEFAULT NULL COMMENT 'Created',
    `EDITED` date DEFAULT NULL COMMENT 'Edited',
    `REVISED` date DEFAULT NULL COMMENT 'Revised',
    `STATUS` int(1) DEFAULT NULL COMMENT 'Status',
    `PROBLEM` text COLLATE utf8_unicode_ci COMMENT 'Problem',
    `BUG_REPORTER` text COLLATE utf8_unicode_ci COMMENT 'Bug Reporter',
    `IMAGE` text COLLATE utf8_unicode_ci COMMENT 'Image',
    `COMMENTS` text COLLATE utf8_unicode_ci COMMENT 'Comments',
    `DECLINED` int(11) DEFAULT NULL,
    `REFUSED` int(11) NOT NULL DEFAULT '0',
    `FREQUENCY` int(11) DEFAULT NULL COMMENT 'Frequency',
    `ASSIGNMENT` int(11) DEFAULT NULL,
    `CORE` int(1) DEFAULT NULL,
    `UNIVERSALITY` tinyint(1) NOT NULL,
    `DATE` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Updated',
    PRIMARY KEY (`UWID`),
Software