UNL Knowledge Base
Revision as of 10:47, 8 December 2010
The UNL Knowledge Base, or simply UNLKB, is a network structure in which Universal Words (UWs) are interconnected through any semantic relation of UNL. In that sense, the UNL KB comprises and extends the UNL Ontology, which deals only with ontological relations. The UNLKB is claimed to improve the results of both the enconversion and the deconversion processes, as it would provide them with the extralinguistic information normally required for resolving ambiguities, anaphora and co-reference in natural language analysis and generation. The UNL KB may be provided in two different formats:
- Extended, in XML; or
- Simplified, as a set of network disambiguation rules
Extended format
UNL KB entries in extended format must have the following structure:
<relation name="RNAME" type="RTYPE" frequency="RFREQ">
  <source id="SID" attribute="ATT" lang="UNL" frequency="SFREQ" class="SCLASS">SOURCE</source>
  <target id="TID" attribute="ATT" lang="UNL" frequency="TFREQ" class="TCLASS">TARGET</target>
</relation>
Where:
RNAME is the name of one existing UNL relation ("agt", "aoj", "obj", etc.);
RTYPE is the type of the existing relation;
RFREQ is the frequency of the relation between the SOURCE and the TARGET in the corpus;
SFREQ is the frequency of the SOURCE in the corpus;
TFREQ is the frequency of the TARGET in the corpus;
SID is a number used to identify the SOURCE;
TID is a number used to identify the TARGET;
ATT is one of the existing UNL attributes ("entry", "past", etc.);
SCLASS is the general class of the SOURCE;
TCLASS is the general class of the TARGET;
SOURCE is the source node of the UNL relation;
TARGET is the target node of the UNL relation.
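The structure above can be assembled programmatically. The following is a minimal sketch in Python using the standard `xml.etree.ElementTree` module; the helper name `make_relation` and its parameters are hypothetical and are not part of any UNL tool. It only illustrates how the fields described above map onto elements and attributes.

```python
import xml.etree.ElementTree as ET


def make_relation(name, source, target, source_id, target_id,
                  frequency=None, source_attrs=None, target_attrs=None):
    """Build one <relation> element (hypothetical helper, for illustration only)."""
    rel_attrs = {"name": name}                      # RNAME
    if frequency is not None:
        rel_attrs["frequency"] = str(frequency)     # RFREQ
    relation = ET.Element("relation", rel_attrs)

    nodes = (
        ("source", source_id, source, source_attrs),
        ("target", target_id, target, target_attrs),
    )
    for tag, node_id, text, extra in nodes:
        attrs = {"id": str(node_id), "lang": "UNL"}  # SID / TID
        attrs.update(extra or {})                    # ATT, SFREQ/TFREQ, SCLASS/TCLASS
        node = ET.SubElement(relation, tag, attrs)
        node.text = text                             # SOURCE / TARGET
    return relation


entry = make_relation(
    "mod", "book(icl>document)", "love(icl>emotion)", 410, 1566,
    frequency=1,
    source_attrs={"attribute": "entry", "frequency": "20", "class": "nou"},
    target_attrs={"frequency": "11", "class": "nou"},
)

# ElementTree escapes ">" inside text as "&gt;" when serializing;
# an XML parser reads it back as the same UW string.
print(ET.tostring(entry, encoding="unicode"))
```

Only `name` and the two `id` values are mandatory in this sketch; every other attribute is passed in explicitly, so entries without corpus frequencies can still be built.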
XML Schema
<?xml version="1.0" encoding="utf-16"?>
<xsd:schema attributeFormDefault="unqualified" elementFormDefault="qualified" version="1.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="kb">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element maxOccurs="unbounded" name="relation">
          <xsd:complexType>
            <xsd:sequence>
              <!-- simpleContent lets source and target carry the UW as text alongside their attributes -->
              <xsd:element name="source">
                <xsd:complexType>
                  <xsd:simpleContent>
                    <xsd:extension base="xsd:string">
                      <xsd:attribute name="id" type="xsd:unsignedLong" use="required"/>
                      <xsd:attribute name="attribute" type="xsd:string" use="optional"/>
                      <xsd:attribute name="lang" type="xsd:string" use="optional"/>
                      <xsd:attribute name="frequency" type="xsd:int" use="optional"/>
                      <xsd:attribute name="class" type="xsd:string" use="optional"/>
                    </xsd:extension>
                  </xsd:simpleContent>
                </xsd:complexType>
              </xsd:element>
              <xsd:element name="target">
                <xsd:complexType>
                  <xsd:simpleContent>
                    <xsd:extension base="xsd:string">
                      <xsd:attribute name="id" type="xsd:unsignedLong" use="required"/>
                      <xsd:attribute name="attribute" type="xsd:string" use="optional"/>
                      <xsd:attribute name="lang" type="xsd:string" use="optional"/>
                      <xsd:attribute name="frequency" type="xsd:int" use="optional"/>
                      <xsd:attribute name="class" type="xsd:string" use="optional"/>
                    </xsd:extension>
                  </xsd:simpleContent>
                </xsd:complexType>
              </xsd:element>
            </xsd:sequence>
            <xsd:attribute name="name" type="xsd:string" use="required"/>
            <xsd:attribute name="type" type="xsd:string" use="optional"/>
            <xsd:attribute name="frequency" type="xsd:int" use="optional"/>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
Example
<?xml version="1.0" encoding="utf-16"?>
<kb>
  <relation name="mod" frequency="2">
    <source id="410" attribute="entry" lang="UNL" frequency="20" class="nou">book(icl>document)</source>
    <target id="2243" lang="UNL" frequency="2" class="nou">republic(icl>form of government)</target>
  </relation>
  <relation name="mod" frequency="1">
    <source id="410" attribute="entry" lang="UNL" frequency="20" class="nou">book(icl>document)</source>
    <target id="466" lang="UNL" frequency="1" class="nou">certainty(icl>attribute)</target>
  </relation>
  <relation name="mod" frequency="1">
    <source id="410" attribute="entry" lang="UNL" frequency="20" class="nou">book(icl>document)</source>
    <target id="583" lang="UNL" frequency="2" class="nou">creation(icl>action)</target>
  </relation>
  <relation name="mod" frequency="1">
    <source id="410" attribute="entry" lang="UNL" frequency="20" class="nou">book(icl>document)</source>
    <target id="1539" lang="UNL" frequency="3" class="nou">lineage(icl>descendant)</target>
  </relation>
  <relation name="mod" frequency="1">
    <source id="410" attribute="entry" lang="UNL" frequency="20" class="nou">book(icl>document)</source>
    <target id="1566" lang="UNL" frequency="11" class="nou">love(icl>emotion)</target>
  </relation>
</kb>
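An extended-format file such as the one in the example can be read with any XML parser. The sketch below, in Python with the standard `xml.etree.ElementTree` module, loads a two-entry excerpt of the example and ranks the targets attached to a given source by relation frequency. The function name `targets_by_frequency` is hypothetical, and ranking by the `frequency` attribute is only one assumed way a tool might use the data. The XML declaration is left out of the inline string because it is already decoded text.

```python
import xml.etree.ElementTree as ET

# Two entries taken from the example, held as an already-decoded string.
KB_XML = """<kb>
  <relation name="mod" frequency="2">
    <source id="410" attribute="entry" lang="UNL" frequency="20" class="nou">book(icl>document)</source>
    <target id="2243" lang="UNL" frequency="2" class="nou">republic(icl>form of government)</target>
  </relation>
  <relation name="mod" frequency="1">
    <source id="410" attribute="entry" lang="UNL" frequency="20" class="nou">book(icl>document)</source>
    <target id="1566" lang="UNL" frequency="11" class="nou">love(icl>emotion)</target>
  </relation>
</kb>"""


def targets_by_frequency(kb_root, relation_name, source_uw):
    """Return (target UW, relation frequency) pairs, most frequent first.

    Hypothetical helper: a missing frequency attribute is treated as 0.
    """
    hits = []
    for rel in kb_root.iter("relation"):
        if rel.get("name") != relation_name:
            continue
        source = rel.find("source")
        target = rel.find("target")
        if source is None or target is None or source.text != source_uw:
            continue
        hits.append((target.text, int(rel.get("frequency", "0"))))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


root = ET.fromstring(KB_XML)
ranked = targets_by_frequency(root, "mod", "book(icl>document)")
print(ranked)
# [('republic(icl>form of government)', 2), ('love(icl>emotion)', 1)]
```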
Simplified format
UNL KB entries in simplified format must have the following structure:
RELATION(SOURCE;TARGET)=DC;
Where:
RELATION is the name of one existing UNL relation ("agt", "aoj", "obj", etc.);
SOURCE is the source node of the UNL relation;
TARGET is the target node of the UNL relation;
DC is the degree of certainty (i.e., the likelihood of the relation between the SOURCE and the TARGET).
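Entries in the simplified format can be split into their four parts with a regular expression. The following Python sketch is illustrative only: the names `KBEntry` and `parse_entry` are hypothetical, the sample entry and its DC value are invented for the demonstration, and the pattern assumes that the SOURCE contains no semicolon. Because the range and type of DC are not stated here, the value is kept as a raw string.

```python
import re
from typing import NamedTuple


class KBEntry(NamedTuple):
    relation: str
    source: str
    target: str
    dc: str  # kept as text: the numeric range of DC is not assumed


# RELATION(SOURCE;TARGET)=DC;
ENTRY_RE = re.compile(
    r"^\s*(?P<relation>\w+)"      # relation name, e.g. "mod"
    r"\((?P<source>[^;]+);"       # source node, up to the first semicolon
    r"(?P<target>.+)\)"           # target node, up to the closing parenthesis
    r"=(?P<dc>[^;]+);\s*$"        # degree of certainty, then the final semicolon
)


def parse_entry(line):
    """Parse one simplified-format entry; raise ValueError if it does not match."""
    match = ENTRY_RE.match(line)
    if match is None:
        raise ValueError(f"not a simplified UNL KB entry: {line!r}")
    return KBEntry(**match.groupdict())


# Hypothetical entry; the DC value is made up for illustration.
sample = "mod(book(icl>document);love(icl>emotion))=1;"
print(parse_entry(sample))
```

Nodes written with their own parentheses, as in `book(icl>document)`, are handled because the source stops at the first semicolon and the target stops at the parenthesis immediately before `=`.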