NLization

From UNL Wiki
Latest revision as of 20:38, 21 September 2012

NLization, formerly known as deconversion, is the process of generating natural language structures corresponding to UNL graphs.

Units

The process of NLization may have different generation units, as follows:

  • Word-driven NLization
  • Sentence-driven NLization
  • Text-driven NLization

Paradigms

The process of NLization may follow several different paradigms, as follows:

  • Language-based NLization (based mainly on a UNL-NL dictionary and a UNL-NL grammar)
  • Knowledge-based NLization (based mainly on the UNL Knowledge Base)
  • Memory-based NLization (based mainly on the UNL-NL Memory)
  • Statistical-based NLization (based mainly on statistical predictions derived from UNL-NL corpora)
  • Dialogue-based NLization (based mainly on interaction with the user)

In practice, NLization is normally hybrid and may combine several of the paradigms above.
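
A hybrid setup can be pictured as a chain of strategies tried in order. The following is only a minimal sketch under stated assumptions: the names MEMORY, memory_based, language_based and hybrid_nlize are hypothetical, the memory is a plain lookup table, and the "grammar" is a toy subject-verb-object template, not the behaviour of any actual UNL tool.

```python
import re
from typing import Callable, Dict, List, Optional

# A strategy takes a UNL graph (as text) and returns a sentence, or None if it cannot help.
Strategy = Callable[[str], Optional[str]]

# Matches one binary relation such as "agt(killed,Peter)".
RELATION_RE = re.compile(r"(\w+)\(([^,()]+),([^,()]+)\)")

# Hypothetical memory: graphs that were NLized before, stored with their output.
MEMORY: Dict[str, str] = {
    "agt(killed,Peter)obj(killed,Mary)": "Peter killed Mary.",
}


def memory_based(graph: str) -> Optional[str]:
    """Return a stored sentence if this exact graph has been seen before."""
    return MEMORY.get(graph)


def language_based(graph: str) -> Optional[str]:
    """Toy stand-in for dictionary plus grammar: agent, verb, object."""
    roles = {label: (source, target) for label, source, target in RELATION_RE.findall(graph)}
    if "agt" in roles and "obj" in roles:
        verb = roles["agt"][0]
        return f"{roles['agt'][1]} {verb} {roles['obj'][1]}."
    return None


def hybrid_nlize(graph: str, strategies: List[Strategy]) -> Optional[str]:
    """Try each strategy in order and return the first result found."""
    for strategy in strategies:
        result = strategy(graph)
        if result is not None:
            return result
    return None


chain = [memory_based, language_based]
print(hybrid_nlize("agt(killed,Peter)obj(killed,Mary)", chain))  # found in memory
print(hybrid_nlize("agt(saw,John)obj(saw,Ann)", chain))          # falls back to the template
```

Ordering the chain from the cheapest and most specific strategy to the most general one is a design choice of this sketch; a real system might instead score and merge the outputs of several paradigms.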

Recall

The process of NLization may target the whole source document or only parts of it (e.g. main clauses):

  • Full NLization (the whole source document is NLized)
  • Partial (or chunk) NLization (only a part of the source document is NLized)

agt(killed,Peter)obj(killed,Mary)ins(killed,knife)tim(killed,yesterday)
Full NLization: Peter killed Mary with a knife yesterday.
Partial NLization: Peter killed Mary.
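
The difference between full and partial NLization can be illustrated by selecting which relations of the graph are passed on to generation. This is a rough sketch, assuming that a graph is given as a string of binary relations like the one above; Relation, parse_relations, select_relations and the choice of agt and obj as the "core" relations are hypothetical and only serve the illustration.

```python
import re
from typing import List, NamedTuple


class Relation(NamedTuple):
    label: str    # relation name, e.g. "agt"
    source: str   # first argument, e.g. "killed"
    target: str   # second argument, e.g. "Peter"


# Matches one binary relation such as "agt(killed,Peter)".
RELATION_RE = re.compile(r"(\w+)\(([^,()]+),([^,()]+)\)")

# Assumption of this sketch: agent and object form the core of the clause.
CORE_LABELS = {"agt", "obj"}


def parse_relations(graph: str) -> List[Relation]:
    """Split a string of concatenated relations into Relation tuples."""
    return [Relation(*match.groups()) for match in RELATION_RE.finditer(graph)]


def select_relations(relations: List[Relation], partial: bool = False) -> List[Relation]:
    """Keep every relation (full) or only the core ones (partial)."""
    if not partial:
        return list(relations)
    return [r for r in relations if r.label in CORE_LABELS]


graph = "agt(killed,Peter)obj(killed,Mary)ins(killed,knife)tim(killed,yesterday)"
relations = parse_relations(graph)
full = select_relations(relations)
chunk = select_relations(relations, partial=True)

print([r.label for r in full])   # ['agt', 'obj', 'ins', 'tim']
print([r.label for r in chunk])  # ['agt', 'obj']
```

Only the selection step is shown; turning the selected relations into a sentence is left to whichever paradigm is in use.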

Precision

The process of NLization may target the deep syntactic structure of the source graph (i.e., the resulting syntactic structure replicates the semantic structure of the original) or only its surface structure (the resulting syntactic structure does not preserve the semantic structure of the original).

  • Deep NLization (the NLization focuses on the deep syntactic structure of the source graph)
  • Shallow NLization (the NLization focuses on the surface syntactic structure of the source graph)

Syntactic structures are preserved in the UNL document by means of syntactic attributes (such as @passive, @topic, etc.) or hyper-nodes (i.e., scopes).

agt(killed.@passive,Peter)obj(killed.@passive,Mary)
Shallow NLization: Peter killed Mary.
Deep NLization: Mary was killed by Peter.
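
The contrast in the example above can be sketched as a generator that either honours or ignores the @passive attribute. This is a toy illustration, not a description of how any UNL tool works: parse and nlize are hypothetical names, the templates cover only an agent and an object, and the surface form of the verb is reused as the English participle (which happens to work for "killed").

```python
import re

# Matches one binary relation such as "agt(killed.@passive,Peter)".
RELATION_RE = re.compile(r"(\w+)\(([^,()]+),([^,()]+)\)")


def parse(graph):
    """Return (verb, attributes, roles) for a single-clause graph."""
    verb, attrs, roles = None, set(), {}
    for label, source, target in RELATION_RE.findall(graph):
        # "killed.@passive" -> head "killed", attributes ["passive"]
        head, *attr_list = source.split(".@")
        verb = head
        attrs.update(attr_list)
        roles[label] = target
    return verb, attrs, roles


def nlize(graph, deep):
    """Generate a toy English sentence; only deep mode honours @passive."""
    verb, attrs, roles = parse(graph)
    agent, obj = roles["agt"], roles["obj"]
    if deep and "passive" in attrs:
        # Assumption: the verb form doubles as the past participle.
        return f"{obj} was {verb} by {agent}."
    return f"{agent} {verb} {obj}."


graph = "agt(killed.@passive,Peter)obj(killed.@passive,Mary)"
print(nlize(graph, deep=False))  # Peter killed Mary.
print(nlize(graph, deep=True))   # Mary was killed by Peter.
```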

Level

The process of NLization may target literal meanings (locutionary content) or non-literal meanings (illocutionary content).

  • Locutionary (the NLization represents only the literal meaning)
  • Illocutionary (the NLization also represents non-literal meanings, including speech acts)

The illocutionary force may be represented by figure-of-speech and speech-act attributes:

agt(pass.@request,you)gol(pass.@request,me)obj(pass.@request,salt)
Locutionary level: Pass me the salt.
Illocutionary level: Can you pass me the salt? / Would you pass me the salt?
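
One way to picture the two levels is a generator that either ignores the speech-act attribute or uses it to choose a different sentence form. The sketch below is purely illustrative: parse and nlize are hypothetical names, the definite article and the "Would ... ?" pattern are hard-coded assumptions, and only the @request attribute is handled.

```python
import re

# Matches one binary relation such as "gol(pass.@request,me)".
RELATION_RE = re.compile(r"(\w+)\(([^,()]+),([^,()]+)\)")


def parse(graph):
    """Return (verb, attributes, roles) for a single-clause graph."""
    verb, attrs, roles = None, set(), {}
    for label, source, target in RELATION_RE.findall(graph):
        head, *attr_list = source.split(".@")
        verb = head
        attrs.update(attr_list)
        roles[label] = target
    return verb, attrs, roles


def nlize(graph, illocutionary):
    """Generate an imperative, or a polite question when @request is honoured."""
    verb, attrs, roles = parse(graph)
    # Toy surface form: verb, goal, definite object (the article is an assumption).
    core = f"{verb} {roles['gol']} the {roles['obj']}"
    if illocutionary and "request" in attrs:
        return f"Would {roles['agt']} {core}?"
    return core.capitalize() + "."


graph = "agt(pass.@request,you)gol(pass.@request,me)obj(pass.@request,salt)"
print(nlize(graph, illocutionary=False))  # Pass me the salt.
print(nlize(graph, illocutionary=True))   # Would you pass me the salt?
```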

Methods

Humans and machines may play different roles in NLization methods:

  • Fully automatic NLization (the whole process is carried out by the machine, without any intervention of the human user)
  • Human-aided machine NLization (the process is carried out mainly by the machine, with some intervention of the human user, either as a pre-editor or as a post-editor, or during the NLization itself, as in dialogue-based NLization)
  • Machine-aided human NLization (the process is carried out mainly by the human user, with some help from the machine, as in dictionary or memory look-up)
  • Fully human NLization (the whole process is carried out by the human user, without any intervention of the machine)

Tools

For the time being, there are two NLization tools, as described below:

Tool    Unit      Paradigms       Recall  Precision  Level  Method  Licence    Author
EUGENE  sentence  LB,KB,EB,MB,DB  F,P     D,S        L,I    FA,HA   freeware   UNDLF
DeCo    sentence  LB,KB           F       D          L,I    FA      shareware  UNLC