RC-A1

From UNL Wiki

(Difference between revisions)

Revision as of 21:21, 23 July 2012

Corpus⁵⁰⁰ is an experimental corpus used to prepare the initial versions of the grammar for sentence-based UNLization and NLization, using IAN and EUGENE, respectively. It comprises 500 sentences in English and their corresponding graphs in UNL, and is intended to cover very basic linguistic phenomena.

The corpus⁵⁰⁰

Corpus 500 according to the complexity of the graphs

Corpus
Order	Description	Analysis (English original)	Generation (UNL)
0	Training Corpus (Corpus 50)	Corpus 50	Corpus 50
1	Temporary entries	temp_org.txt	temp_unl.txt
2	Entries with no attribute or relation	attribute0_org.txt	attribute0_unl.txt
3	One-attribute entries	attribute1_org.txt	attribute1_unl.txt
4	Two-attribute entries	attribute2_org.txt	attribute2_unl.txt
5	Three-attribute entries	attribute3_org.txt	attribute3_unl.txt
6	One-relation entries	relation1_org.txt	relation1_unl.txt
7	Two-relation entries	relation2_org.txt	relation2_unl.txt
8	Three-relation entries	relation3_org.txt	relation3_unl.txt
9	Four-relation entries	relation4_org.txt	relation4_unl.txt
10	Five-relation entries	relation5_org.txt	relation5_unl.txt
11	Six-relation entries	relation6_org.txt	relation6_unl.txt
12	Numbers and numerals	numbers_org.txt	numbers_unl.txt
13	Expressions of time	time_org.txt	time_unl.txt
14	Relative clauses	relatives_org.txt	relatives_unl.txt
15	Special issues	problems_org.txt	problems_unl.txt

The whole corpus in one single file
- Corpus500 in English: experimental corpus in English (500 sentences), to be manually translated into the target languages and then used as input for IAN
- Corpus500 in UNL: experimental corpus in UNL (500 graphs), to be used as input for EUGENE

@@ Line 2: / Line 2: @@
 == The corpus<sup>500</sup> ==
+*Corpus 500 according to the complexity of the graphs
-*The whole corpus in one single file
-**[http://www.unlweb.net/resources/geneva2012/corpus_eng.txt Corpus500 in English], experimental corpus in English (500 sentences), to be manually translated to the target languages, in order to be used as the input for [[IAN]]
-**[http://www.unlweb.net/resources/geneva2012/corpus_unl.txt Corpus500 in UNL], experimental corpus in UNL (500 graphs), to be used as the input for [[EUGENE]]
-*Corpus 500 according to the complexity of the graphs (the same as above, but split in different files)
 {| border="1" cellpadding="2" align=center
 |+Corpus
@@ Line 94: / Line 90: @@
 |[http://www.unlweb.net/resources/geneva2012/problems.txt problems_unl.txt]
 |}
+*The whole corpus in one single file
+**[http://www.unlweb.net/resources/geneva2012/corpus_eng.txt Corpus500 in English], experimental corpus in English (500 sentences), to be manually translated to the target languages, in order to be used as the input for [[IAN]]
+**[http://www.unlweb.net/resources/geneva2012/corpus_unl.txt Corpus500 in UNL], experimental corpus in UNL (500 graphs), to be used as the input for [[EUGENE]]
