I UNL Olympiad

From UNL Wiki

Revision as of 15:38, 30 October 2012

The UNL Olympiad is a series of competitions organised by the UNDL Foundation in order to foster the development of UNL-driven resources (dictionaries, grammars and corpora). The first edition of the Olympiad is devoted to the development of grammars for the corpus UC-A1, comprising 100 sentences. The competition is open to any participant, and the deadline is January 30th, 2013.

Important dates

October 30th, 2012: Call for Participation
January 30th, 2013: Deadline for submitting the grammars

Languages

The I UNL Olympiad will be dedicated to the development of grammars for the following languages:

Assamese
Baatonum
Bengali
Bulgarian
Chinese
Croatian
Dutch
German
Gujarati
Hindi
Hungarian
Indonesian
Italian
Japanese
Kashmiri
Malayalam
Marathi
Oriya
Persian
Polish
Romanian
Russian
Sanskrit
Sindhi
Slovak
Swahili
Swedish
Tamil
Telugu
Thai
Turkish
Ukrainian

Prizes

Prizes are awarded to the best grammars of each modality (UNLization and NLization) for each language:

1st place: Gold Medal and USD500.00
2nd place: Silver Medal
3rd place: Bronze Medal

Additionally, the authors of the three best UNLization grammars and of the three best NLization grammars across all languages will be invited to participate in the next intermediate-level grammar workshop, to be held in Geneva, Switzerland, in May 2013.

Rules

The competition is free and open to any participant.
Candidates may participate in one or two modalities, i.e., they may work with the UNLization grammar, with the NLization grammar, or with both.
Candidates may participate in more than one language.

Registration

No prior registration is required. To register, send the following files to olympiad@undlfoundation.org by 23:59:59 (UTC) on January 30th, 2013.

For the participants working with the UNLization grammar (IAN):
- UCA1_<LID>.txt, with the human translation, to the target language, of the sentences of the Corpus UC-A1;
- <LID>_unl_dic.txt, with the natural language analysis dictionary used to UNLize the translated version of the Corpus UC-A1;
- <LID>_unl_tgrammar.txt, with the transformation grammar used to UNLize the translated version of the Corpus UC-A1;
- <LID>_unl_dgrammar.txt, with the disambiguation grammar used to UNLize the translated version of the Corpus UC-A1;
- <LID>_unl_output.txt, with the output provided by IAN.
For the participants working with the NLization grammar (EUGENE):
- UCA1_<LID>.txt, with the human translation, to the target language, of the sentences of the Corpus UC-A1;
- unl_<LID>_dic.txt, with the natural language generation dictionary used to NLize the UNL version of Corpus UC-A1;
- unl_<LID>_tgrammar.txt, with the transformation grammar used to NLize the UNL version of the Corpus UC-A1;
- unl_<LID>_dgrammar.txt, with the disambiguation grammar used to NLize the UNL version of the Corpus UC-A1;
- unl_<LID>_output.txt, with the output provided by EUGENE.

In these filenames, <LID> must be replaced by the three-letter language code defined in ISO 639-3 [1].
All files must be provided in UTF-8.
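
The naming scheme above can be checked mechanically before sending a submission. The following is a minimal sketch, not an official tool: the helper names `expected_files` and `is_utf8` and the modality keys `"unlization"` and `"nlization"` are hypothetical, and the language code is only checked for its shape (three lowercase letters, as in the "rus" example), not against the actual ISO 639-3 registry.

```python
import re

# Filename templates copied from the Registration list above.
TEMPLATES = {
    "unlization": [  # files for the UNLization grammar (IAN)
        "UCA1_{lid}.txt",
        "{lid}_unl_dic.txt",
        "{lid}_unl_tgrammar.txt",
        "{lid}_unl_dgrammar.txt",
        "{lid}_unl_output.txt",
    ],
    "nlization": [  # files for the NLization grammar (EUGENE)
        "UCA1_{lid}.txt",
        "unl_{lid}_dic.txt",
        "unl_{lid}_tgrammar.txt",
        "unl_{lid}_dgrammar.txt",
        "unl_{lid}_output.txt",
    ],
}


def expected_files(lid, modality):
    """Return the filenames a submission should contain (hypothetical helper)."""
    # Assumption: only the shape of the code is validated here,
    # not whether it is actually assigned in ISO 639-3.
    if not re.fullmatch(r"[a-z]{3}", lid):
        raise ValueError(f"not a three-letter code: {lid!r}")
    return [template.format(lid=lid) for template in TEMPLATES[modality]]


def is_utf8(data):
    """Return True if the raw bytes of a file decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

For example, `expected_files("rus", "unlization")` yields the five filenames expected for a Russian UNLization submission, and `is_utf8` can be applied to each file's contents read in binary mode.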

Requisites

The files must comply with the following requisites:

The corpus must comply with the translation standards of the target language and should not be translated artificially in order to obtain better results.
The dictionary files must comply with the Dictionary Specs and may only use features present in the Tagset. They should not contain temporary words.
The grammar files must comply with the Grammar Specs and must be as generic as possible. They should not target only the corpus.
The F-Measure of the grammars must be equal to or greater than 0.8.
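
As a rough illustration of the 0.8 threshold, the sketch below assumes the common balanced definition of the F-measure (the harmonic mean of precision and recall). That definition is an assumption made here for illustration; the F-Measure page of the UNL Wiki remains the authoritative description of how the Olympiad score is computed. The function names are hypothetical.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall.

    Assumption: this is the standard F1 formula, which may differ
    from the exact computation used by the Olympiad.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both values are zero
    return 2 * precision * recall / (precision + recall)


def meets_threshold(precision, recall, threshold=0.8):
    """Check a grammar's score against the minimum required F-measure."""
    return f_measure(precision, recall) >= threshold
```

Under this assumption, a grammar with high precision but low recall can still fall below the threshold, because the harmonic mean is pulled towards the lower of the two values.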

Evaluation

Grammars will be evaluated and ranked according to the following criteria:

Best F-Measure
Scalability, in case of grammars with the same F-Measure
Date of submission, in case of grammars with the same F-Measure and equally scalable
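
The three criteria above amount to a sort with two tie-breakers. The sketch below is only an illustration: the `Submission` record is hypothetical, scalability is represented as a numeric score (the page does not say how it is measured), and it is assumed that an earlier submission wins the final tie-break.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Submission:
    author: str
    f_measure: float
    scalability: float   # hypothetical numeric score; higher is better
    submitted: datetime  # assumption: the earlier submission wins a tie


def rank(submissions):
    """Order by F-measure, then scalability, then submission date."""
    return sorted(
        submissions,
        key=lambda s: (-s.f_measure, -s.scalability, s.submitted),
    )
```

Negating the first two fields sorts them in descending order while the date stays ascending, so all three criteria fit in a single sort key.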

Notes

[1] For instance, the files to be provided for Russian (code = "rus") must be named UCA1_rus.txt, rus_unl_dic.txt, rus_unl_tgrammar.txt, etc.

Instructions

The authors of the grammars with the best F-measures for each language will receive medals, the prize of USD500.00 and the right to participate in the
