XII UNL School
Revision as of 18:33, 4 July 2013
The UNDL Foundation invites applications for the XII UNL School, to take place in Geneva, Switzerland, from June 17th to 21st, 2013. This is an intermediate-level workshop dedicated to the improvement of grammatical resources already existing in the UNL framework. The UNDL Foundation will pay the travel and accommodation expenses for the selected candidates not living in Geneva.
IMPORTANT DATES
- 12/05/2013: Deadline for the applications
- 20/05/2013: Notification of accepted candidates
- 1-5/07/2013: XII UNL School
GOALS
- To compile the corpus NC-A1
- To prepare the basic modules for the UNLization of the corpus NC-A1
- To prepare the basic modules for the NLization of the corpus NC-A1
PROGRAM
- 1/07/2013: Normalization Grammar
- 2/07/2013: Closed-Class Dictionary
- 3/07/2013: Open-Class Word List
- 4/07/2013: Grammar NC-A1
- 5/07/2013: Evaluation and discussion
MATERIAL
- 1/07/2013
  - Presentation
  - Exercise #1 (text to be normalized)
  - Exercise #2 (normalization grammar for English)
- 2/07/2013
  - Presentation
  - Exercise #3 (English Closed-Class Dictionary)
- 3/07/2013
  - Presentation
  - Exercise #4 (Open-Class Word List)
- 4/07/2013
- 5/07/2013
PARTICIPANTS
- Kim Sokphyrum (Khmer)
- Marwa Saber (Arabic)
- Muhammad Zulhelmy Bin Mohd Rosman (Malay)
- Ofelia Hovhannisyan (Armenian)
- Parameswarappa S (Kannada)
- Parteek Kumar (Panjabi)
- Ronaldo Martins (UNL)
- Sameh Alansary (Arabic)
- Serhii Prots (Ukrainian)
- Suos Samak (Khmer)
- Teng Wei Min (Chinese)
- Yordanka Stancheva (Bulgarian)
VENUE
UNDL Foundation Office, Geneva
POST-WORKSHOP TASKS
Deadline = 30/09/2013
- Open-Class Word List (3,000 word forms)
- Corpus NC-A1
  - Original corpus: 5-10 original articles from Wikipedia about culture-specific subjects (minimum of 2,500 words), in separate files, in plain text format with UTF-8 encoding
  - List of noun phrases appearing in the corpus (length of the NP >= 2, and NPs must not contain verbs)
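The formal requirements on the corpus can be checked mechanically before submission. The following is a minimal sketch, not an official UNDL tool. It assumes that the articles are stored as `*.txt` files in one folder, that the 2,500-word minimum applies to the corpus as a whole, and that NP length is measured in words; the function names `check_corpus` and `too_short_nps` are hypothetical. Words are counted by splitting on whitespace, which will undercount for languages written without spaces between words (such as Chinese or Khmer), and the "no verbs" condition is not checked because it needs language-specific analysis.

```python
from pathlib import Path

MIN_FILES, MAX_FILES = 5, 10   # "5-10 original articles"
MIN_WORDS = 2500               # assumed to apply to the corpus as a whole


def check_corpus(folder):
    """Return a list of problems found in the corpus folder (empty if none)."""
    problems = []
    files = sorted(Path(folder).glob("*.txt"))
    if not MIN_FILES <= len(files) <= MAX_FILES:
        problems.append(
            f"expected {MIN_FILES}-{MAX_FILES} files, found {len(files)}"
        )
    total_words = 0
    for path in files:
        try:
            # Decode strictly so that non-UTF-8 files are reported.
            text = path.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            problems.append(f"{path.name}: not valid UTF-8")
            continue
        total_words += len(text.split())  # whitespace-delimited tokens only
    if total_words < MIN_WORDS:
        problems.append(f"only {total_words} words, need at least {MIN_WORDS}")
    return problems


def too_short_nps(noun_phrases):
    """Return the noun phrases with fewer than 2 whitespace-delimited words."""
    return [np for np in noun_phrases if len(np.split()) < 2]
```

Calling `check_corpus("corpus_folder")` returns an empty list when the folder passes all three checks; `too_short_nps(["book", "the book"])` returns `["book"]`.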
PROJECTS
The following projects will open once the post-workshop tasks have been completed:
- BRUNO-A1 (open only for languages with more than 15 subcategorization frames and, in the case of inflectional languages, more than 15 paradigms): 2,000 entries (around 4,000 UNLdots)
- NC-A1: 1,000 entries (3,000 UNLdots)
ADDITIONAL MATERIAL
Open Class Word List
Extract from the most frequent words in Wikipedia
Language | File |
---|---|
Arabic | ar_words.xls |
Armenian | hy_words.xls |
Bulgarian | bg_words.xls |
Chinese | zh_words.xls |
Kannada | kn_words.xls |
Khmer | km_words.xls |
Malay | ms_words.xls |
Punjabi | pa_words.xls |
Ukrainian | uk_words.xls |
SSS Examples
sentence | SSS |
---|---|
book | NH(book) |
the book | NS(book;the) |
beautiful book | NA(book;beautiful) |
book of John | NA(book;:01) PC:01(of;John) |
the book of John | NS(book;the) NA(book;:01) PC:01(of;John) |
the beautiful book of John | NS(book;the) NA(book;beautiful) NA(book;:01) PC:01(of;John) |
the book of Math of John | NS(book;the) NA(book;:01) PC:01(of;Math) NA(book;:02) PC:02(of;John) |
the book about the construction of Babel | NS(book;the) NA(book;:01) PC:01(about;:02) NS:02(construction;the) NA:02(construction;:03) PC:03(of;Babel) |
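The SSS strings in the table share one surface pattern: a relation name, an optional scope label such as `:01`, and one or two arguments separated by `;`, where an argument of the form `:01` points to a scope defined by another relation. The sketch below reads that pattern and checks that every referenced scope is defined. It is inferred only from the examples in the table, not from a formal specification of the notation, and the names `Relation`, `parse_relations` and `undefined_scopes` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Relation(NamedTuple):
    name: str              # e.g. "NS", "NA", "PC", "NH"
    scope: Optional[str]   # e.g. "01" in "PC:01(of;John)", else None
    args: tuple            # e.g. ("of", "John")


# Relation name, optional ":NN" scope label, then arguments in parentheses.
_REL = re.compile(r"([A-Za-z]+)(?::(\d+))?\(([^()]*)\)")


def parse_relations(text):
    """Split an SSS string into a list of Relation tuples."""
    return [
        Relation(
            m.group(1),
            m.group(2),
            tuple(arg.strip() for arg in m.group(3).split(";")),
        )
        for m in _REL.finditer(text)
    ]


def undefined_scopes(relations):
    """Return scope ids used as arguments but never defined by a relation."""
    defined = {r.scope for r in relations if r.scope}
    used = {arg[1:] for r in relations for arg in r.args if arg.startswith(":")}
    return used - defined


relations = parse_relations("NS(book;the) NA(book;:01) PC:01(of;John)")
for relation in relations:
    print(relation)
print(undefined_scopes(relations))
```

For the row "the book of John", the third parsed relation carries the scope `01` that the second relation refers to, so `undefined_scopes` returns an empty set.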
UNL Simplified Examples
sentence | UNL |
---|---|
book | book |
the book | book.@def |
beautiful book | mod(book;beautiful) |
book of John | pos(book;John) |
the book of John | pos(book.@def;John) |
the beautiful book of John | mod(book.@def;beautiful) pos(book.@def;John) |
the book of Math of John | cnt(book.@def;Math) pos(book.@def;John) |
the book about the construction of Babel | cnt(book.@def;:01) obj(construction.@def;Babel) |
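In the simplified UNL column, definiteness is no longer a relation of its own but an attribute attached to the node (`book.@def`), while the remaining relations keep the `name(source;target)` shape. The following sketch separates nodes from their attributes. It assumes that attributes are always appended with `.@`, as in the examples above, and the names `split_node` and `parse_unl` are hypothetical; it is not a full UNL parser.

```python
import re

# Binary relation: name, optional ":NN" scope label, two ";"-separated nodes.
_REL = re.compile(r"([a-z]+)(?::(\d+))?\(([^;()]*);([^;()]*)\)")


def split_node(node):
    """Split 'book.@def' into ('book', ('def',)); 'book' gives ('book', ())."""
    head, *attributes = node.split(".@")
    return head, tuple(attributes)


def parse_unl(text):
    """Return (relation, source_node, target_node) triples found in text."""
    triples = []
    for name, _scope, source, target in _REL.findall(text):
        triples.append((name, split_node(source), split_node(target)))
    return triples


for triple in parse_unl("mod(book.@def;beautiful) pos(book.@def;John)"):
    print(triple)
```

A bare node such as `book.@def` contains no relation, so `parse_unl` returns an empty list for it; `split_node` can be applied to such rows directly.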