FRIDA
From UNL Wiki
Revision as of 14:55, 27 March 2014
The project FRIDA (Français, Rumantsch, Italiano and Deutsch for Analysis) is devoted to the creation of NL-UNL (analysis) dictionaries for the official languages of Switzerland.
Goal
The project FRIDA has two main goals:
- To provide several word-to-concept monolingual databases (i.e., encoding or reader's dictionaries). These dictionaries are expected to be used in UNLization, i.e., in generating UNL graphs out of natural language documents, especially through IAN.
- To identify concepts that are not included in WordNet 3.0 and should be incorporated into the UNL Dictionary.
Repository
For each language, FRIDA contains 30,000 lemmas in total, divided into six repositories according to how frequently the lemmas are used.
- FRIDA-A1 contains the list of the 5,000 most frequent lemmas of the language (including articles, prepositions, conjunctions, auxiliary verbs, etc.);
- FRIDA-A2 contains the list of the next 5,000 most frequent lemmas of the language (including articles, prepositions, conjunctions, auxiliary verbs, etc.);
And so on, up to FRIDA-C2, according to the table below.
Repository | # of lemmas |
---|---|
FRIDA-A1 | 5,000 |
FRIDA-A2 | 5,000 |
FRIDA-B1 | 5,000 |
FRIDA-B2 | 5,000 |
FRIDA-C1 | 5,000 |
FRIDA-C2 | 5,000 |
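
The banding in the table above can be sketched as a simple lookup from frequency rank to repository. This is only an illustration, not part of the FRIDA tooling: the function name `repository_for_rank` is hypothetical, and the sketch assumes that ranks are 1-based and that the six bands are contiguous, each holding exactly 5,000 lemmas.

```python
# Repository names in order of decreasing lemma frequency, as listed in the table.
REPOSITORIES = ["FRIDA-A1", "FRIDA-A2", "FRIDA-B1", "FRIDA-B2", "FRIDA-C1", "FRIDA-C2"]

# Each repository holds 5,000 lemmas (assumption: bands are contiguous).
BAND_SIZE = 5_000


def repository_for_rank(rank: int) -> str:
    """Return the repository for a lemma with the given 1-based frequency rank.

    Hypothetical helper: rank 1 is the most frequent lemma of the language.
    """
    total = BAND_SIZE * len(REPOSITORIES)  # 30,000 lemmas per language
    if not 1 <= rank <= total:
        raise ValueError(f"rank must be between 1 and {total}, got {rank}")
    # Integer division maps ranks 1..5000 to index 0, 5001..10000 to index 1, etc.
    return REPOSITORIES[(rank - 1) // BAND_SIZE]


if __name__ == "__main__":
    for rank in (1, 5_000, 5_001, 30_000):
        print(rank, repository_for_rank(rank))
```

Under these assumptions, a lemma ranked 5,000 still belongs to FRIDA-A1, while rank 5,001 is the first lemma of FRIDA-A2.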
Participants
- Anne-Sophie Gloor (French, Université de Genève)
- Anton Maria Prati (Italian, Università Ca' Foscari Venezia)
- Frank Brockmeier (German, Université de Lausanne)
- Katrin Renkwitz (German, Rheinische Friedrich-Wilhelms-Universität Bonn)
- Myriam Hilout (French, Université Paris XIII)
- Monica Gallo (Italian, Université de Genève)
Instructions
- Lexical Category
- Whenever the lexical category for a given lemma is provided, check whether it is correct. If it is not correct, decline the entry and report the problem by clicking the yellow triangle to the right of the main entry. If the lexical category is not provided, select the most likely category. Do not worry about homonyms: provide a single category for each main entry.
- Lemma
- Do not change the lemma. If it is not correct (i.e., if it is misspelled or cannot be considered a lexical unit), decline the entry and report the problem by clicking the yellow triangle to the right of the main entry.
- Provide as many UWs as necessary for each lemma, but do not include very rare or unusual cases. Check the order: the most likely senses must appear first.
- Base Form
- The Base Form must be the same as the lemma.
- Inflection
- Select AND TEST the inflectional paradigm that generates the inflections of the base form. Any errors here will be propagated to the dictionary, so be careful. If there is no inflection paradigm to generate the desired inflected forms.
- Subcategorization
- Subcategorization is only required when the word REQUIRES a complement or a specifier (for instance, indirect transitive verbs that select a specific preposition). In this case, you must provide the corresponding subcategorization frame. If the subcategorization frame is not available, select