martins Site Admin
Joined: 16 Dec 2009 Posts: 1481 Location: Geneva, Switzerland
Posted: Thu Apr 11, 2013 12:47 am Post subject: UNLarium update
Dear All,
This is to inform you that today we finished a major update to the
UNLarium. In addition to several bug fixes and some new features, the
following has been changed:
*Users may now have problems in at most 10% of their entries. Users who
exceed this limit will be blocked until they correct the problems
detected. The previous limit was 20%.
*The UNL>NL (Generation) Dictionary will now accept a maximum of 5
(five) lemmas per UW. Some users have been introducing an exaggerated
number of natural language candidates for the same concept, which
causes many problems in natural language generation. The full
vocabulary of a language is expected to be addressed in the NL>UNL
(Analysis) Dictionary (such as in the project BRUNO), and not in the
Generation Dictionary (project MIR, for instance).
*Multiword expressions will now be systematically checked for
frequency. We have been receiving many multiword expressions
that are not really lexical units, in the sense that they cannot be
found as entries or sub-entries in monolingual dictionaries. If a
concept (UW) is not lexicalized in your language, and may only be
referred to by an expression that has not been included in ordinary
dictionaries yet, the problem must be reported. One of the goals of
the Generation Dictionary is exactly to investigate the extent to
which concepts are lexicalized. In order to decide whether a multiword
expression is a lexical unit, observe the frequency (it should have at
least 100,000 occurrences in Google), the invariance (lexical units
are normally frozen) and the compositionality (lexical units are
normally not compositional, i.e., their meaning is different from the
sum of their parts). For further information, please refer to
www.unlweb.net/wiki/LRU.
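For contributors who want to pre-check their own work before submitting, the three rules above can be sketched as follows. This is only an illustration under stated assumptions: the function and field names are hypothetical and are not part of the UNLarium, the way the UNLarium actually counts problem entries is not described here, and invariance and compositionality remain judgments made by the contributor (the criteria say "normally", so they are treated as heuristics, not hard rules).

```python
from dataclasses import dataclass

# Thresholds taken from the announcement above.
MAX_PROBLEM_PERCENT = 10       # at most 10% of entries may have problems
MAX_LEMMAS_PER_UW = 5          # Generation Dictionary: at most 5 lemmas per UW
MIN_OCCURRENCES = 100_000      # minimum number of occurrences in Google


def is_blocked(entries_with_problems: int, total_entries: int) -> bool:
    """Return True if the share of problem entries exceeds the limit.

    Integer arithmetic avoids floating-point surprises at exactly 10%.
    """
    if total_entries == 0:
        return False
    return entries_with_problems * 100 > total_entries * MAX_PROBLEM_PERCENT


def accepts_new_lemma(existing_lemmas: int) -> bool:
    """Return True if another lemma may still be added to this UW."""
    return existing_lemmas < MAX_LEMMAS_PER_UW


@dataclass
class MultiwordCandidate:
    """Hypothetical record; the three judgments are supplied by the user."""
    expression: str
    occurrences: int        # frequency observed by the contributor
    is_frozen: bool         # invariance: the expression does not vary
    is_compositional: bool  # meaning equals the sum of its parts


def looks_like_lexical_unit(candidate: MultiwordCandidate) -> bool:
    """Heuristic only: frequent, frozen, and not compositional."""
    return (
        candidate.occurrences >= MIN_OCCURRENCES
        and candidate.is_frozen
        and not candidate.is_compositional
    )
```

A candidate that fails this check is not necessarily wrong, but it is a good sign that the entry should be declined or postponed and the problem reported.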
We remind you that the best procedure is always to decline entries in
case of doubt, or to postpone them and ask for advice. We would also ask
users working with NL Dictionaries to pay attention to the structure
of lemmas and to their mappings to UWs. Do not propagate errors
by validating lemmas that are not lexical units or that contain wrong
mappings, because these errors will also affect the reviewers' score.
Best regards,
--
---------------------------------------------
Ronaldo MARTINS
Language Resources Manager
UNDL Foundation
48, route de Chancy
CH-1213 - Geneva - Switzerland
+41 22 879 8090
www.undlfoundation.org
---------------------------------------------
This is a post-only mailing from the UNLweb. Please do not reply to
this message. If you have any questions or comments, please contact us
via email at info@unlweb.net. If, for any reason, you do not want to
receive messages from the UNLweb, please deactivate your account at
www.unlweb.net.