Issues

From UNL Wiki

Latest revision as of 16:55, 3 February 2014

List of pending features and known bugs.

IAN and EUGENE (VERSION 1.1)

Back-end

Parsing
The parsing of rules needs to be improved. IAN and EUGENE were accepting rules with unbalanced parentheses. There is also a problem with extra commas in rules. The syntactic check of the engines should be stricter. EUGENE and IAN must detect the following syntactic error:
(%a,A,B,C):=((%a,+E); (This rule is being accepted by the system)
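
The kind of check requested above can be sketched as follows. This is a minimal illustration in Python, not the engines' actual parser: the function name `check_rule_syntax` and the error messages are assumptions, and the only things checked are balanced parentheses and empty items caused by stray commas (text inside double-quoted strings is skipped).

```python
def check_rule_syntax(rule: str) -> list[str]:
    """Return a list of syntax problems found in a rule (empty if none).

    Hypothetical helper: it only checks parentheses and stray commas.
    """
    errors = []
    depth = 0
    in_string = False
    previous = ""  # last significant character seen outside strings
    for position, char in enumerate(rule):
        if char == '"':
            in_string = not in_string
            previous = char
            continue
        if in_string or char.isspace():
            continue
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                errors.append(f"unmatched ')' at position {position}")
                depth = 0
        if char == "," and previous in {",", "("}:
            errors.append(f"empty item before ',' at position {position}")
        elif char == ")" and previous == ",":
            errors.append(f"empty item after ',' at position {position}")
        previous = char
    if depth > 0:
        errors.append(f"{depth} unclosed '('")
    if in_string:
        errors.append("unterminated string")
    return errors


# The rule quoted above has one parenthesis that is never closed.
print(check_rule_syntax("(%a,A,B,C):=((%a,+E);"))
# An illustrative rule with a doubled comma.
print(check_rule_syntax("(%a,A,,B,C):=(%a,+E);"))
```

A rule would be accepted only when the returned list is empty.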

Encoding
EUGENE and IAN should reject invalid UTF-8 encoding. From the user's perspective, the rule was perfect and the string was displayed clearly and correctly, but the machine was replacing it with an empty string.
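
A strict decoding step is one way to reject such input instead of silently emptying it. The sketch below uses Python's standard `bytes.decode`; the function name `decode_rule_bytes` and the error message are assumptions, not part of IAN or EUGENE.

```python
def decode_rule_bytes(raw: bytes) -> str:
    """Decode uploaded rule bytes, refusing anything that is not valid UTF-8."""
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        # Report the problem to the user instead of replacing the string.
        raise ValueError(f"invalid UTF-8 at byte {exc.start}: rule rejected") from exc


# Valid UTF-8 is returned unchanged.
print(decode_rule_bytes("(%x,é)".encode("utf-8")))

# The same text saved as Latin-1 is not valid UTF-8 and is rejected.
try:
    decode_rule_bytes("(%x,é)".encode("latin-1"))
except ValueError as error:
    print(error)
```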
Consistency of graphs
Rules that lead to impossible graphs are being accepted and applied. The example below generates an impossible graph.
(NB(N,%n;JB(%j;%j2),{and|or},%adjc),%m):= (JB(%j;%j2),rel=%adjc) (NB(N,%n;%j),rel=%m)(NB(N,%n;%j2),rel=%m);
This rule puts the same node %j in two different positions in the node list. This should not be possible: a node cannot be inside two different nodes in a list structure.
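
A consistency check of this kind could look roughly like the Python sketch below. It is a simplification under stated assumptions: the right-hand side is split into its top-level parenthesized nodes, indexes used as attribute values (such as rel=%m) are ignored, and every index found inside more than one top-level node is reported. Whether the engines should treat all such cases as errors is an assumption; the function names are hypothetical.

```python
import re
from collections import defaultdict


def split_top_level_nodes(side: str) -> list[str]:
    """Split one side of a rule into its top-level parenthesized nodes."""
    nodes, depth, start = [], 0, None
    for i, char in enumerate(side):
        if char == "(":
            if depth == 0:
                start = i
            depth += 1
        elif char == ")":
            depth -= 1
            if depth == 0 and start is not None:
                nodes.append(side[start:i + 1])
                start = None
    return nodes


def nodes_in_several_places(rhs: str) -> dict[str, list[int]]:
    """Map each index to the top-level nodes it occurs in, if more than one."""
    seen = defaultdict(list)
    for position, node in enumerate(split_top_level_nodes(rhs)):
        # "(?<!=)" skips indexes that are attribute values, e.g. rel=%m.
        for index in set(re.findall(r"(?<!=)%\w+", node)):
            seen[index].append(position)
    return {index: places for index, places in seen.items() if len(places) > 1}


rhs = "(JB(%j;%j2),rel=%adjc) (NB(N,%n;%j),rel=%m)(NB(N,%n;%j2),rel=%m)"
conflicts = nodes_in_several_places(rhs)
print(conflicts)  # %j is reported in top-level nodes 0 and 1
```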
Preprocessing module
A module for preprocessing is needed in IAN. It will serve for sentence segmentation and morphological preprocessing. Rules of the preprocessing module will be only of the LL type, will only deal with strings and will apply before any dictionary search. They will be used to assign STAIL and SHEAD. Regular expressions should be admitted. The unit of processing will be the paragraph (i.e., any string between \n and \r). Examples of possible rules:
(" .",%x):=(%x)(+STAIL,%y);
(".",%x)(/[ABCDEFGHIJKLMNOPQRSTUVWXYZ]/,%y):=(%z,+SHEAD)(%x)(%y);
("an ",%x)(/[aeiouy]/,%y):=("a ",%x)(%y);

Observations:
+STAIL automatically creates SHEAD (in addition to STAIL itself), and +SHEAD automatically creates STAIL.
The preprocessing module should be provided in a separate tab (S-Rules, for segmentation rules).
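
As a rough illustration of string-only preprocessing before dictionary lookup, the Python sketch below splits a text into paragraphs at line breaks and then splits each paragraph where a period is followed by an uppercase ASCII letter. It covers segmentation only; the function names, the optional whitespace after the period, and the use of Python regular expressions (version 3.7 or later, for splitting on empty matches) are assumptions rather than the S-Rules formalism itself.

```python
import re


def split_paragraphs(text: str) -> list[str]:
    """Split raw text into paragraphs, the assumed unit of processing."""
    return [p.strip() for p in re.split(r"[\r\n]+", text) if p.strip()]


def segment_sentences(paragraph: str) -> list[str]:
    """Split a paragraph after a period that precedes an uppercase letter.

    Each returned item would start at a SHEAD and end at a STAIL.
    """
    parts = re.split(r"(?<=\.)\s*(?=[A-Z])", paragraph)
    return [part for part in parts if part]


for paragraph in split_paragraphs("It rains.We stay. They go.\r\nNew paragraph."):
    print(segment_sentences(paragraph))
```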

Mathematical operations (574)
Mathematical operations inside nodes

(%x):=(%x-1); (i.e., subtract 1 from %x)
(%x):=(%x+1); (i.e., add 1 to %x)
(%x):=(%x*2); (i.e., multiply %x by 2)
(%x):=(%x/2); (i.e., divide %x by 2)
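
The four operations can be evaluated with a very small interpreter. The Python sketch below is only an illustration: the function name `evaluate`, the dictionary of node values, and the choice of true (non-integer) division are assumptions, since the source does not say how values are stored or rounded.

```python
import operator
import re

OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,  # assumption: division is not rounded
}
EXPRESSION = re.compile(r"(%\w+)([-+*/])(\d+)")


def evaluate(expression: str, values: dict[str, float]) -> float:
    """Evaluate a right-hand side such as "(%x-1)" against known node values."""
    match = EXPRESSION.fullmatch(expression.strip().strip("()"))
    if match is None:
        raise ValueError(f"unsupported expression: {expression!r}")
    index, symbol, operand = match.groups()
    return OPERATIONS[symbol](values[index], int(operand))


values = {"%x": 5}
print(evaluate("(%x-1)", values))
print(evaluate("(%x*2)", values))
```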

Indexation of relations (postponed)

Relations should admit an index, as nodes do. This would avoid ambiguity when dealing with relations in different scopes:

XB:%a(%x;%y)XB:%b(%x;%z):=XB:%a(XB(%x;%y);%z); the relation XB(%x;%y) will be created as a scope inside %a

XB:%a(%x;%y)XB:%b(%x;%z):=XB:%b(XB(%x;%y);%z); the relation XB(%x;%y) will be created as a scope inside %b

In any case, the indexation should comply with a possible graph structure.

Discontinuous multiword expressions (706)
Headwords, UWs and strings used as values of attributes:

(%x,ATTRIBUTE=[%y])(%y):=ACTION; the system checks whether the value of the attribute ATTRIBUTE is the HEADWORD of %y
(%x,ATTRIBUTE=[[%y]])(%y):=ACTION; the system checks whether the value of the attribute ATTRIBUTE is the UW of %y
(%x,ATTRIBUTE="%y")(%y):=ACTION; the system checks whether the value of the attribute ATTRIBUTE is the STRING of %y
CONDITION:=(%x,ATTRIBUTE=[%y])(%y): the system assigns the attribute ATTRIBUTE to %x with the value of the HEADWORD of %y
CONDITION:=(%x,ATTRIBUTE=[[%y]])(%y): the system assigns the attribute ATTRIBUTE to %x with the value of the UW of %y
CONDITION:=(%x,ATTRIBUTE="%y")(%y): the system assigns the attribute ATTRIBUTE to %x with the value of the STRING of %y
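
The six cases above reduce to two operations, checking and assigning, each over three possible fields of %y. The Python sketch below illustrates that reading; the `Node` class, its field names, the function names and the sample values (including the UW strings) are all hypothetical.

```python
from dataclasses import dataclass, field

FIELDS = ("headword", "uw", "string")  # [%y], [[%y]] and "%y" respectively


@dataclass
class Node:
    string: str
    headword: str
    uw: str
    attributes: dict[str, str] = field(default_factory=dict)


def attribute_matches(x: Node, attribute: str, y: Node, kind: str) -> bool:
    """Condition side: is x's attribute equal to the given field of y?"""
    if kind not in FIELDS:
        raise ValueError(f"unknown field: {kind}")
    return x.attributes.get(attribute) == getattr(y, kind)


def assign_attribute(x: Node, attribute: str, y: Node, kind: str) -> None:
    """Action side: copy the given field of y into x's attribute."""
    if kind not in FIELDS:
        raise ValueError(f"unknown field: {kind}")
    x.attributes[attribute] = getattr(y, kind)


# Made-up example data for a discontinuous expression such as "give ... up".
verb = Node("gave", "give", "give(icl>do)", {"PARTICLE": "up"})
particle = Node("up", "up", "up(icl>how)")
print(attribute_matches(verb, "PARTICLE", particle, "headword"))
```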

Rules with discontinuous nodes

(%x)(ANY SEQUENCE OF NODES, %z)(%y):=(%x)(%y)(%z);
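
The reordering expressed by this rule can be sketched on a plain list. In the Python illustration below, nodes are simple strings and the two predicates stand in for the conditions on %x and %y; both the function name and this representation are assumptions.

```python
from typing import Callable, Sequence


def join_discontinuous(
    nodes: Sequence[str],
    is_x: Callable[[str], bool],
    is_y: Callable[[str], bool],
) -> list[str]:
    """Move the first %y found after %x so that it directly follows %x.

    At least one node must stand between them (the sequence %z).
    """
    for i, node in enumerate(nodes):
        if not is_x(node):
            continue
        for j in range(i + 2, len(nodes)):
            if is_y(nodes[j]):
                return (
                    list(nodes[: i + 1])
                    + [nodes[j]]
                    + list(nodes[i + 1 : j])
                    + list(nodes[j + 1 :])
                )
    return list(nodes)  # no match: the list is left unchanged


sentence = ["gave", "the", "plan", "up", "."]
print(join_discontinuous(sentence, lambda n: n == "gave", lambda n: n == "up"))
```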

Front-end

Drag-and-drop
To include the possibility of using "drag-and-drop" to reorder dictionaries and dictionary entries, and grammars and grammar rules (in addition to the current method).

Test sets
The test sets need to be improved. They should show only the differences, and the results should be exportable and importable.
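
Showing only the differences amounts to comparing each stored reference output with the current output and keeping the mismatches. The Python sketch below illustrates this together with a JSON round trip for export and import; the function names, the dictionary layout and the choice of JSON are assumptions.

```python
import json


def differing_results(expected: dict[str, str], actual: dict[str, str]) -> dict:
    """Keep only the sentences whose current output differs from the reference."""
    differences = {}
    for sentence, reference in expected.items():
        output = actual.get(sentence)
        if output != reference:
            differences[sentence] = {"expected": reference, "actual": output}
    return differences


def export_results(results: dict) -> str:
    """Serialize results so that they can be saved and re-imported later."""
    return json.dumps(results, ensure_ascii=False, indent=2)


def import_results(payload: str) -> dict:
    return json.loads(payload)


expected = {"sentence 1": "output A", "sentence 2": "output B"}
actual = {"sentence 1": "output A", "sentence 2": "output C"}
print(export_results(differing_results(expected, actual)))
```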
Trace
The trace must be thoroughly revised. The desired structure is presented at http://www.unlweb.net/forum/viewtopic.php?t=575

Groups
Groups should be collapsible/expandable, and a single file may participate in several groups (grouping must be done using tags, instead of exclusive categories)

Shared resources
It must be possible to reorder shared resources (currently, they cannot be reordered).
NL and UNL documents
NL inputs should be shareable (currently, it is only possible to send them, and later changes are not propagated). They should also work like dictionaries and grammars (users should have the option of grouping them and loading more than one at a time).
IAN/EUGENE communication
An output of IAN should be usable as the input for EUGENE, and vice versa, using the loaded resources.

Update
Dictionary and grammar update should replace the current files instead of adding the resources to the end of the existing files
Range
The trace level of the "range" option should be defined by the user. It is fine to use NONE as the default, but the user should also be able to get more detailed results for more than one sentence.

IAN and EUGENE (VERSION 1.2)

1. To include the option UNDO for the deletion of files and entries
2. Selecting a file should be the same as loading it (changed to: indicating clearly that a file has been loaded)
3. ~~The range interval should be also user-defined. For the time being, it's only possible to select the interval from the drop-down list.~~
4. Users should have the possibility of uploading more than one file at once in a single .zip file
5. Users should have the possibility of visualizing the output of IAN as a graph
6. Backtracking (top-down approach)

IAN and EUGENE 2.0

1. SDK
2. Stand-alone version of IAN and EUGENE

LILY (VERSION 1.1)

1. Localization of the interface should be done through uploading a localization file (directly by admin).
2. Include LILY in the UNLdev. The user should have the option of seeing the results of LILY for his/her own data.
3. Alternative translations. The user should have the option of selecting other possible results according to the grammar.
4. Mobile (app) version.

KEYS (VERSION 1.0)

1. Graphic output (as fancy as possible and with support for touch screen).
2. Localizable interface.
3. Another design for the interface (cleaner and simpler).
4. Mobile (app) version.
5. Integration with EUGENE.

UNL Tool Kit (VERSION BETA)

1. Corpus processing: given a set of documents, the system should clean it (of HTML tags, for instance), segment it (according to a user-defined set of symbols), tokenize it (according to the dictionary), extract the word list (with frequency of occurrence), lemmatize it (according to the dictionary), POS tag it (according to the dictionary) and extract the POS patterns (with the frequency of occurrence). The system should also include search facilities (concordance).
2. Dictionary builder: given a word list, the system should lemmatize it (according to the dictionary) and POS tag it.
3. Grammar builder: given a set of POS tagged sentences, the system should build the corresponding trees in order to form a tree-bank (by hand, i.e., through a user-friendly tree-builder interface, or automatically, using a grammar provided according to the Grammar Specs). The tree-bank will be used to induce a grammar (reverse engineering).
4. Graph builder: given a set of trees, the system should build the corresponding graphs in order to form a graph-bank (by hand, i.e., through a user-friendly graph-builder interface, or automatically, using a grammar provided according to the Grammar Specs). The graph-bank will be used to induce a grammar (reverse engineering).
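
The first steps of the corpus processing item (cleaning, segmenting, tokenizing and extracting the word list with frequencies) can be sketched in a few lines of Python. This is only an illustration: tokenization here is a plain regular expression rather than a dictionary lookup, and all function names and the default delimiter set are assumptions.

```python
import re
from collections import Counter


def clean(document: str) -> str:
    """Remove HTML tags, replacing each with a space."""
    return re.sub(r"<[^>]+>", " ", document)


def segment(text: str, delimiters: str = ".!?") -> list[str]:
    """Split text at any of the user-defined delimiter symbols."""
    pattern = "[" + re.escape(delimiters) + "]+"
    return [part.strip() for part in re.split(pattern, text) if part.strip()]


def tokenize(sentence: str) -> list[str]:
    """Assumption: tokens are lowercase word characters, not dictionary entries."""
    return re.findall(r"\w+", sentence.lower())


def word_list(documents: list[str], delimiters: str = ".!?") -> Counter:
    """Count how often each token occurs across all documents."""
    counts = Counter()
    for document in documents:
        for sentence in segment(clean(document), delimiters):
            counts.update(tokenize(sentence))
    return counts


print(word_list(["<p>The cat sat. The dog sat!</p>"]).most_common())
```

Lemmatization, POS tagging and pattern extraction would follow the same shape, with the dictionary supplying the lemma and tag for each token.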
