F-measure

From UNL Wiki

Revision as of 23:01, 1 April 2013

In the UNL System, the F-measure (or F1-score) is the measure of a grammar's accuracy. It considers both the precision and the recall of the grammar to compute the score, according to the formula

F-measure = 2 x ( (precision x recall) / (precision + recall) )

In the above:

  • precision is the number of correct results divided by the number of all returned results
  • recall is the number of correct results divided by the number of results that should have been returned
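As a minimal sketch of the formula above, the following Python function computes precision, recall and the F-measure from three counts. The function and parameter names are illustrative and not part of the UNL System. Returning 0 when a ratio is undefined is also an assumption, since the definition above does not cover that case.

```python
def f_measure(correct: int, returned: int, expected: int) -> float:
    """Return the F-measure from raw counts.

    correct  -- number of correct results
    returned -- number of all returned results
    expected -- number of results that should have been returned
    """
    if returned == 0 or expected == 0:
        return 0.0  # assumption: the score is 0 when a ratio is undefined
    precision = correct / returned
    recall = correct / expected
    if precision + recall == 0:
        return 0.0  # assumption: no correct results gives a score of 0
    return 2 * (precision * recall) / (precision + recall)
```

For instance, with 8 correct results out of 10 returned and 16 expected, precision is 0.8 and recall is 0.5, so the F-measure is 2 x (0.4 / 1.3), roughly 0.615.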

A result is considered "RETURNED" in the following cases:

  • In UNLization, when the output is a graph (i.e., all the nodes are interlinked) made only of UW's (i.e., without natural language words)
  • In NLization, when the output is a list of natural language words (i.e., without any UW).
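The UNLization condition above can be sketched as a connectivity check over the output graph. This is only an illustration under stated assumptions: the graph is represented as a set of node labels plus a list of node pairs, relations are treated as undirected for the purpose of "interlinked", an empty output is treated as not returned, and `is_uw` is a hypothetical caller-supplied predicate that tells a UW from a natural language word. None of these names come from the UNL System.

```python
from typing import Callable, Iterable, Set, Tuple


def is_connected(nodes: Set[str], edges: Iterable[Tuple[str, str]]) -> bool:
    """True when all nodes are interlinked (edges taken as undirected).

    Every edge endpoint is expected to be a member of `nodes`.
    """
    if not nodes:
        return False  # assumption: an empty output is not a returned result
    neighbours = {node: set() for node in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Depth-first traversal from an arbitrary node.
    seen, stack = set(), [next(iter(nodes))]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(neighbours[node] - seen)
    return seen == nodes


def is_returned_unlization(nodes: Set[str],
                           edges: Iterable[Tuple[str, str]],
                           is_uw: Callable[[str], bool]) -> bool:
    """Returned = a single connected graph whose nodes are all UW's."""
    return all(is_uw(node) for node in nodes) and is_connected(nodes, edges)
```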

A result is considered "CORRECT" in the following cases:

  • In UNLization, when
    • The discrepancy of relations between the actual and the expected output is less than 0.3; AND
    • The discrepancy of UW's between the actual and the expected output is less than 0.3; AND
    • The overall discrepancy is less than 0.5, WHERE
      • Discrepancy of relations is calculated by the formula:
        • (exceding_relations + missing_relations)/total_relations
      • Discrepancy of UW's is calculated by the formula:
        • (exceding_UW + missing_UW)/total_UW
      • Overall discrepancy is calculated by the formula:
        • ((3*(exceding_relations+missing_relations))+(2*(exceding_UW+missing_UW))+(exceding_attribute+missing_attribute))/((3*total_relations)+(2*total_UW)+(total_attribute))
    • WHERE
      • exceding_relations is the number of relations present in the actual output but absent from the expected output
      • missing_relations is the number of relations absent from the actual output but present in the expected output
      • total_relations is the sum of the total number of relations in the actual output and in the expected output
      • exceding_UW is the number of UW's[1] present in the actual output but absent from the expected output
      • missing_UW is the number of UW's[1] absent from the actual output but present in the expected output
      • total_UW is the sum of the total number of UW's[1] in the actual output and in the expected output
      • exceding_attribute is the number of attributes[2] present in the actual output but absent from the expected output
      • missing_attribute is the number of attributes[2] absent from the actual output but present in the expected output
      • total_attribute is the sum of the total number of attributes[2] in the actual output and in the expected output
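The three discrepancy formulas and their thresholds can be sketched in Python as follows. This is an illustration under several assumptions, not the UNL System's implementation: each output is a dict with the hypothetical keys 'relations', 'uws' and 'attributes', items are compared as multisets, a comparison with a zero total counts as no discrepancy, and the caller has already applied the normalisation described in the footnotes (scopes and optional attributes ignored). The variable names keep the spelling used in the formulas above.

```python
from collections import Counter
from typing import Hashable, Iterable, Tuple


def counts(actual: Iterable[Hashable],
           expected: Iterable[Hashable]) -> Tuple[int, int, int]:
    """Return (exceding, missing, total) for one kind of item."""
    a, e = Counter(actual), Counter(expected)
    exceding = sum((a - e).values())  # present in actual, absent from expected
    missing = sum((e - a).values())   # absent from actual, present in expected
    total = sum(a.values()) + sum(e.values())
    return exceding, missing, total


def ratio(numerator: float, denominator: float) -> float:
    # Assumption: a zero denominator (nothing to compare) means no discrepancy.
    return numerator / denominator if denominator else 0.0


def is_correct_unlization(actual: dict, expected: dict) -> bool:
    """Apply the three thresholds (0.3, 0.3, 0.5) to two outputs."""
    ex_r, mi_r, to_r = counts(actual["relations"], expected["relations"])
    ex_u, mi_u, to_u = counts(actual["uws"], expected["uws"])
    ex_a, mi_a, to_a = counts(actual["attributes"], expected["attributes"])

    relations = ratio(ex_r + mi_r, to_r)
    uws = ratio(ex_u + mi_u, to_u)
    # Relations weigh 3, UW's weigh 2, attributes weigh 1.
    overall = ratio(3 * (ex_r + mi_r) + 2 * (ex_u + mi_u) + (ex_a + mi_a),
                    3 * to_r + 2 * to_u + to_a)
    return relations < 0.3 and uws < 0.3 and overall < 0.5
```

Because each total adds up the items of both outputs, replacing one relation out of two gives a relation discrepancy of (1 + 1) / 4 = 0.5, which already exceeds the 0.3 threshold.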

In NLization, a result is considered "correct" when the Levenshtein distance between the actual result and the expected result is less than 30% of the length of the expected result. The Levenshtein distance is defined as the minimal number of characters that must be replaced, inserted or deleted to transform one string (the actual output) into another (the expected output).
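The Levenshtein criterion can be sketched with the standard dynamic-programming algorithm. The function names are illustrative, and comparing raw character strings without any normalisation (case, whitespace) is an assumption.

```python
def levenshtein(source: str, target: str) -> int:
    """Minimal number of character replacements, insertions and deletions
    needed to turn `source` into `target`."""
    # previous[j] holds the distance between the processed prefix of
    # `source` and the first j characters of `target`.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(previous[j] + 1,          # delete s_char
                               current[j - 1] + 1,       # insert t_char
                               previous[j - 1] + cost))  # replace or keep
        previous = current
    return previous[-1]


def is_correct_string(actual: str, expected: str) -> bool:
    """True when the distance is below 30% of the expected length."""
    return levenshtein(actual, expected) < 0.3 * len(expected)
```

For example, "the cat eat fish" is one insertion away from "the cat eats fish" (17 characters), and 1 is below 0.3 x 17 = 5.1, so the result counts as correct.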

References

  1. For the sake of comparison, a source UW is considered to be different from the target UW. Scopes are ignored.
  2. For the sake of comparison, a source attribute is considered to be different from the target attribute. Optional attributes (such as .@def and .@indef) are ignored.