F-measure
From UNL Wiki
Latest revision as of 01:23, 18 February 2014
In the UNL System, the F-measure (or F1-score) is the measure of a grammar's accuracy. It considers both the precision and the recall of the grammar to compute the score, according to the formula:
F-measure = 2 × ((precision × recall) / (precision + recall))
In the above:
- precision is the number of correct results divided by the number of all returned results
- recall is the number of correct results divided by the number of results that should have been returned
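As an illustration, the formula can be sketched in Python from the three counts it depends on. The function name `f_measure` and its parameter names are chosen here for illustration only, and returning 0 when precision or recall is undefined is an assumption, not something stated on this page.

```python
def f_measure(correct: int, returned: int, expected: int) -> float:
    """Compute the F-measure from raw counts.

    correct  -- number of correct results
    returned -- number of all returned results
    expected -- number of results that should have been returned
    """
    if returned == 0 or expected == 0:
        # Assumption: the score is 0 when precision or recall is undefined.
        return 0.0
    precision = correct / returned
    recall = correct / expected
    if precision + recall == 0:
        # No correct result at all: avoid dividing by zero.
        return 0.0
    return 2 * (precision * recall) / (precision + recall)


# 8 correct results out of 10 returned, where 16 should have been returned:
# precision = 0.8, recall = 0.5, F-measure is roughly 0.615
print(f_measure(8, 10, 16))
```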
UNLization
- RETURNED
A result is considered "RETURNED" when:
- the output is a graph (i.e., all the nodes are interlinked); AND
- the output is made only of UW's (i.e., without natural language words)
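The first condition (all nodes interlinked) amounts to a connectivity test. The sketch below assumes a hypothetical encoding in which the output is a list of `(label, source, target)` triples; it does not parse real UNL, ignores scopes, and treats an output with no relations as not being a graph, which is an assumption. The second condition (only UW's) needs a dictionary lookup and is not covered here.

```python
from collections import defaultdict


def is_connected(relations):
    """Return True if all nodes are interlinked, ignoring relation direction.

    relations -- iterable of (label, source, target) triples
                 (a hypothetical encoding, not the UNL file format)
    """
    neighbours = defaultdict(set)
    for _label, source, target in relations:
        neighbours[source].add(target)
        neighbours[target].add(source)
    if not neighbours:
        # Assumption: an output without any relation is not a graph.
        return False
    # Depth-first traversal from an arbitrary node.
    start = next(iter(neighbours))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for other in neighbours[node] - seen:
            seen.add(other)
            stack.append(other)
    # Connected only if the traversal reached every node.
    return len(seen) == len(neighbours)


print(is_connected([("agt", "eat", "John"), ("obj", "eat", "apple")]))
print(is_connected([("agt", "eat", "John"), ("obj", "read", "book")]))
```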
- CORRECT
A result is considered "CORRECT" when:
- The discrepancy of relations between the actual and the expected output is less than 0.3; AND
- The discrepancy of UW's between the actual and the expected output is less than 0.3; AND
- The overall discrepancy is less than 0.3.
WHERE
- Discrepancy of relations is calculated by the formula:
(exceeding_relations + missing_relations)/total_relations
- Discrepancy of UW's is calculated by the formula:
(exceeding_UW + missing_UW)/total_UW
- Overall discrepancy is calculated by the formula:
((3*(exceeding_relations+missing_relations))+(2*(exceeding_UW+missing_UW))+(exceeding_attribute+missing_attribute))/((3*total_relations)+(2*total_UW)+(total_attribute))
- exceeding_relations is the number of relations present in the actual output but absent from the expected output
- missing_relations is the number of relations absent from the actual output but present in the expected output
- total_relations is the sum of the total number of relations in the actual output and in the expected output
- exceeding_UW is the number of UW's[1] present in the actual output but absent from the expected output
- missing_UW is the number of UW's[1] absent from the actual output but present in the expected output
- total_UW is the sum of the total number of UW's[1] in the actual output and in the expected output
- exceeding_attribute is the number of attributes[2] present in the actual output but absent from the expected output
- missing_attribute is the number of attributes[2] absent from the actual output but present in the expected output
- total_attribute is the sum of the total number of attributes[2] in the actual output and in the expected output
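The three discrepancies and the CORRECT test can be sketched as follows, reading the overall formula as a weighted sum with weights 3 (relations), 2 (UW's) and 1 (attributes). The helper names, the dictionary keys and the representation of items as plain strings are hypothetical; counting repeated items as a multiset and treating an empty denominator as zero discrepancy are assumptions.

```python
from collections import Counter

THRESHOLD = 0.3  # a result is correct when every discrepancy is below this


def counts(actual, expected):
    """Return (exceeding, missing, total) for two lists of items."""
    a, e = Counter(actual), Counter(expected)
    exceeding = sum((a - e).values())  # in actual, absent from expected
    missing = sum((e - a).values())    # in expected, absent from actual
    total = sum(a.values()) + sum(e.values())
    return exceeding, missing, total


def ratio(numerator, denominator):
    # Assumption: nothing on either side means no discrepancy.
    return numerator / denominator if denominator else 0.0


def is_correct(actual, expected):
    """actual/expected: dicts with 'relations', 'uws', 'attributes' lists."""
    er, mr, tr = counts(actual["relations"], expected["relations"])
    eu, mu, tu = counts(actual["uws"], expected["uws"])
    ea, ma, ta = counts(actual["attributes"], expected["attributes"])
    relations = ratio(er + mr, tr)
    uws = ratio(eu + mu, tu)
    overall = ratio(3 * (er + mr) + 2 * (eu + mu) + (ea + ma),
                    3 * tr + 2 * tu + ta)
    return relations < THRESHOLD and uws < THRESHOLD and overall < THRESHOLD
```

Because each total is the sum over both outputs, one exceeding and one missing item out of three on each side already gives 2/6, which is above the 0.3 threshold.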
NLization
- RETURNED
A result is considered "RETURNED" when the output is a list of natural language words (i.e., without any UW).
- CORRECT
A result is considered "CORRECT" when the difference between the actual result and the expected result is less than 0.3.
- The difference between the actual and the expected result is calculated by the formula:
(exceeding_words+missing_words)/(total_words)
WHERE
- exceeding_words is the number of words present in the actual output but absent from the expected output
- missing_words is the number of words absent from the actual output but present in the expected output
- total_words is the sum of the total number of words in the actual output and in the expected output
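A minimal sketch of this calculation is given below. Splitting on whitespace, comparing words case-sensitively and counting repeated words as a multiset are assumptions; the page does not say how words are tokenized. The function name `nl_difference` is illustrative.

```python
from collections import Counter


def nl_difference(actual: str, expected: str) -> float:
    """(exceeding_words + missing_words) / total_words for two sentences."""
    a, e = Counter(actual.split()), Counter(expected.split())
    exceeding = sum((a - e).values())  # words only in the actual output
    missing = sum((e - a).values())    # words only in the expected output
    total = sum(a.values()) + sum(e.values())
    # Assumption: two empty outputs have no difference.
    return (exceeding + missing) / total if total else 0.0


# One exceeding word, 4 + 3 = 7 words in total: 1/7, below the 0.3 threshold
difference = nl_difference("the cat sat down", "the cat sat")
print(difference, difference < 0.3)
```

As written, the formula only counts presence and absence, so this sketch does not penalize differences in word order.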
How to calculate the F-Measure
The F-measure may be automatically calculated at UNLWEB>UNLARIUM>TOOLS>F-MEASURE.
In order to calculate the F-Measure, you have to provide the following:
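- For UNLization
  - The actual UNL output, automatically generated by IAN with your grammars and dictionaries, at the trace level NONE[3]
  - The expected UNL output for the same set of sentences
- For NLization
  - The actual NL output, automatically generated by EUGENE with your grammars and dictionaries, at the trace level MINIMAL[3]
  - The expected NL output for the same set of sentences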
References
- [1] For the sake of comparison, UW's appearing in the source position are considered to be different from UW's appearing in the target position of a relation. Scopes are ignored.
- [2] For the sake of comparison, attributes appearing in the source position are considered to be different from attributes appearing in the target position of a relation. Optional attributes (such as .@def and .@indef) are ignored.
- [3] Select the option RANGE, with all sentences, in order to generate the output for all sentences at once.