ASP Competition
Grammar-Based Information Extraction

Problem description

Recognizing and extracting meaningful information from unstructured Web documents is an important problem in information and knowledge management.
In the recent literature a number of approaches for Information Extraction from unstructured documents have been proposed.
Among them, the HiLeX system [Ruffolo et al., JELIA'06], based on ASP, is also receiving considerable attention in industry, and its industrial exploitation is the purpose of a joint venture between an Italian company and a US company from Chicago.

Problem specification

We are given the following context-free grammar, which specifies arithmetic equations:

Given an equation (a sequence of terminals), the problem is to determine whether it belongs to the language defined by this grammar (starting from <equation>) and, if so, whether the value associated with the axiom is "true", proving that the equation is true as well.

Input format

An instance of this problem contains a fact (of arity two) for each character of the equation. The first argument is a constant from the set {p, m, g, o, c, 0, 1, ..., 9} representing (with a positional mapping) a terminal of the alphabet {'+', '-', '>', '(', ')', '0', '1', ..., '9'}, and the second one is its position (starting from 1). The instance "9-(3+(4-2)+1)>-1" from above (of length 16) would be represented by the following atoms:
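The positional mapping described in the input format can be sketched in Python as follows. This is only an illustration, not part of the problem statement: the function names are made up for this example, and the predicate name `char` is a hypothetical placeholder, since the predicate name of the input facts is not stated here.

```python
# Positional mapping from the problem statement: terminal -> constant.
TERMINAL_TO_CONSTANT = {"+": "p", "-": "m", ">": "g", "(": "o", ")": "c"}
TERMINAL_TO_CONSTANT.update({d: d for d in "0123456789"})


def encode_equation(equation):
    """Return (constant, position) pairs, with positions starting from 1."""
    pairs = []
    for position, terminal in enumerate(equation, start=1):
        if terminal not in TERMINAL_TO_CONSTANT:
            raise ValueError(f"not a terminal of the alphabet: {terminal!r}")
        pairs.append((TERMINAL_TO_CONSTANT[terminal], position))
    return pairs


def to_facts(pairs, predicate="char"):
    # ASSUMPTION: "char" is a hypothetical predicate name chosen for this
    # sketch; only the arity (two) and the argument order are given.
    return [f"{predicate}({constant},{position})." for constant, position in pairs]


pairs = encode_equation("9-(3+(4-2)+1)>-1")
print(len(pairs))   # one pair per character of the equation
print(pairs[:3])    # the first three (constant, position) pairs
```

Keeping the pairs separate from the fact rendering means only `to_facts` needs to change once the actual predicate name is known.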
Output format

If the string described by the input does not belong to the language (is not a well-formed expression), or if it is a well-formed but false expression, then the instance should be unsatisfiable. If the string is well-formed and true, the witness should consist of:

sol. values(S1,N1,S2,N2).

Here the values of the two expressions are represented by (S1,N1) and (S2,N2), where Si is a sign (p or m, standing for '+' and '-') and Ni is a non-negative integer in the set {0,1,2,...}. The number zero is represented as (p,0). Of course, the number represented by (S1,N1) should be larger than the one represented by (S2,N2). For instance, in the above example, as the value of "9-(3+(4-2)+1)" is 3 and the value of "-1" is -1, the answer set has to contain the atoms:

sol. values(p,3,m,1).

Authors: Marco Manna
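As a rough illustration of the output format, the sign and magnitude encoding can be sketched in Python. The helper names `encode_value` and `witness` are hypothetical, and the sketch assumes the integer values of the two sides of the equation have already been computed by some other means; it does not parse the grammar.

```python
def encode_value(n):
    """Map an integer to a (sign, magnitude) pair; zero is encoded as (p, 0)."""
    return ("p", n) if n >= 0 else ("m", -n)


def witness(left, right):
    """Return the witness atoms, or None when the equation left > right is false."""
    if not left > right:
        return None  # a false equation means the instance should be unsatisfiable
    s1, n1 = encode_value(left)
    s2, n2 = encode_value(right)
    return f"sol. values({s1},{n1},{s2},{n2})."


# The example from the text: 9-(3+(4-2)+1) evaluates to 3, and the right side is -1.
print(witness(3, -1))
```

For the example equation this prints `sol. values(p,3,m,1).`, matching the atoms given in the output format.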