Workshop on High-level Information Extraction

Information extraction (IE) techniques aim to extract structured information from unstructured data sources. IE methods are successful at addressing naturally arising learning tasks where the data is generally structured and highly correlated, and frequently preserves multi-way dependencies within and between recurrent structures.
By now, "low-level" tasks such as named entity recognition are well understood; however, solving complex IE tasks, such as relation and event extraction, remains a challenge.
In recent years, significant contributions to high-level IE in relevant fields have led to applications that have matured beyond the proof-of-concept stage. However, which strategy (e.g., pipeline, structured, or hybrid) is beneficial for which problems is not yet well understood, from either a theoretical or a practical point of view.
We aim to bring together an interdisciplinary group of researchers who are working on high-level information extraction. The goal of this workshop is to structure and explore the state of the art, to evolve high-level IE models with regard to real-world applications, and to identify future challenges and applications. We intend to cover a broad range of methods, including pipelined/hybrid approaches and structured prediction models; in particular, we are interested in the following topics:

  • Algorithms:
    What are the differences between pipelined and structured methods? Are there hybrid methods that combine the best of both worlds? Are there novel algorithms and techniques for solving high-level IE or subproblems thereof?

  • Theoretical results:
    Are there convergence/generalization bounds for high-level IE techniques? Is there a characterization of problems for which a direct solution always exists? How can high-level IE methods be evaluated?

  • Pre- and post-processing techniques:
    Which high-level IE applications benefit from pre-/post-processing? Can pre-/post-processing be harmful? Are these techniques independent of the underlying IE methods? How can pre- and post-processing techniques be evaluated?

  • Applications:
    What are novel applications involving high-level IE? Are there equivalent problems in related areas that can be solved with existing methods?