Information extraction:
The identification and extraction of instances of a particular
class of events or relationships in a natural language text and
their transformation into a structured representation (e.g. a
database). (after Grishman 1997, Eikvil 1999)
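A toy illustration of this definition, not a method taken from the cited sources: the sketch below assumes a single, hypothetical relationship class ("person was appointed role of organisation") and a hand-written regular expression for it. The pattern, the function name extract_appointments and the sample sentences are all invented for the example. Each instance found in the text becomes one structured record, which could then be stored as a database row.

```python
import re

# Hypothetical pattern for one relationship class:
# "<person> was appointed <role> of <organisation>".
APPOINTMENT = re.compile(
    r"(?P<person>[A-Z][a-z]+(?: [A-Z][a-z]+)*) was appointed "
    r"(?P<role>[a-z ]+?) of (?P<organisation>[A-Z]\w*(?: [A-Z]\w*)*)"
)


def extract_appointments(text):
    """Return one dict (a database-like record) per matched instance."""
    return [m.groupdict() for m in APPOINTMENT.finditer(text)]


# Invented sample text, for illustration only.
text = (
    "Yesterday the board met. Jane Doe was appointed chief executive of Acme Corp. "
    "Later, John Smith was appointed chairman of Example Bank."
)

records = extract_appointments(text)
for record in records:
    print(record)
```

Real information extraction systems use far richer linguistic analysis than one regular expression; the point here is only the mapping from free text to records.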
Wrapper Induction:
Automatic generation of wrappers from a few (annotated) sample pages
Assumptions:
Regularity in the presentation of information, often in machine-generated answers to queries:
same header
same tail
in between, a table/list of items that constitute the answer to the query
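Under these assumptions, wrapper induction can be sketched roughly as follows. This is a minimal sketch, not a specific published algorithm: it assumes one field per item, that every annotated item string can be located left to right in its page, and that the longest shared context around the items is a good delimiter. The names Wrapper, induce_wrapper and apply_wrapper and the sample pages are invented for illustration (Python 3.9+ for removesuffix/removeprefix).

```python
import os
from typing import List, NamedTuple, Tuple


class Wrapper(NamedTuple):
    head: str   # text shared by every page before the item list
    left: str   # text immediately before each item
    right: str  # text immediately after each item
    tail: str   # text shared by every page after the item list


def common_prefix(strings: List[str]) -> str:
    return os.path.commonprefix(strings)  # character-wise common prefix


def common_suffix(strings: List[str]) -> str:
    return os.path.commonprefix([s[::-1] for s in strings])[::-1]


def induce_wrapper(samples: List[Tuple[str, List[str]]]) -> Wrapper:
    """Learn delimiters from (page, annotated items in page order) pairs."""
    befores, afters, first_befores, last_afters = [], [], [], []
    for page, items in samples:
        pos = 0
        for i, item in enumerate(items):
            start = page.index(item, pos)  # assumes the annotation is findable
            pos = start + len(item)
            befores.append(page[:start])
            afters.append(page[pos:])
            if i == 0:
                first_befores.append(page[:start])
            if i == len(items) - 1:
                last_afters.append(page[pos:])
    left = common_suffix(befores)
    right = common_prefix(afters)
    if not left or not right:
        raise ValueError("no consistent item delimiters in the samples")
    # Keep the item delimiters out of head/tail so the scan below can find them.
    head = common_prefix(first_befores).removesuffix(left)
    tail = common_suffix(last_afters).removeprefix(right)
    return Wrapper(head, left, right, tail)


def apply_wrapper(w: Wrapper, page: str) -> List[str]:
    """Extract the items lying between the learned header and tail."""
    if not (page.startswith(w.head) and page.endswith(w.tail)):
        raise ValueError("page does not match the learned header/tail")
    pos, end = len(w.head), len(page) - len(w.tail)
    items = []
    while True:
        start = page.find(w.left, pos, end)
        if start == -1:
            break
        start += len(w.left)
        stop = page.find(w.right, start, end)
        if stop == -1:
            break
        items.append(page[start:stop])
        pos = stop  # delimiters may overlap, so resume at the right delimiter
    return items


# Two invented, annotated sample pages with the same header and tail.
samples = [
    ("<h1>Results</h1>\n<li>Alice</li>\n<li>Bob</li>\n<p>End</p>", ["Alice", "Bob"]),
    ("<h1>Results</h1>\n<li>Carol</li>\n<p>End</p>", ["Carol"]),
]
wrapper = induce_wrapper(samples)

new_page = "<h1>Results</h1>\n<li>Dave</li>\n<li>Eve</li>\n<li>Frank</li>\n<p>End</p>"
print(apply_wrapper(wrapper, new_page))  # ['Dave', 'Eve', 'Frank']
```

Taking the longest common context keeps the delimiters as specific as the samples allow, which guards against false matches but can overfit when only a few pages are annotated. More sample pages shorten the delimiters to what is really constant.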