This is one of the suggested topics for the course Language Technology Project 2005.
Readers often need to look up phrases, usually named entities. In the case of electronic documents, it would be useful if those phrases were linked to relevant online documents or relevant encyclopedic information. One example of automatically annotated web documents is the set of pages by Willem Robert van Hage. On these pages, however, every individual word receives a hyperlink, and the link targets are often irrelevant.
Develop and build a system that enriches documents (plain text or HTML) with hyperlinks to relevant information for named entity phrases. Compare the relevance of the linked online documents with that of the encyclopedic information. Integrate the system into preprocessors for text, HTML and e-mail.
The suggested encyclopedia for this project is the free online Wikipedia encyclopedia. The proposed target language is English.
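To make the task concrete, the following is a minimal sketch of the linking step for plain English text, not a proposed solution. It assumes a deliberately naive stand-in for named entity recognition (any run of two or more capitalized words) and assumes that the phrase itself can serve as the Wikipedia article title. The names CANDIDATE, wikipedia_url and link_entities are hypothetical and chosen only for illustration.

```python
import html
import re
from urllib.parse import quote

# Naive stand-in for named entity recognition (an assumption, not a real
# recognizer): a run of two or more capitalized words.
CANDIDATE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")


def wikipedia_url(phrase):
    """Build an English Wikipedia URL, assuming the phrase is the article title."""
    title = "_".join(phrase.split())
    return "https://en.wikipedia.org/wiki/" + quote(title)


def link_entities(text):
    """Return HTML in which every candidate phrase is wrapped in a hyperlink."""
    parts = []
    last = 0
    for match in CANDIDATE.finditer(text):
        # Escape the plain text between the previous match and this one.
        parts.append(html.escape(text[last:match.start()]))
        phrase = match.group(0)
        parts.append('<a href="%s">%s</a>'
                     % (wikipedia_url(phrase), html.escape(phrase)))
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)


if __name__ == "__main__":
    print(link_entities("She visited New York City last year."))
```

This sketch does not check that the target article exists or that it is relevant, it misses single-word entities, and it wrongly includes a capitalized sentence-initial word that directly precedes a name. Deciding which phrases deserve a link and which target is relevant is the actual subject of the project.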
The system will be evaluated on documents supplied by the teachers. Important questions will be: "Which phrases receive hyperlinks?", "How relevant are the suggested links?" and "Does the interface work well?"