Named Entity Linking

This is one of the suggested topics for the course Language Technology Project 2005.

Introduction

Readers often need to look up phrases, usually named entities. In the case of electronic documents, it would be useful if those phrases were linked to relevant online documents or relevant encyclopedic information. One example of automatically annotated web documents is the set of pages of Willem Robert van Hage. However, on these pages every individual word receives a hyperlink, and the information the links point to is often irrelevant.

Task

Develop and build a system that enriches documents (plain text or HTML) with hyperlinks to relevant information for named entity phrases. Compare the relevance of general online documents as link targets with that of encyclopedic information. Integrate the system into preprocessors for text, HTML and e-mail.

The suggested encyclopedia for this project is the free online Wikipedia encyclopedia. The proposed target language is English.
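To make the task concrete, here is a minimal sketch of the plain-text case. It is not a proposed solution: the capitalized-word heuristic, the names CANDIDATE, WIKI_BASE, wikipedia_url and link_entities, and the skip list are all assumptions made for illustration, and the URL pattern simply follows the usual form of English Wikipedia article addresses. A real system would need proper named entity recognition and a check that the target article exists and is relevant.

```python
import html
import re
from urllib.parse import quote

# Assumed heuristic: a run of capitalized words is a candidate named entity.
# This misses names such as "van Hage" and "McDonald" and over-generates
# on sentence-initial words; it only illustrates the shape of the task.
CANDIDATE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")

# Assumed URL pattern for English Wikipedia articles.
WIKI_BASE = "https://en.wikipedia.org/wiki/"


def wikipedia_url(phrase):
    """Build a candidate article URL: spaces become underscores."""
    title = "_".join(phrase.split())
    return WIKI_BASE + quote(title)


def link_entities(text, skip=frozenset({"The", "A", "An", "In", "It", "This"})):
    """Return an HTML fragment in which candidate phrases are hyperlinked.

    `skip` is a hypothetical stop list for common capitalized words.
    """
    parts = []
    last = 0
    for match in CANDIDATE.finditer(text):
        phrase = match.group(0)
        # Copy the text between the previous match and this one, escaped.
        parts.append(html.escape(text[last:match.start()]))
        if phrase in skip:
            parts.append(html.escape(phrase))
        else:
            parts.append('<a href="%s">%s</a>'
                         % (wikipedia_url(phrase), html.escape(phrase)))
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)


if __name__ == "__main__":
    print(link_entities("the city of Amsterdam is in the Netherlands."))
```

Unlike the word-by-word linking described in the introduction, this sketch links only candidate phrases, but it still cannot judge whether a target is relevant; that judgment is the core of the project.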

Modules

The system will be evaluated with documents which will be supplied by the teachers. Important questions will be: "which phrases receive hyperlinks?", "how relevant are the suggested links?" and "does the interface work well?".

Literature and tools


Last update: January 04, 2004, erikt@science.uva.nl