NP Chunking

Dividing sentences into non-overlapping phrases is called text chunking. NP chunking deals with a part of this task: it involves recognizing the chunks that consist of noun phrases (NPs). A standard data set for this task was put forward by Lance Ramshaw and Mitch Marcus in their 1995 WVLC paper [RM95]. The data has been divided into two parts: training data and test data. The goal is to train a machine learning algorithm on the training data and to evaluate its performance on the test data.

The performance of the algorithm is measured with two scores: precision and recall. Precision is the percentage of NPs found by the algorithm that are correct, and recall is the percentage of NPs defined in the corpus that were found by the chunking program. The two rates can be combined in one measure: the F rate, in which F = 2*precision*recall / (recall+precision) [Rij79].
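The three measures can be sketched in a few lines of Python. This is a minimal illustration of the definitions above, not the evaluation software used in the cited work; the function name and the example counts are invented for illustration.

```python
def evaluate(found_correct, found_total, corpus_total):
    """Compute precision, recall and the F rate from chunk counts.

    found_correct: NPs found by the chunker that match the corpus
    found_total:   all NPs found by the chunker
    corpus_total:  all NPs defined in the corpus
    """
    precision = found_correct / found_total   # fraction of found NPs that are correct
    recall = found_correct / corpus_total     # fraction of corpus NPs that were found
    f_rate = 2 * precision * recall / (recall + precision)
    return precision, recall, f_rate

# Example: 90 correct chunks among 100 found, with 110 NPs in the corpus.
p, r, f = evaluate(90, 100, 110)
print(f"precision={p:.2%} recall={r:.2%} F={f:.4f}")
```

Note that the F rate is the harmonic mean of precision and recall, so it rewards systems that keep the two scores in balance.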

The standard data set put forward by Ramshaw and Marcus consists of sections 15-18 of the Wall Street Journal corpus as training material and section 20 of that corpus as test material. Here are the published results for this data set:

              +-----------+-----------++-----------++
              | precision |   recall  ||     F     ||
   +----------+-----------+-----------++-----------++
   | [KM01]   |   94.15%  |   94.29%  ||   94.22   ||
   | [TDD+00] |   94.18%  |   93.55%  ||   93.86   ||
   | [TKS00]  |   93.63%  |   92.89%  ||   93.26   ||
   | [MPRZ99] |   92.4%   |   93.1%   ||   92.8    ||
   | [XTAG99] |   91.8%   |   93.0%   ||   92.4    ||
   | [TV99]   |   92.50%  |   92.25%  ||   92.37   || 
   | [RM95]   |   91.80%  |   92.27%  ||   92.03   || 
   | [ADK99]  |   91.6%   |   91.6%   ||   91.6    || 
   | [Vee98]  |   89.0%   |   94.3%   ||   91.6    || 
   | [CP98]   |   90.7%   |   91.1%   ||   90.9    || 
   | [CP99]   |   89.0%   |   90.9%   ||   89.9    || 
   +----------+-----------+-----------++-----------++
   | baseline |   78.20%  |   81.87%  ||   79.99   ||
   +----------+-----------+-----------++-----------++

The results of [ADK99], [CP98] and [CP99] have been obtained without using lexical information, that is, with part-of-speech tags only. The baseline results were produced by a system that assigned the most frequent chunk tag to each part-of-speech tag.
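The baseline system can be sketched as follows. The training pairs below are invented toy data (the real baseline was trained on WSJ sections 15-18), and the I/O chunk tags shown here are only an illustration of the kind of per-token tagging used in this task.

```python
from collections import Counter, defaultdict

def train_baseline(tagged_tokens):
    """Map each part-of-speech tag to its most frequent chunk tag."""
    counts = defaultdict(Counter)
    for pos, chunk in tagged_tokens:
        counts[pos][chunk] += 1
    # For each POS tag, keep the chunk tag it co-occurred with most often.
    return {pos: c.most_common(1)[0][0] for pos, c in counts.items()}

# Toy (POS tag, chunk tag) pairs: "I" = inside an NP, "O" = outside.
training = [("DT", "I"), ("NN", "I"), ("VBD", "O"), ("DT", "I"),
            ("JJ", "I"), ("NN", "I"), ("IN", "O"), ("NN", "I")]
model = train_baseline(training)

# Tagging a new sentence then amounts to one dictionary lookup per token.
print([model[pos] for pos in ["DT", "JJ", "NN", "VBD"]])
```

Because the baseline ignores the words themselves and all context, it cannot resolve POS tags that occur both inside and outside NPs, which explains the large gap between its scores and those of the learning systems in the table.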

[RM95] has also reported work on a larger task: using sections 02-21 of the WSJ corpus as training material and section 00 for testing. Learning algorithms achieve better performance on this task than on the previous one because of the larger amount of training data. The published results for this data set are:

              +-----------+-----------++-----------++
              | precision |  recall   ||     F     ||
   +----------+-----------+-----------++-----------++
   | [KM01]   |   95.62%  |   95.93%  ||   95.77   ||
   | [TKS00]  |   95.04%  |   94.75%  ||   94.90   ||
   | [TV99]   |   93.71%  |   93.90%  ||   93.81   ||
   | [RM95]   |   93.1%   |   93.5%   ||   93.3    ||
   +----------+-----------+-----------++-----------++

Other languages: [KK99] have reported NP chunking results for Swedish. [SB99] have published results for German. [ZH98] have presented a model for analyzing Chinese.



Last update: April 13, 2005. erikt@uia.ua.ac.be