Stanford POS Tagger

DiRT (Digital Research Tools) has a new home! Please visit Bamboo DiRT to explore this excellent collection of research tools.

Web site: The Stanford Natural Language Processing Group

Date of first review/ name of reviewer: 5/29/08 [Matthew Jockers]

Additional reviewer(s):

Produced by: The Stanford Natural Language Processing Group

Cost: Free

Description: "A Java implementation of a maximum-entropy (CMM) part-of-speech (POS) tagger."

Platform: Java, Command Line Application

License: The tagger is licensed under the GNU GPL.

Maturity: v1.5

 

Features:

  • Tags parts of speech using the Penn Treebank tag set
  • Customizable
  • Ability to tag XML content (added in v1.5)
  • Ability to produce XML output instead of the default plain-text output with Penn Treebank tags
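
To give a sense of what working with the default tagged output involves, here is a minimal sketch in Java that splits tagged text into word/tag pairs. It assumes each token is written as the word, a separator character, and a Penn Treebank tag; the separator is passed in as a parameter because it may differ between versions and settings. The class name `TaggedTextParser` and the sample sentence are hypothetical illustrations, not part of the Stanford distribution or actual tagger output.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the Stanford tagger API.
class TaggedTextParser {

    // Splits whitespace-separated "word<separator>TAG" tokens into (word, tag) pairs.
    // The separator character is an assumption supplied by the caller.
    static List<Map.Entry<String, String>> parse(String tagged, char separator) {
        List<Map.Entry<String, String>> pairs = new ArrayList<>();
        for (String token : tagged.trim().split("\\s+")) {
            // Cut at the LAST separator so words that contain it stay intact.
            int cut = token.lastIndexOf(separator);
            if (cut <= 0 || cut == token.length() - 1) {
                continue; // skip tokens that lack a word or a tag
            }
            pairs.add(new SimpleEntry<>(token.substring(0, cut), token.substring(cut + 1)));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Hand-written sample using Penn Treebank tags, not real tagger output.
        String sample = "The_DT tagger_NN labels_VBZ every_DT word_NN ._.";
        for (Map.Entry<String, String> pair : parse(sample, '_')) {
            System.out.println(pair.getKey() + "\t" + pair.getValue());
        }
    }
}
```

Cutting at the last separator rather than the first keeps a word such as `e_mail` whole when the separator is an underscore.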

 

Advantages:

  • Comes with several pre-trained models but can easily be retrained
  • Full API for Developers
  • Simple to set up and install; the only requirement is Java 1.5 or later
  • Growing user community with an email list

 

Disadvantages:

  • There is a GUI, but it is intended only for demonstration. To use the tool for real work, you must be comfortable with the command line.

 

Tips:

 

Tutorials:

  • See tips section above

 

More information:

  •  
