Zhang Jiuan's Notes

Commonly Used Open-Source/Free NLP Tools

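The best site for open-source projects: http://sourceforge.net/
Items marked with a check mark (√) are ones I have tried myself; they work and are open source.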

*Computational Linguistics Toolbox
CLT http://complingone.georgetown.edu/~linguist/compling.html
GATE http://gate.ac.uk/
Natural Language Toolkit(NLTK) http://nltk.org
MALLET http://mallet.cs.umass.edu/index.php/Main_Page

*English Stemmer
Snowball http://snowball.tartarus.org/

*English POS Tagger
Stanford POS Tagger http://nlp.stanford.edu/software/tagger.shtml
TreeTagger http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/

*English Parser
Stanford Parser http://nlp.stanford.edu/software/lex-parser.shtml
Berkeley Parser http://nlp.cs.berkeley.edu/Main.html#Parsing

*English Keyphrase Extractor
KEA http://www.nzdl.org/Kea/index_old.html

*English Named Entity Recognizer
Stanford NER http://nlp.stanford.edu/software/CRF-NER.shtml

*Chinese Word Segmenter
√ICTCLAS (Chinese Academy of Sciences) http://www.nlp.org.cn/project/project.php?proj_id=6
Stanford Word Segmenter http://nlp.stanford.edu/software/segmenter.shtml

*Topic Modeling Tools
Matlab Topic Modeling Toolbox http://psiexp.ss.uci.edu/research/programs_data/toolbox.htm

*Machine Learning Methods
CRF++ http://crfpp.sourceforge.net/
LIBSVM http://www.csie.ntu.edu.tw/~cjlin/libsvm/
Java-ML http://sourceforge.net/projects/java-ml/

*Search Engines
√Lucene http://lucene.apache.org/
FirteX (Chinese Academy of Sciences) http://www.firtex.org/
√Lemur http://www.lemurproject.org/

*Data Mining Toolbox
Weka http://www.cs.waikato.ac.nz/ml/weka/
Databionic ESOM Tools http://sourceforge.net/projects/databionic-esom/

*Clustering
√Carrot2 http://project.carrot2.org/
ExtMiner http://sourceforge.net/projects/extminer/
SHReC http://sourceforge.net/projects/shrec/
brCluster http://sourceforge.net/projects/brcluster/
AggClustering http://sourceforge.net/projects/aggclustering/

*Text Processing
Word Vector Tool http://sourceforge.net/projects/wvtool/
LingPipe http://lingpipe-blog.com/ http://alias-i.com/lingpipe/
