

nltk.tag.pos_tag(tokens)

Use NLTK's currently recommended part-of-speech tagger to tag the given list of tokens. The result is a list of (token, tag) tuples.

>>> from nltk.tag import pos_tag
>>> from nltk.tokenize import word_tokenize
>>> pos_tag(word_tokenize("John's big idea isn't all that bad."))
[('John', 'NNP'), ("'s", 'POS'), ('big', 'JJ'), ('idea', 'NN'), ('is',
'VBZ'), ("n't", 'RB'), ('all', 'DT'), ('that', 'DT'), ('bad', 'JJ'),
('.', '.')]
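
Since the return value is a plain list of (token, tag) tuples, it can be post-processed with ordinary Python. The sketch below is only an illustration: it copies the tagged output from the example above as a literal so that it runs without NLTK installed, and the helper names `group_by_tag` and `tokens_with_prefix` are hypothetical, not part of NLTK.

```python
from collections import defaultdict

# Tagged output copied from the example above (a literal, so NLTK is not needed here).
tagged = [('John', 'NNP'), ("'s", 'POS'), ('big', 'JJ'), ('idea', 'NN'),
          ('is', 'VBZ'), ("n't", 'RB'), ('all', 'DT'), ('that', 'DT'),
          ('bad', 'JJ'), ('.', '.')]


def group_by_tag(pairs):
    """Hypothetical helper: collect tokens under their tag, keeping input order."""
    groups = defaultdict(list)
    for token, tag in pairs:
        groups[tag].append(token)
    return dict(groups)


def tokens_with_prefix(pairs, prefix):
    """Hypothetical helper: keep tokens whose tag starts with the given prefix."""
    return [token for token, tag in pairs if tag.startswith(prefix)]


by_tag = group_by_tag(tagged)
nouns = tokens_with_prefix(tagged, 'NN')  # matches NN, NNS, NNP, NNPS

print(by_tag['JJ'])  # ['big', 'bad']
print(nouns)         # ['John', 'idea']
```

Matching on a tag prefix works here because the Penn Treebank tags for one word class share a stem (for example, all noun tags begin with `NN`).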

 

Alphabetical list of part-of-speech tags used in the Penn Treebank Project:
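
As a convenience, the 36 word-level tags from the Penn Treebank list linked in the references can be kept in a small lookup table. This is a sketch, not something NLTK provides in this form: the names `PENN_TREEBANK_TAGS` and `describe` are hypothetical, and punctuation tags such as `.` are not part of the 36-tag list, so they fall back to a generic label.

```python
# Penn Treebank word-level tags and their descriptions,
# in the order used by the tag list linked in the references.
PENN_TREEBANK_TAGS = {
    'CC': 'Coordinating conjunction',
    'CD': 'Cardinal number',
    'DT': 'Determiner',
    'EX': 'Existential there',
    'FW': 'Foreign word',
    'IN': 'Preposition or subordinating conjunction',
    'JJ': 'Adjective',
    'JJR': 'Adjective, comparative',
    'JJS': 'Adjective, superlative',
    'LS': 'List item marker',
    'MD': 'Modal',
    'NN': 'Noun, singular or mass',
    'NNS': 'Noun, plural',
    'NNP': 'Proper noun, singular',
    'NNPS': 'Proper noun, plural',
    'PDT': 'Predeterminer',
    'POS': 'Possessive ending',
    'PRP': 'Personal pronoun',
    'PRP$': 'Possessive pronoun',
    'RB': 'Adverb',
    'RBR': 'Adverb, comparative',
    'RBS': 'Adverb, superlative',
    'RP': 'Particle',
    'SYM': 'Symbol',
    'TO': 'to',
    'UH': 'Interjection',
    'VB': 'Verb, base form',
    'VBD': 'Verb, past tense',
    'VBG': 'Verb, gerund or present participle',
    'VBN': 'Verb, past participle',
    'VBP': 'Verb, non-3rd person singular present',
    'VBZ': 'Verb, 3rd person singular present',
    'WDT': 'Wh-determiner',
    'WP': 'Wh-pronoun',
    'WP$': 'Possessive wh-pronoun',
    'WRB': 'Wh-adverb',
}

# Fallback label for tags outside the 36-tag list (e.g. punctuation).
OTHER = 'Punctuation or other tag'


def describe(pairs):
    """Hypothetical helper: attach a description to each (token, tag) pair."""
    return [(token, tag, PENN_TREEBANK_TAGS.get(tag, OTHER))
            for token, tag in pairs]


for token, tag, meaning in describe([('John', 'NNP'), ("'s", 'POS'),
                                     ('idea', 'NN'), ('.', '.')]):
    print(f'{token:6} {tag:4} {meaning}')
```

If NLTK and its `tagsets` data are installed, `nltk.help.upenn_tagset()` prints similar descriptions with examples.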

 

[References]

http://www.nltk.org/api/nltk.tag.html?highlight=pos_tag#nltk.tag.pos_tag

http://www.cis.upenn.edu/~treebank/

http://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html

 

 
