2019-05-29
Removing Stop Words with NLTK
The following program removes stop words from a piece of text:
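# Requires the NLTK data packages 'stopwords' and 'punkt' (newer NLTK releases
# use 'punkt_tab'); fetch them once with nltk.download(...).
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

example_sent = "This is a sample sentence, showing off the stop words filtration."

# Build a set so that membership tests are fast.
stop_words = set(stopwords.words('english'))
word_tokens = word_tokenize(example_sent)

# Keep only the tokens that are not stop words.
filtered_sentence = [w for w in word_tokens if w not in stop_words]

# Equivalent explicit loop; it rebuilds the same list.
filtered_sentence = []
for w in word_tokens:
    if w not in stop_words:
        filtered_sentence.append(w)

print(word_tokens)
print(filtered_sentence)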
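One detail worth noting: NLTK's English stop word list is all lowercase, so a case-sensitive comparison like the one above leaves a capitalized token such as "This" in the result. The sketch below shows one possible fix, lowercasing each token before the lookup. To keep it runnable without downloading any NLTK data, it uses a small hand-written stop word set and a simple regex tokenizer; both are stand-ins assumed for illustration, not NLTK's actual list or `word_tokenize`.

```python
import re

# Hypothetical stand-in for set(stopwords.words('english'));
# the real NLTK list is much longer.
STOP_WORDS = {"this", "is", "a", "the", "off"}

def remove_stop_words(text, stop_words=STOP_WORDS):
    # Simple regex tokenizer (an assumption, not word_tokenize):
    # runs of word characters, or single punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", text)
    # Compare in lowercase so capitalized stop words are dropped too.
    return [t for t in tokens if t.lower() not in stop_words]

example_sent = "This is a sample sentence, showing off the stop words filtration."
filtered = remove_stop_words(example_sent)
print(filtered)
```

With the real NLTK resources installed, the same idea should carry over by changing the test in the original program to `w.lower() not in stop_words`.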





