2019-05-29
Removing Stop Words with NLTK

The following program removes stop words from a piece of text:

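from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

# Requires the NLTK 'stopwords' and 'punkt' resources, fetched once with
# nltk.download(); newer NLTK releases may ask for 'punkt_tab' instead.

example_sent = "This is a sample sentence, showing off the stop words filtration."

# A set gives fast membership tests
stop_words = set(stopwords.words('english'))

# Split the sentence into word and punctuation tokens
word_tokens = word_tokenize(example_sent)

# Keep only the tokens that are not stop words
filtered_sentence = [w for w in word_tokens if w not in stop_words]

# The same filter written as an explicit loop
filtered_sentence = []
for w in word_tokens:
    if w not in stop_words:
        filtered_sentence.append(w)

print(word_tokens)
print(filtered_sentence)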
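
The filter above compares tokens exactly as they appear. NLTK's English stop word list is lowercase, so a capitalized token such as "This" passes through, as do punctuation tokens like "," and ".". The following is a minimal sketch of a case-insensitive variant that also drops punctuation. The helper names `is_punctuation` and `remove_stop_words` and the small `SAMPLE_STOP_WORDS` set are hypothetical, introduced only for illustration; the set stands in for `set(stopwords.words('english'))` so the example runs without downloading NLTK data.

```python
import string

# Hypothetical stand-in for set(stopwords.words('english'));
# the real NLTK list is much longer.
SAMPLE_STOP_WORDS = {"this", "is", "a", "the", "off"}


def is_punctuation(token):
    """Return True if every character in the token is ASCII punctuation."""
    return all(ch in string.punctuation for ch in token)


def remove_stop_words(tokens, stop_words):
    """Drop stop words (compared in lowercase) and punctuation-only tokens."""
    return [
        t for t in tokens
        if t.lower() not in stop_words and not is_punctuation(t)
    ]


# Tokens as word_tokenize would produce them for the sample sentence
tokens = ["This", "is", "a", "sample", "sentence", ",", "showing", "off",
          "the", "stop", "words", "filtration", "."]

filtered = remove_stop_words(tokens, SAMPLE_STOP_WORDS)
print(filtered)
# ['sample', 'sentence', 'showing', 'stop', 'words', 'filtration']
```

To use it with NLTK, pass `word_tokenize(example_sent)` as `tokens` and `set(stopwords.words('english'))` as `stop_words`. Whether to drop punctuation depends on the downstream task, so treat that part as optional.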
