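I have two DataFrames, and I want something like a VLOOKUP function that matches sentences against specific keywords. In the example below, the third sentence in df1 should match banana in df2, because the sentence contains the word "banana".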
import pandas as pd
df1 = pd.DataFrame({'Text': ['Some text 1', 'Some text 2','The monkey eats a banana','Some text 4']})
df2 = pd.DataFrame({'Keyword': ['apple', 'banana', 'chicken'], 'Type': ['fruit', 'fruit', 'meat']})
df1
                       Text
0               Some text 1
1               Some text 2
2  The monkey eats a banana
3               Some text 4
df2
   Keyword   Type
0    apple  fruit
1   banana  fruit
2  chicken   meat
So the desired result is:
                       Text   Type
0               Some text 1      -
1               Some text 2      -
2  The monkey eats a banana  fruit
3               Some text 4      -
Solution:
Use str.extract to pull the keyword out of each sentence, then use map to translate the extracted keyword into its 'Type'.
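import re
# Build one alternation pattern, e.g. '(apple|banana|chicken)'; re.escape guards against regex metacharacters
p = rf"({'|'.join(map(re.escape, df2['Keyword']))})"
# p = '(' + '|'.join(map(re.escape, df2['Keyword'])) + ')'
# Extract the first matching keyword per row, then look up its Type; rows without a match stay NaN
df1['Type'] = (
    df1['Text'].str.extract(p, expand=False).map(df2.set_index('Keyword')['Type']))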
df1
                       Text   Type
0               Some text 1    NaN
1               Some text 2    NaN
2  The monkey eats a banana  fruit
3               Some text 4    NaN
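
The answer above leaves NaN where no keyword is found, while the question asked for "-". A minimal sketch of one way to close that gap follows. It reuses the same extract-then-map idea; the word boundaries (\b), the case-insensitive matching, and the fillna('-') step are additions assumed here for illustration and are not part of the original answer. The names pattern and lookup are likewise just illustrative.

```python
import re
import pandas as pd

df1 = pd.DataFrame({'Text': ['Some text 1', 'Some text 2',
                             'The monkey eats a banana', 'Some text 4']})
df2 = pd.DataFrame({'Keyword': ['apple', 'banana', 'chicken'],
                    'Type': ['fruit', 'fruit', 'meat']})

# Assumption: match whole words only, so 'apple' does not match inside 'pineapple'.
pattern = r"\b(" + "|".join(map(re.escape, df2['Keyword'])) + r")\b"

# Lookup Series: lowercased keyword -> type (assumes keywords are unique).
lookup = (df2.assign(Keyword=df2['Keyword'].str.lower())
             .set_index('Keyword')['Type'])

df1['Type'] = (
    df1['Text']
    .str.extract(pattern, flags=re.IGNORECASE, expand=False)  # first keyword found, or NaN
    .str.lower()                                              # normalise case before the lookup
    .map(lookup)                                              # keyword -> type
    .fillna('-')                                              # placeholder requested in the question
)
print(df1)
```

If a sentence contains several keywords, only the first one found is used, which is the same behaviour as the original answer.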







