I have a DataFrame like this:
   col1  col2  col3  ID
A  23    AZ    ER1   ID1
B  12    ZE    EZ1   ID2
C  13    RE    RE1   ID3
I parse the ID column to look up some information for each ID. Here is the code and what it prints:
for i in dataframe['ID']:
    name = function(i, ranks=True)
    print(name)
{'species': 'rabbit', 'genus': 'unis', 'subfamily': 'logomorphidae', 'family': 'lego', 'no rank': 'info, nothing', 'superkingdom': 'eucoryote'}
{'species': 'dog', 'genus': 'Rana', 'subfamily': 'Alphair', 'family': 'doggidae', 'no rank': 'dsDNA , no stage', 'superkingdom': 'eucaryote'}
{'species': 'duck', 'subfamily': 'duckinae', 'family': 'duckidae'}
...
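As you can see, the function returns a dictionary. For ID1 and ID2 I get six entries (species, genus, subfamily, family, no rank, superkingdom), but for ID3 I get only three. The idea is not just to print the dictionary contents but to add them directly to the DataFrame and get: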
   col1  col2  col3  ID   species  genus  subfamily      family    no rank           superkingdom
A  23    AZ    ER1   ID1  rabbit   unis   logomorphidae  lego      info, nothing     eucaryote
B  12    ZE    EZ1   ID2  dog      Rana   Alphair        doggidae  dsDNA , no stage  eucaryote
C  13    RE    RE1   ID3  duck     None   duckinae       duckidae  None              None
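Solution: store the returned dicts in a dict keyed by ID. It is then easy to build a DataFrame from that dict and merge it back onto the original.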
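import pandas as pd
d = {}
for i in dataframe['ID']:
    # one {rank: name} dict per ID
    d[i] = taxid.lineage_name(i, ranks=True)
# orient='index' makes each ID a row; ranks missing for an ID become NaN
dataframe.merge(pd.DataFrame.from_dict(d, orient='index'), left_on='ID', right_index=True)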
Output:
   col1  col2  col3  ID   species  genus  subfamily      family    no rank           superkingdom
A  23    AZ    ER1   ID1  rabbit   unis   logomorphidae  lego      info, nothing     eucoryote
B  12    ZE    EZ1   ID2  dog      Rana   Alphair        doggidae  dsDNA , no stage  eucaryote
C  13    RE    RE1   ID3  duck     NaN    duckinae       duckidae  NaN               NaN
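If you want to try the approach without the real lookup, here is a minimal self-contained sketch. The `lookup` function and the `FAKE_LINEAGES` table are hypothetical stand-ins for `taxid.lineage_name` (they only mimic its "dict with a varying set of keys" behaviour, using a subset of the ranks from the question), and `how="left"` is my addition so that rows whose ID has no matching lineage row would still be kept.

```python
import pandas as pd

# Hypothetical stand-in for the real lookup (e.g. taxid.lineage_name):
# it returns a dict whose set of keys differs from one ID to the next.
FAKE_LINEAGES = {
    "ID1": {"species": "rabbit", "genus": "unis", "family": "lego"},
    "ID2": {"species": "dog", "genus": "Rana", "family": "doggidae"},
    "ID3": {"species": "duck", "family": "duckidae"},
}


def lookup(taxon_id, ranks=True):
    """Return a {rank: name} dict for taxon_id (empty if unknown)."""
    return FAKE_LINEAGES.get(taxon_id, {})


dataframe = pd.DataFrame(
    {
        "col1": [23, 12, 13],
        "col2": ["AZ", "ZE", "RE"],
        "col3": ["ER1", "EZ1", "RE1"],
        "ID": ["ID1", "ID2", "ID3"],
    },
    index=["A", "B", "C"],
)

# Call the lookup once per distinct ID and keep the dicts keyed by ID.
lineages = {i: lookup(i, ranks=True) for i in dataframe["ID"].unique()}

# Each ID becomes a row; the columns are the union of all ranks seen,
# and ranks that an ID lacks are filled with NaN.
ranks_df = pd.DataFrame.from_dict(lineages, orient="index")

# Match the ID column against the index of ranks_df.
result = dataframe.merge(ranks_df, how="left", left_on="ID", right_index=True)
print(result)
```

Looping over `unique()` rather than the whole column only matters when the same ID appears in several rows and the lookup is slow; the merge then copies the ranks to every matching row.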







