热线电话:13121318867

登录
2019-02-25 阅读量: 1199
找出具有相同形状的两个DataFrame之间的差异

我试图找到两个具有行数的excel文件之间的差异。我首先想要在两列上对两个工作簿进行排序,然后输出具有差异的第三个文件。我无法正确导出差异文件。

任何帮助都非常感谢!!! 提前致谢!

将pandas导入为pd

df1 = pd.DataFrame({

'ID' : ['3', '3', '55','55', '66', '66'],

'date' : [20180102, 20180103, 20180104, 20180105, 20180106, 20180107],

'age': [0, 1, 9, 4, 2, 3],

})

df2 = pd.DataFrame({

'ID' : ['3', '55', '3','66', '55', '66'],

'date' : [20180103, 20180104, 20180102, 20180106, 20180105, 20180107],

'age': [0, 1, 9, 9, 8, 7],

})

df3 = df1.sort_values(by= ['ID', 'date'] , ascending=False)

df4 = df2.sort_values(by= ['ID', 'date'] , ascending=False)

dfDiff = df3.copy()

for row in range(dfDiff.shape[0]):

for col in range(dfDiff.shape[1]):

value_old = df3.iloc[row,col]

value_new = df4.iloc[row,col]

if value_old == value_new:

dfDiff.iloc[row,col] = df4.iloc[row,col]

else:

dfDiff.iloc[row,col] = ('{}->{}').format(value_old,value_new)

writer = pd.ExcelWriter('diff', engine='xlsxwriter')

dfDiff.to_excel(writer, sheet_name='DIFF', index= False)

workbook = writer.book

worksheet = writer.sheets['DIFF']

worksheet.hide_gridlines(2)

解决办法:错过了.xlsx文件路径的末尾

df1 = pd.DataFrame({

'ID' : ['3', '3', '55','55', '66', '66'],

'date' : [20180102, 20180103, 20180104, 20180105, 20180106, 20180107],

'age': [0, 1, 9, 4, 2, 3],

})

df2 = pd.DataFrame({

'ID' : ['3', '55', '3','66', '55', '66'],

'date' : [20180103, 20180104, 20180102, 20180106, 20180105, 20180107],

'age': [0, 1, 9, 9, 8, 7],

})

df3 = df1.sort_values(by= ['ID', 'date'] , ascending=False)

df4 = df2.sort_values(by= ['ID', 'date'] , ascending=False)

dfDiff = df3.copy()

for row in range(dfDiff.shape[0]):

for col in range(dfDiff.shape[1]):

value_old = df3.iloc[row,col]

value_new = df4.iloc[row,col]

if value_old == value_new:

dfDiff.iloc[row,col] = df4.iloc[row,col]

else:

dfDiff.iloc[row,col] = ('{}->{}').format(value_old,value_new)

# added `.xlsx' to path here

writer = pd.ExcelWriter('diff.xlsx', engine='xlsxwriter')

writer.save()

0.0000
4
关注作者
收藏
评论(0)

发表评论

暂无数据
推荐帖子