I have a DataFrame that looks like this:
Name Variable Field
A 2.3 412
A 2.9 861
A 3.5 1703
B 3.5 1731
A 4.0 2609
B 4.0 2539
A 4.6 2821
B 4.6 2779
A 5.2 3048
B 5.2 2979
A 5.8 3368
B 5.8 3216
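As you can see, the "Variable" column contains duplicate values. For each Variable value, I want to compute the delta (%) between A and B. The DataFrame I want to produce is: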
Name Variable Field Ref field (A) Delta (A - B)
A 2.3 412 412 0.0%
A 2.9 861 861 0.0%
A 3.5 1703 1703 0.0%
B 3.5 1731 1703 -1.6%
A 4.0 2609 2609 0.0%
B 4.0 2539 2609 2.8%
A 4.6 2821 2821 0.0%
B 4.6 2779 2821 1.5%
A 5.2 3048 3048 0.0%
B 5.2 2979 3048 2.3%
A 5.8 3368 3368 0.0%
B 5.8 3216 3368 4.7%
I have tried something like:
df["Ref field (A)"] = df.apply(lambda row:df[(df["Variable"] == row["Variable"]) & (df["Name"] == "A")]["Field"][0],axis=1)
But it just doesn't work:
File "pandas/_libs/index.pyx", line 106, in pandas._libs.index.IndexEngine.get_value
File "pandas/_libs/index.pyx", line 114, in pandas._libs.index.IndexEngine.get_value
File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 964, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: (0, u'occurred at index 0')
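Solution: since 'A' has only one value per 'Variable' group, create a Series of the 'A' values indexed by 'Variable' and map it onto the 'Variable' column to get the reference.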
s = df[df.Name.eq('A')].set_index('Variable').Field
df['RefA'] = df.Variable.map(s)
df['Delta'] = (df.RefA - df.Field)/df.Field*100
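For reference, here is a self-contained sketch of the same idea that can be run end to end. The data is copied from the question; the variable name `ref`, the output column names and the extra formatted column `Delta (%)` are my own choices for illustration, not part of the original answer. It assumes each 'Variable' value has at most one 'A' row, since otherwise the lookup Series would have a non-unique index and `map` would fail.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["A", "A", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B"],
    "Variable": [2.3, 2.9, 3.5, 3.5, 4.0, 4.0, 4.6, 4.6, 5.2, 5.2, 5.8, 5.8],
    "Field": [412, 861, 1703, 1731, 2609, 2539, 2821, 2779, 3048, 2979, 3368, 3216],
})

# Lookup Series: Field of the "A" rows, indexed by Variable.
# Assumption: at most one "A" row per Variable value.
ref = df.loc[df["Name"].eq("A")].set_index("Variable")["Field"]

# Map every row's Variable to the matching "A" reference value.
df["Ref field (A)"] = df["Variable"].map(ref)

# Relative difference to the row's own Field, in percent.
df["Delta (A - B)"] = (df["Ref field (A)"] - df["Field"]) / df["Field"] * 100

# Optional: a string column formatted like the desired output (e.g. "2.8%").
df["Delta (%)"] = df["Delta (A - B)"].map("{:.1f}%".format)

print(df)
```

If a 'Variable' value had no 'A' row at all, `map` would leave NaN in the reference column for those rows, so it may be worth checking for missing values before computing the delta.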







