2018-12-19
Filling missing values: fillna

fillna() can "fill in" NA values with non-NA data in several ways, which we illustrate here:

Replace NA with a scalar value

In [41]: df2
Out[41]:
        one       two     three four   five  timestamp
a       NaN  0.501113 -0.355322  bar  False        NaT
c       NaN  0.580967  0.983801  bar  False        NaT
e  0.057802  0.761948 -0.712964  bar   True 2012-01-01
f -0.443160 -0.974602  1.047704  bar  False 2012-01-01
h       NaN -1.053898 -0.019369  bar  False        NaT

In [42]: df2.fillna(0)
Out[42]:
        one       two     three four   five            timestamp
a  0.000000  0.501113 -0.355322  bar  False                    0
c  0.000000  0.580967  0.983801  bar  False                    0
e  0.057802  0.761948 -0.712964  bar   True  2012-01-01 00:00:00
f -0.443160 -0.974602  1.047704  bar  False  2012-01-01 00:00:00
h  0.000000 -1.053898 -0.019369  bar  False                    0

In [43]: df2['one'].fillna('missing')
Out[43]:
a     missing
c     missing
e    0.057802
f    -0.44316
h     missing
Name: one, dtype: object
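
The session above does not show how df2 was built, so here is a minimal, self-contained sketch of the same scalar fill. The frame is an assumption: it reuses only the "one" and "two" values printed above, and the names filled, labelled and n_missing are illustrative.

```python
import numpy as np
import pandas as pd

# Assumed stand-in for df2: only the "one" and "two" columns shown above.
df2 = pd.DataFrame(
    {
        "one": [np.nan, np.nan, 0.057802, -0.443160, np.nan],
        "two": [0.501113, 0.580967, 0.761948, -0.974602, -1.053898],
    },
    index=list("acefh"),
)

# fillna returns a new object; df2 itself is left unchanged.
filled = df2.fillna(0)

# Filling a float column with a string yields mixed values, so cast to
# object first to make the resulting dtype explicit.
labelled = df2["one"].astype(object).fillna("missing")

# Number of values that were missing in column "one".
n_missing = int(df2["one"].isna().sum())
```

Filling with a string changes the column to object dtype, as Out[43] shows, so a numeric sentinel is probably the better choice when the column is still needed for arithmetic.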

Fill gaps forward or backward

Using the same filling arguments as reindexing, we can propagate non-NA values forward or backward:

In [44]: df
Out[44]:
        one       two     three
a       NaN  0.501113 -0.355322
c       NaN  0.580967  0.983801
e  0.057802  0.761948 -0.712964
f -0.443160 -0.974602  1.047704
h       NaN -1.053898 -0.019369

In [45]: df.fillna(method='pad')
Out[45]:
        one       two     three
a       NaN  0.501113 -0.355322
c       NaN  0.580967  0.983801
e  0.057802  0.761948 -0.712964
f -0.443160 -0.974602  1.047704
h -0.443160 -1.053898 -0.019369
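
The same propagation can be sketched with the dedicated ffill() and bfill() methods. As far as we know, recent pandas releases deprecate the method= keyword of fillna in favor of these two methods, so the sketch uses them instead. The frame is an assumed reconstruction of columns "one" and "two" printed above.

```python
import numpy as np
import pandas as pd

# Assumed stand-in for df: columns "one" and "two" from the output above.
df = pd.DataFrame(
    {
        "one": [np.nan, np.nan, 0.057802, -0.443160, np.nan],
        "two": [0.501113, 0.580967, 0.761948, -0.974602, -1.053898],
    },
    index=list("acefh"),
)

# Forward fill: each NA takes the last valid value above it.
# Leading NAs (rows a and c) have nothing above them and stay NA.
forward = df.ffill()

# Backward fill: each NA takes the next valid value below it.
# The trailing NA (row h) has nothing below it and stays NA.
backward = df.bfill()
```

Forward fill leaves the leading gaps in rows a and c untouched, which matches Out[45]. Backward fill covers those rows but leaves row h empty instead.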

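Limit the amount of filling

If we only want to fill a limited number of consecutive missing data points in each gap, we can use the limit keyword: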

In [46]: df
Out[46]:
        one       two     three
a       NaN  0.501113 -0.355322
c       NaN  0.580967  0.983801
e       NaN       NaN       NaN
f       NaN       NaN       NaN
h       NaN -1.053898 -0.019369

In [47]: df.fillna(method='pad', limit=1)
Out[47]:
        one       two     three
a       NaN  0.501113 -0.355322
c       NaN  0.580967  0.983801
e       NaN  0.580967  0.983801
f       NaN       NaN       NaN
h       NaN -1.053898 -0.019369
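
A minimal sketch of the limit behavior follows, written with ffill(limit=...), which is assumed here to be equivalent to the pad call above. The frame is a reconstruction of columns "two" and "three" from Out[46], and the name remaining is illustrative.

```python
import numpy as np
import pandas as pd

# Assumed stand-in for df: columns "two" and "three" from Out[46],
# each with a two-row gap at e and f.
df = pd.DataFrame(
    {
        "two": [0.501113, 0.580967, np.nan, np.nan, -1.053898],
        "three": [-0.355322, 0.983801, np.nan, np.nan, -0.019369],
    },
    index=list("acefh"),
)

# limit=1 fills at most one consecutive NA per gap, so row e is filled
# from row c while row f stays NA.
limited = df.ffill(limit=1)

# NAs still left in each column after the limited fill.
remaining = limited.isna().sum()
```

A limit is useful when a value should only be carried over a short gap, since filling a long run of missing rows from one old observation can hide the fact that the data is absent.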
