2019-02-12
Confusion about boost::accumulators

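I get different statistical results from pandas and boost::accumulators, and I am not sure why.

Here is a simple example that uses pandas to compute the mean and variance of some returns: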

import pandas

vals = [ 1, 1, 2, 1, 3, 2, 3, 4, 6, 3, 2, 1 ]
rets = pandas.Series(vals).pct_change()

print(f'count: {len(rets)}')
print(f'mean: {rets.mean()}')
print(f'variance: {rets.var()}')

The output is:

count: 12
mean: 0.19696969696969696
variance: 0.6156565656565657

In C++ I use boost::accumulators to compute the same statistics:

#include <iostream>
#include <iomanip>
#include <cmath>
#include <initializer_list>

#include <boost/accumulators/accumulators.hpp>
#include <boost/accumulators/statistics/stats.hpp>
#include <boost/accumulators/statistics/count.hpp>
#include <boost/accumulators/statistics/mean.hpp>
#include <boost/accumulators/statistics/variance.hpp>

namespace acc = boost::accumulators;

int main()
{
    // Accumulator that tracks the sample count, mean and variance.
    acc::accumulator_set<double, acc::stats<acc::tag::count,
                                            acc::tag::mean,
                                            acc::tag::variance>> stats;

    // No previous value exists for the first element, so start with NaN.
    double prev = NAN;

    for (double val : { 1, 1, 2, 1, 3, 2, 3, 4, 6, 3, 2, 1 })
    {
        // Percentage change relative to the previous value (NaN on the first pass).
        const double ret = (val - prev) / prev;

        // A NaN return is pushed as 0, so it still counts as a sample.
        stats(std::isnan(ret) ? 0 : ret);

        prev = val;
    }

    std::cout << std::setprecision(16)
              << "count: " << acc::count(stats) << '\n'
              << "mean: " << acc::mean(stats) << '\n'
              << "variance: " << acc::variance(stats) << '\n';

    return 0;
}

The output is:

count: 12
mean: 0.1805555555555556
variance: 0.5160108024691359

#########################################

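The difference arises because pandas skips NaN values by default when it computes mean. Since you call pct_change, the first element of the series is NaN, so pandas averages over the 11 remaining values, while the C++ code replaces that NaN with 0 and averages over 12. If we fill the NaN with 0, the outputs are the same: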

rets.mean()
Out[67]: 0.19696969696969696

rets.fillna(0).mean()
Out[69]: 0.18055555555555555
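To make the two averaging conventions concrete, here is a minimal sketch in plain Python. It deliberately uses only the standard library rather than pandas or Boost, and the variable names (`valid`, `filled`, `mean_skip_nan`, `mean_fill_zero`) are illustrative, not taken from either library.

```python
import math

vals = [1, 1, 2, 1, 3, 2, 3, 4, 6, 3, 2, 1]

# Percentage change computed by hand. The first element has no
# predecessor, so it is NaN, as in the pct_change result above.
rets = [math.nan] + [(cur - prev) / prev for prev, cur in zip(vals, vals[1:])]

# Skip-NaN convention (what rets.mean() does by default): 11 values.
valid = [r for r in rets if not math.isnan(r)]
mean_skip_nan = sum(valid) / len(valid)

# Fill-with-zero convention (what the C++ loop does): 12 values.
filled = [0.0 if math.isnan(r) else r for r in rets]
mean_fill_zero = sum(filled) / len(filled)

print(len(valid), mean_skip_nan)    # 11 values, mean close to 0.19697
print(len(filled), mean_fill_zero)  # 12 values, mean close to 0.18056
```

The sum of the returns is the same in both cases; only the divisor changes from 11 to 12.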

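For the variance, additionally set the delta degrees of freedom to 0. pandas var defaults to ddof=1 (it divides by N - 1), whereas the boost::accumulators variance divides by N: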

rets.fillna(0).var(ddof=0)
Out[86]: 0.5160108024691358
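The same comparison can be sketched for the variance, again with the standard library only. Here `statistics.variance` (divide by N - 1) stands in for the pandas default and `statistics.pvariance` (divide by N) stands in for the variance reported by the C++ program; this mapping is an assumption made for illustration, based on the outputs shown above.

```python
import math
import statistics

vals = [1, 1, 2, 1, 3, 2, 3, 4, 6, 3, 2, 1]

# The 11 valid returns (the leading NaN is left out).
rets = [(cur - prev) / prev for prev, cur in zip(vals, vals[1:])]

# pandas default: NaN skipped, ddof=1, so divide by N - 1 with N = 11.
sample_var_11 = statistics.variance(rets)

# The C++ program: leading NaN replaced by 0, divide by N with N = 12.
filled = [0.0] + rets
population_var_12 = statistics.pvariance(filled)

# On the same data the two conventions differ only by a factor N / (N - 1).
n = len(filled)
sample_var_12 = population_var_12 * n / (n - 1)

print(sample_var_11)      # close to 0.61566, the pandas figure
print(population_var_12)  # close to 0.51601, the Boost figure
print(math.isclose(sample_var_12, statistics.variance(filled)))  # True
```

So the variance gap has two sources: the extra zero sample and the choice of divisor. Both have to match before the numbers agree.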
