数据科学专业问答社区，好文章，一字千金--CDA答疑社区

补个图

0 0 0

wangxishi

2020-11-05

python如何实现根据条件计算日期相差天数

算出来的列类型是pd.Timedelta，实际上可以视为时间戳，可以通过其他方式转换为数值

import pandas as pd
import datetime
delay_time = datetime.datetime.strptime('2020-10-31', '%Y-%m-%d')
df =pd.read_excel("C:\\Users\\Administrator\\Downloads\\1604563522_368980.xlsx")
df['time'] = pd.to_datetime(df['time'], unit='d')
df.loc[df['标识']=='推迟','天数_py'] =  delay_time + datetime.timedelta(days=1) - df.loc[df['标识']=='推迟','time'] 
df.loc[df['标识']!='推迟','天数_py'] = pd.Timedelta(seconds=0)

1 0 2

wangxishi

2020-11-05

python如何实现根据条件计算日期相差天数

想用python实现

0 0 0

wangxishi

2020-11-05

python如何实现根据条件计算日期相差天数

想用python实现，对，导入是Dataframe

0 0 0

wangxishi

2020-11-05

python如何实现根据条件计算日期相差天数

使用excel的话，在C1输入 =IF(B2="推迟",44135-A2+1,0)，用py的话源是csv导入到Dataframe吗？

1 0 1

wangxishi

2020-11-05

python如何实现根据条件计算日期相差天数

问题已解决，谢谢！

0 0 0

wangxishi

2020-11-05

python如何实现根据条件计算日期相差天数

谢谢！

0 0 0

wangxishi

2020-11-05

python 根据日期条件计数

看周转率默认是左对齐的说明是文本格式，使用数据分列也可以批量变成数值格式

0 0 0

读童话的狼

2020-11-05

周转率要以百分比展示, 我这百分号出不来?

解决了，非常感谢！

1 0 0

ermutuxia

2020-11-05

python如何检验一个数值是否属于某个区间？

import pandas as pd
data=pd.read_excel("D:\\360安全浏览器下载\\1604548797_924061.xlsx")
data["计算列1"]=data["time"].map(lambda x:1 if x>pd.Timestamp("2020-1-1") else 0)

重点在于字符串变成Timestamp对象之后才能和Timestamp对象进行大小比较

1 0 0

wangxishi

2020-11-05

python 根据日期条件计数

你的错误在于这一句：

b = CustomBusinessDay(holidays)
 CustomBusinessDay这个类里面的holidays参数是一个关键字参数，不是一个位置参数，所以你进行实例化类的时候应该这样写

b = CustomBusinessDay(holidays=holidays)

这个类的部分帮助文件如下

help(CustomBusinessDay)

Help on class CustomBusinessDay in module pandas.tseries.offsets:

class CustomBusinessDay(_CustomMixin, BusinessDay)

| CustomBusinessDay(n=1, normalize=False, weekmask='Mon Tue Wed Thu Fri', holidays=None, calendar=None, offset=datetime.timedelta(0))

|

| DateOffset subclass representing possibly n custom business days,

| excluding holidays.

1 0 1

ermutuxia

2020-11-05

python如何检验一个数值是否属于某个区间？

谢谢，解决了~

0 0 0

wangxishi

2020-11-05

python 如何计算两个日期的相差月数

我上传个数据，
数据.xlsx
帮我看一下，谢谢!

0 0 0

ermutuxia

2020-11-05

python如何检验一个数值是否属于某个区间？

int() argument must be a string, a bytes-like object or a number, not 'list'

During handling of the above exception, another exception occurred:

TypeError: `n` argument must be an integer, got <class 'list'>

0 0 0

ermutuxia

2020-11-05

python如何检验一个数值是否属于某个区间？

TypeError                                 Traceback (most recent call last)pandas\_libs\tslibs\offsets.pyx in pandas._libs.tslibs.offsets.BaseOffset._validate_n()TypeError: int() argument must be a string, a bytes-like object or a number, not 'list'

During handling of the above exception, another exception occurred:TypeError                                 Traceback (most recent call last)<ipython-input-100-eab4994fcd77> in <module>     22         count_businessday(start_day,end_day)     23 ---> 24 sheet['工作日'] = sheet.apply(lambda x:count_businessday(x['出院时间'],x['归档日期']),axis=1)D:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, raw, result_type, args, **kwds)   7546             kwds=kwds,   7547         )-> 7548         return op.get_result()   7549    7550     def applymap(self, func) -> "DataFrame":D:\ProgramData\Anaconda3\lib\site-packages\pandas\core\apply.py in get_result(self)    178             return self.apply_raw()    179 --> 180         return self.apply_standard()    181     182     def apply_empty_result(self):D:\ProgramData\Anaconda3\lib\site-packages\pandas\core\apply.py in apply_standard(self)    269     270     def apply_standard(self):--> 271         results, res_index = self.apply_series_generator()    272     273         # wrap resultsD:\ProgramData\Anaconda3\lib\site-packages\pandas\core\apply.py in apply_series_generator(self)    298                 for i, v in enumerate(series_gen):    299                     # ignore SettingWithCopy here in case the user mutates--> 300                     results[i] = self.f(v)    301                     if isinstance(results[i], ABCSeries):    302                         # If we have a view on v, we need to make a copy because<ipython-input-100-eab4994fcd77> in <lambda>(x)     22         count_businessday(start_day,end_day)     23 ---> 24 sheet['工作日'] = sheet.apply(lambda x:count_businessday(x['出院时间'],x['归档日期']),axis=1)<ipython-input-100-eab4994fcd77> in count_businessday(start_day, end_day)      4                                     '2020-10-05','2020-10-06','2020-10-07','2020-10-08']      5     holidays=[pd.Timestamp(i) for i in holidays]----> 6     b = CustomBusinessDay(holidays)      7     bus_day = pd.date_range(start=start_day, end=end_day, freq=b)      8     length = len(bus_day)pandas\_libs\tslibs\offsets.pyx in pandas._libs.tslibs.offsets.CustomBusinessDay.__init__()pandas\_libs\tslibs\offsets.pyx in pandas._libs.tslibs.offsets.BusinessMixin.__init__()pandas\_libs\tslibs\offsets.pyx in pandas._libs.tslibs.offsets.BaseOffset.__init__()pandas\_libs\tslibs\offsets.pyx in pandas._libs.tslibs.offsets.BaseOffset._validate_n()TypeError: `n` argument must be an integer, got <class 'list'>

0 0 0

ermutuxia

2020-11-05

python如何检验一个数值是否属于某个区间？

from pandas.tseries.offsets import CustomBusinessDay
def count_businessday(start_day,end_day):
    holidays=['2020-04-06','2020-05-01','2020-05-04','2020-05-05','2020-06-25','2020-10-01',
                                    '2020-10-05','2020-10-06','2020-10-07','2020-10-08']
    holidays=[pd.Timestamp(i) for i in holidays]
    b = CustomBusinessDay(holidays)
    bus_day = pd.date_range(start=start_day, end=end_day, freq=b)
    length = len(bus_day)
 
    extra_work_day = ['2020-04-26','2020-05-09','2020-06-28','2020-09-27','2020-10-10']
    extra_work_day = [pd.Timestamp(i) for i in extra_work_day]
    
    extra_len = 0
     
    for i in extra_work_day:
        if  i  >= start_day and i <= end_day:
            extra_len = extra_len+1
 
    return(length+extra_len-1)
 
    if __name__=='__main__':
        count_businessday(start_day,end_day)
        
sheet['工作日'] = sheet.apply(lambda x:count_businessday(x['出院时间'],x['归档日期']),axis=1)

0 0 0

ermutuxia

2020-11-05

python如何检验一个数值是否属于某个区间？

我研究一下，谢谢！

0 0 0

ermutuxia

2020-11-05

python如何检验一个数值是否属于某个区间？

问题在于你的extra_work_day= ['2020-04-26','2020-05-09','2020-06-28','2020-09-27','2020-10-10']
这个列表里面的元素是字符串，所以不能和时间戳对象进行比较，你需要变一下，把他变成时间戳对象，可以通过下面的方式：
extra_work_day= ['2020-04-26','2020-05-09','2020-06-28','2020-09-27','2020-10-10']

extra_work_day=[pd.Timestamp(i) for i in extra_work_day]

然后你就可以用原来的命令比较啦

1 0 1

ermutuxia

2020-11-05

python如何检验一个数值是否属于某个区间？

from pandas.tseries.offsets import CustomBusinessDay
def count_businessday(start_day,end_day):
    b = CustomBusinessDay(holidays=['2020-04-06','2020-05-01','2020-05-04','2020-05-05','2020-06-25','2020-10-01',
                                    '2020-10-05','2020-10-06','2020-10-07','2020-10-08'])
    bus_day = pd.date_range(start=start_day, end=end_day, freq=b)
    length = len(bus_day)
 
    extra_work_day = ['2020-04-26','2020-05-09','2020-06-28','2020-09-27','2020-10-10']
    extra_len = 0
    dayday = Interval(start_day,end_day)
    for i in extra_work_day:
        if  i  in dayday: 
            extra_len = extra_len+1
 
    return(length+extra_len-1)
 
    if __name__=='__main__':
        count_businessday(start_day,end_day)

sheet['工作日'] = sheet.apply(lambda x:count_businessday(x['出院时间'],x['归档日期']),axis=1)

出现这个错误：

'<' not supported between instances of 'str' and 'Timestamp'

1 0 0

ermutuxia

2020-11-05

python如何检验一个数值是否属于某个区间？

更正：

上面的安装命令有缺少 autoprp8,正确安装命令如下：

pip install -i https://pypi.douban.com/simple autopep8

0 0 0

PGC123

2020-11-05

补个图

想用python实现

想用python实现，对，导入是Dataframe

使用excel的话，在C1输入 =IF(B2="推迟",44135-A2+1,0)，用py的话源是csv导入到Dataframe吗？

问题已解决，谢谢！

谢谢！

看周转率默认是左对齐的说明是文本格式，使用数据分列也可以批量变成数值格式

解决了，非常感谢！

import pandas as pd data=pd.read_excel("D:\\360安全浏览器下载\\1604548797_924061.xlsx") data["计算列1"]=data["time"].map(lambda x:1 if x>pd.Timestamp("2020-1-1") else 0)重点在于字符串变成Timestamp对象之后才能和Timestamp对象进行大小比较

谢谢，解决了~

我上传个数据，数据.xlsx帮我看一下，谢谢!

int() argument must be a string, a bytes-like object or a number, not 'list'During handling of the above exception, another exception occurred:TypeError: `n` argument must be an integer, got <class 'list'>

我研究一下，谢谢！

更正：上面的安装命令有缺少 autoprp8,正确安装命令如下：pip install -i https://pypi.douban.com/simple autopep8

import pandas as pd data=pd.read_excel("D:\\360安全浏览器下载\\1604548797_924061.xlsx") data["计算列1"]=data["time"].map(lambda x:1 if x>pd.Timestamp("2020-1-1") else 0)
重点在于字符串变成Timestamp对象之后才能和Timestamp对象进行大小比较

我上传个数据，
数据.xlsx
帮我看一下，谢谢!

int() argument must be a string, a bytes-like object or a number, not 'list'
During handling of the above exception, another exception occurred:
TypeError: `n` argument must be an integer, got <class 'list'>

更正：
上面的安装命令有缺少 autoprp8,正确安装命令如下：
pip install -i https://pypi.douban.com/simple autopep8