You can do this with str.extract:
import pandas as pd

df = pd.DataFrame({'text': ["Who would have thought this would be so 4347009 difficult",
                            "24 is me"]})
# Capture the first run of digits in each string
df['new_col'] = df['text'].str.extract(r'(\d+)')
                                                text  new_col
0  Who would have thought this would be so 434700...  4347009
1                                           24 is me       24
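Note that str.extract returns strings, and rows with no match come back as missing values. If numbers are needed, the result can be converted afterwards. A minimal sketch, assuming the same sample frame plus one made-up row without digits; the column name `new_num` is invented for illustration:

```python
import pandas as pd

df = pd.DataFrame({'text': ["Who would have thought this would be so 4347009 difficult",
                            "24 is me",
                            "no digits here"]})  # third row is a made-up no-match case

# expand=False returns a Series when there is a single capture group
digits = df['text'].str.extract(r'(\d+)', expand=False)

# Rows without a match are missing; to_numeric leaves them missing
df['new_num'] = pd.to_numeric(digits)
print(df['new_num'].tolist())
```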
啊啊啊啊啊吖
2019-01-24
If you are writing the DataFrame to CSV, use utf-8-sig as the encoding. This may work:
dataframe.to_csv(filepath, encoding='utf-8-sig', index=False)
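What utf-8-sig changes is that the file starts with a byte order mark, which some spreadsheet programs use to recognise UTF-8. A small sketch to check this, assuming a made-up frame and a temporary file path:

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame({'city': ['北京', 'São Paulo'], 'n': [1, 2]})  # made-up sample data

filepath = os.path.join(tempfile.mkdtemp(), 'out.csv')  # hypothetical path
df.to_csv(filepath, encoding='utf-8-sig', index=False)

# utf-8-sig prepends the three-byte BOM EF BB BF
with open(filepath, 'rb') as f:
    head = f.read(3)
print(head == b'\xef\xbb\xbf')

# Reading with the same encoding strips the BOM again
back = pd.read_csv(filepath, encoding='utf-8-sig')
print(back.columns.tolist())
```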
啊啊啊啊啊吖
2019-01-24
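describe does not print anything. It returns a DataFrame.
From its documentation:
Returns: summary: Series/DataFrame of summary statistics
Unlike PyCharm, the notebook you linked to automatically prints the return value of a statement.
Change cities.describe() to print(cities.describe()).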
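Because the return value is an ordinary DataFrame, it can also be stored and indexed. A minimal sketch, assuming a made-up `cities` frame with a single `population` column:

```python
import pandas as pd

cities = pd.DataFrame({'population': [852469, 1015785, 485199]})  # made-up sample data

summary = cities.describe()  # returns a DataFrame, prints nothing by itself
print(type(summary).__name__)
print(summary)

# Individual statistics can be looked up by row label
print(summary.loc['mean', 'population'])
```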
啊啊啊啊啊吖
2019-01-24
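It turns out that pandas merge defaults to an inner join, so when you do not specify how, it only outputs rows whose key appears in both DataFrames.
For example:
import pandas as pd

df1 = pd.DataFrame(['a'], columns=['names'])
df2 = pd.DataFrame(['b', 'e', 'a', 'c', 'd'], columns=['names'])
pd.merge(df1.reset_index(), df2.reset_index(), on=['names'])
   index_x names  index_y
0        0     a        2
# With duplicate keys, every matching pair is emitted: 2 x 2 = 4 rows
df1 = pd.DataFrame(['a', 'a'], columns=['names'])
df2 = pd.DataFrame(['b', 'e', 'a', 'a', 'c', 'd'], columns=['names'])
df1.merge(df2)
  names
0     a
1     a
2     a
3     a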
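To keep rows that have no match, pass how explicitly. A short sketch with made-up frames; indicator=True adds a `_merge` column that shows where each row came from:

```python
import pandas as pd

df1 = pd.DataFrame(['a', 'x'], columns=['names'])       # 'x' has no match in df2
df2 = pd.DataFrame(['b', 'a', 'c'], columns=['names'])

inner = df1.merge(df2)                                   # default: how='inner'
left = df1.merge(df2, how='left', indicator=True)        # keep every row of df1
outer = df1.merge(df2, how='outer', indicator=True)      # keep rows from both sides

print(inner)
print(left)
print(outer)
```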
啊啊啊啊啊吖
2019-01-24
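Before trying to get the text, you need to check whether the item is None.
for items in soup.find_all("url"):
    getTitle = items.find('image:title')
    # find returns None when the tag is missing, and None has no .text
    if getTitle is not None:
        item = getTitle.text
        url = items.find("loc").text
        print(item, url)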
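The same check can be tried on a self-contained input. This is only a sketch: the sitemap fragment is made up, and it assumes the built-in html.parser backend, which keeps `image:title` as a plain tag name:

```python
from bs4 import BeautifulSoup

# Made-up sitemap fragment: the second <url> has no <image:title>
xml = """
<urlset>
  <url><loc>https://example.com/a</loc><image:image><image:title>Title A</image:title></image:image></url>
  <url><loc>https://example.com/b</loc></url>
</urlset>
"""

soup = BeautifulSoup(xml, "html.parser")
results = []
for items in soup.find_all("url"):
    title_tag = items.find("image:title")
    loc_tag = items.find("loc")
    if title_tag is None or loc_tag is None:
        continue  # skip entries missing either tag instead of raising AttributeError
    results.append((title_tag.text, loc_tag.text))
print(results)
```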
啊啊啊啊啊吖
2019-01-24
If you select the div surrounding a checkbox with this rather nasty CSS selector, you can at least click one checkbox without an exception.
checkbox = driver.find_element_by_css_selector("#MainContentPlaceHolder_BaseContentPlaceHolder_pmainedge2edge4_0_ctl00_ctl14_dealerFilters > section:nth-child(1) > div:nth-child(1) > div:nth-child(1) > ul:nth-child(1) > li:nth-child(4) > div:nth-child(1)")
checkbox.click()
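A lot of JavaScript on the page interferes with webdriver automation. I have not found a better solution yet, but at least you now know there is a way to interact with that checkbox.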
啊啊啊啊啊吖
2019-01-23
You can use pd.MultiIndex.from_product.
Some variation of this:
In [24]: x = pd.date_range('2019-01-01', '2019-04-01', freq='MS')
In [25]: y = ['a', 'b', 'c']
In [26]: index = pd.MultiIndex.from_product([x, y])
In [27]: for ix in index:
    ...:     print(ix)
    ...:
(Timestamp('2019-01-01 00:00:00', freq='MS'), 'a')
(Timestamp('2019-01-01 00:00:00', freq='MS'), 'b')
(Timestamp('2019-01-01 00:00:00', freq='MS'), 'c')
(Timestamp('2019-02-01 00:00:00', freq='MS'), 'a')
(Timestamp('2019-02-01 00:00:00', freq='MS'), 'b')
(Timestamp('2019-02-01 00:00:00', freq='MS'), 'c')
(Timestamp('2019-03-01 00:00:00', freq='MS'), 'a')
(Timestamp('2019-03-01 00:00:00', freq='MS'), 'b')
(Timestamp('2019-03-01 00:00:00', freq='MS'), 'c')
(Timestamp('2019-04-01 00:00:00', freq='MS'), 'a')
(Timestamp('2019-04-01 00:00:00', freq='MS'), 'b')
(Timestamp('2019-04-01 00:00:00', freq='MS'), 'c')
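One common use for such an index is filling in the combinations that are missing from sparse data. A minimal sketch, assuming made-up observations and the invented level names `month` and `label`:

```python
import pandas as pd

x = pd.date_range('2019-01-01', '2019-04-01', freq='MS')
y = ['a', 'b', 'c']
index = pd.MultiIndex.from_product([x, y], names=['month', 'label'])

# Made-up observations: only two of the twelve combinations exist
obs = pd.DataFrame(
    {'month': pd.to_datetime(['2019-01-01', '2019-03-01']),
     'label': ['a', 'c'],
     'value': [10, 30]}
).set_index(['month', 'label'])

# Reindexing against the full product adds the missing rows, filled with 0
full = obs.reindex(index, fill_value=0)
print(len(full))
print(full)
```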
啊啊啊啊啊吖
2019-01-23
Found the problem: before assigning a value, I need to check whether the key already exists.
# Not sure if str(tuple(i)) will work - regardless apply logic like this to make the Key unique
counter = 0
# Bump the suffix until the key is not already in the dict
while (str(tuple(i)) + '_' + str(counter)) in genFit:
    counter += 1
genFit[str(tuple(i)) + '_' + str(counter)] = tmp
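The same logic can be wrapped in a small helper so it is easy to try out. A sketch only: `unique_key` is a hypothetical name, and the input pairs are made up to show a repeated key:

```python
def unique_key(base, mapping):
    """Return base + '_' + n for the smallest n >= 0 not yet used in mapping."""
    counter = 0
    while f"{base}_{counter}" in mapping:
        counter += 1
    return f"{base}_{counter}"


genFit = {}
# Made-up (i, tmp) pairs; the first two share the same i
for i, tmp in [([1, 2], 0.5), ([1, 2], 0.7), ([3, 4], 0.9)]:
    genFit[unique_key(str(tuple(i)), genFit)] = tmp

print(genFit)
```

The second `[1, 2]` entry gets the suffix `_1` instead of overwriting the first one.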
啊啊啊啊啊吖
2019-01-23