Importing an Excel file with pd.read_excel raises the error shown below.

# Import libraries

import pandas as pd

import matplotlib.pyplot as plt

# Load the Excel data into Python; set the read_excel parameters to match the actual layout of the file

path="D:\\360安全浏览器下载\\PGC_问答站-趋势分析(2020-10-21至2020-11-19).xls"

data=pd.read_excel(path,skiprows=4,skipfooter=2,usecols=[1,2])

File "C:\ProgramData\Anaconda3\lib\site-packages\xlrd\timemachine.py", line 31, in <lambda>

unicode = lambda b, enc: b.decode(enc)

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc7 in position 0: ordinal not in range(128)

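What causes the error? The traceback shows that it comes from the way the xlrd library decodes strings, so the failure happens inside xlrd rather than in pandas itself. You may wonder how pandas and xlrd are related, and why an xlrd error appears when you are calling pandas. The reason is that pandas implements its Excel reading by calling xlrd's existing functions and interfaces directly, so any failure in xlrd surfaces through pandas as well.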
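The failing step can be reproduced without any Excel file. The sketch below assumes that the bytes xlrd tried to decode were the GB2312-encoded sheet name 趋势分析 (the sheet name used elsewhere in this post); that is an inference, not something the traceback states. The helper name `try_decode` is made up for illustration.

```python
# Assumption: these bytes stand in for what xlrd read from the file.
raw = "趋势分析".encode("gb2312")


def try_decode(data, encoding):
    """Return the decoded text, or None if the codec cannot decode the bytes."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


first_byte = hex(raw[0])               # compare with the byte named in the error message
as_ascii = try_decode(raw, "ascii")    # ASCII only covers bytes 0x00-0x7f
as_gb2312 = try_decode(raw, "gb2312")  # a codec that matches the data

print(first_byte, as_ascii, as_gb2312)
```

The first byte of the encoded name is 0xc7, the same byte named in the error message, which is consistent with the sheet name being the string that failed to decode.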

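Updating xlrd did not solve the problem. The real root cause is the unusual encoding of this particular Excel file: other Excel files import without trouble, but the file downloaded from 全景 does not. So what can you do if you still need to load this file into Python directly with pandas?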

Here are the approaches I tried.

Option 1. Update the xlrd library. Failed: the problem was not solved.

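Option 2. Add encoding="GB2312" to the pd.read_excel call. Failed: the same error is still raised.

This is because pd.read_excel does not support an encoding parameter.
Both of the following commands give the same result.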

data=pd.read_excel(path,skiprows=4,skipfooter=2,encoding="GB2312",usecols=[1,2])

data=pd.read_excel(path,skiprows=4,skipfooter=2,encoding_override="GB2312",usecols=[1,2])
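One way to check whether a function really supports a keyword, rather than just tolerating it, is to inspect its signature. The sketch below is a generic helper; `keyword_status` and the two stand-in functions are hypothetical names for illustration and are not the real pandas or xlrd functions. A call that accepts an unknown keyword without complaint, as the two calls above evidently did, is what a catch-all `**kwargs` parameter would produce.

```python
import inspect


def keyword_status(func, name):
    """Classify how func treats keyword `name`: 'named', 'catch-all' or 'rejected'."""
    params = inspect.signature(func).parameters
    named_kinds = (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                   inspect.Parameter.KEYWORD_ONLY)
    param = params.get(name)
    if param is not None and param.kind in named_kinds:
        return "named"
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return "catch-all"
    return "rejected"


# Hypothetical stand-ins, NOT the real pandas or xlrd functions.
def reader_with_kwargs(io, sheet_name=0, **kwds):
    return io


def opener_with_override(filename, encoding_override=None):
    return filename


status_a = keyword_status(reader_with_kwargs, "encoding")
status_b = keyword_status(opener_with_override, "encoding_override")
status_c = keyword_status(opener_with_override, "encoding")
print(status_a, status_b, status_c)
```

Running `keyword_status(pd.read_excel, "encoding")` in your own environment shows which case applies to the pandas version you have installed.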

Option 3. Pass a different IO object instead of the Excel file path as the first argument. Failed: the problem was not solved.

pd.read_excel(open(path, 'rb',encoding="GB2312"),sheet_name='趋势分析')

In [337]: pd.read_excel(open(path, 'rb',encoding="GB2312"),sheet_name='趋势分析')

Traceback (most recent call last):

File "<ipython-input-337-a3adbf5d0467>", line 1, in <module>

pd.read_excel(open(path, 'rb',encoding="GB2312"),sheet_name='趋势分析')

ValueError: binary mode doesn't take an encoding argument
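The ValueError comes from Python's built-in open, not from pandas or xlrd: an encoding only applies in text mode, and a binary handle simply returns raw bytes. The sketch below reproduces this with a temporary file holding stand-in bytes (an assumption; a real .xls file behaves the same way at this level).

```python
import io
import os
import tempfile

# Stand-in bytes; a real .xls file would be read the same way.
payload = "趋势分析".encode("gb2312")

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name

try:
    try:
        open(tmp_path, "rb", encoding="GB2312")
        error_message = None
    except ValueError as exc:
        error_message = str(exc)

    # Without the encoding argument, binary mode returns the bytes untouched.
    with open(tmp_path, "rb") as fh:
        buffer = io.BytesIO(fh.read())
finally:
    os.remove(tmp_path)

print(error_message)
print(buffer.getvalue() == payload)
```

Dropping the encoding argument removes the ValueError, but the buffer still contains the same undecoded bytes, so the decoding decision is still made inside xlrd.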

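Option 4. Pass a different IO object instead of the Excel file path as the first argument, this time a workbook opened by xlrd with encoding_override. Succeeded.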

import xlrd

data=pd.read_excel(xlrd.open_workbook(path,encoding_override="GB2312"),sheet_name='趋势分析',skiprows=4,skipfooter=2)
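If the right encoding is not known in advance, the same idea can be wrapped in a retry loop. This is only a sketch: `open_workbook_with_fallback`, its `opener` parameter and `fake_opener` are hypothetical names introduced here, and the candidate list is an assumption. The `opener` parameter exists so that the loop can be run without xlrd or a real file; by default it uses xlrd.open_workbook with encoding_override, as in Option 4.

```python
def open_workbook_with_fallback(path, encodings=("gb2312", "gbk", "gb18030"), opener=None):
    """Try each candidate encoding_override until the workbook opens.

    Returns (workbook, encoding_used).
    """
    if opener is None:
        import xlrd  # imported lazily; requires xlrd to be installed
        opener = xlrd.open_workbook
    last_error = None
    for enc in encodings:
        try:
            return opener(path, encoding_override=enc), enc
        except (UnicodeDecodeError, LookupError) as exc:
            last_error = exc
    if last_error is None:
        raise ValueError("no candidate encodings given")
    raise last_error


# Hypothetical stand-in for xlrd.open_workbook, used only for this demo.
def fake_opener(path, encoding_override=None):
    b"\xc7\xf7".decode(encoding_override or "ascii")
    return "workbook:" + path


book, used = open_workbook_with_fallback(
    "demo.xls", encodings=("ascii", "gb2312"), opener=fake_opener
)
print(book, used)
```

With a real file, the returned workbook can be passed to pd.read_excel as in Option 4. Note that a wrong encoding can sometimes decode without raising, so it is worth checking the resulting sheet names by eye.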

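Following the error message above, I also looked through some of the .py source files of pandas and xlrd. The full traceback is reproduced below.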

*** No CODEPAGE record, no encoding_override: will use 'ascii'

Traceback (most recent call last):

File "<ipython-input-272-d2ee17e1f122>", line 1, in <module>

data=pd.read_excel("D:\\360安全浏览器下载\\PGC_问答站-趋势分析(2020-10-21至2020-11-19).xls",skiprows=4,skipfooter=2,encoding="gbk",usecols=[1,2])

File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\util\_decorators.py", line 208, in wrapper

return func(*args, **kwargs)

File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\excel\_base.py", line 310, in read_excel

io = ExcelFile(io, engine=engine)

File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\excel\_base.py", line 819, in __init__

self._reader = self._engines[engine](self._io)

File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\excel\_xlrd.py", line 21, in __init__

super().__init__(filepath_or_buffer)

File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\excel\_base.py", line 359, in __init__

self.book = self.load_workbook(filepath_or_buffer)

File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\excel\_xlrd.py", line 36, in load_workbook

return open_workbook(filepath_or_buffer)

File "C:\ProgramData\Anaconda3\lib\site-packages\xlrd\__init__.py", line 157, in open_workbook

ragged_rows=ragged_rows,

File "C:\ProgramData\Anaconda3\lib\site-packages\xlrd\book.py", line 117, in open_workbook_xls

bk.parse_globals()

File "C:\ProgramData\Anaconda3\lib\site-packages\xlrd\book.py", line 1223, in parse_globals

self.handle_externsheet(data)

File "C:\ProgramData\Anaconda3\lib\site-packages\xlrd\book.py", line 913, in handle_externsheet

sheet_name = unicode(data[2:nc+2], self.encoding)

File "C:\ProgramData\Anaconda3\lib\site-packages\xlrd\timemachine.py", line 31, in <lambda>

unicode = lambda b, enc: b.decode(enc)

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc7 in position 0: ordinal not in range(128)

The last frame of the traceback is in xlrd's timemachine.py file.

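Next, look at book.py in the xlrd library. There you can see that the encoding is set from the encoding and encoding_override values, in the method

derive_encoding

("derive" here means to obtain or work out the encoding.)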
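The decision can be pictured with a simplified model. The sketch below is reconstructed from the log line "No CODEPAGE record, no encoding_override: will use 'ascii'" and is not copied from book.py; the function name `choose_encoding` and the small codepage table are assumptions made for illustration.

```python
# Simplified model only; NOT xlrd's actual source.
# Assumption: a one-entry codepage table, just to show the lookup step.
CODEPAGE_TO_ENCODING = {1200: "utf_16_le"}


def choose_encoding(codepage=None, encoding_override=None):
    """Pick the codec used to decode byte strings in a workbook."""
    if encoding_override:
        return encoding_override      # the caller's choice wins
    if codepage is None:
        return "ascii"                # nothing to go on, as in the log line
    return CODEPAGE_TO_ENCODING.get(codepage, "cp" + str(codepage))


sheet_name_bytes = "趋势分析".encode("gb2312")

fallback = choose_encoding()
overridden = choose_encoding(encoding_override="GB2312")
decoded = sheet_name_bytes.decode(overridden)

print(fallback, overridden, decoded)
```

In this model, a file without a CODEPAGE record falls back to ASCII unless the caller supplies encoding_override, which matches why Option 4 works while the plain call fails.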

Then run help(xlrd.open_workbook) to read the description of this function's parameters.


Running the commands below produces exactly the same error.

import xlrd

wb = xlrd.open_workbook("D:\\360安全浏览器下载\\PGC_问答站-趋势分析(2020-10-21至2020-11-19).xls")