
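The \ufeff character at the start of a file
While translating a .po file I got an error saying the file was invalid. I eventually traced it to an extra \ufeff character at the very beginning of the file.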
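U+FEFF is the byte order mark (BOM). With UTF-16, for example, if the receiver sees the BOM as FE FF, the byte stream is big-endian; if it sees FF FE, the byte stream is little-endian.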
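As a rough illustration of the UTF-16 rule above, here is a minimal sketch that inspects the first two bytes of a byte stream. The function name `utf16_byte_order` is made up for this example, and the sketch deliberately ignores UTF-32, whose little-endian BOM also begins with FF FE.

```python
import codecs


def utf16_byte_order(data: bytes) -> str:
    """Return 'big', 'little', or 'unknown' based on a leading UTF-16 BOM.

    Hypothetical helper for illustration only; it does not handle UTF-32.
    """
    if data.startswith(codecs.BOM_UTF16_BE):  # b'\xfe\xff'
        return "big"
    if data.startswith(codecs.BOM_UTF16_LE):  # b'\xff\xfe'
        return "little"
    return "unknown"


sample = "试试编码"
big = codecs.BOM_UTF16_BE + sample.encode("utf-16-be")
little = codecs.BOM_UTF16_LE + sample.encode("utf-16-le")

print(utf16_byte_order(big))
print(utf16_byte_order(little))
print(utf16_byte_order(sample.encode("utf-8")))
```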
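UTF-8 does not need a BOM to indicate byte order, but a BOM can still be used to signal "I am UTF-8 encoded". The UTF-8 encoding of the BOM is EF BB BF (you can see it by opening the file in UltraEdit and switching to hex mode). So a receiver that gets a byte stream starting with EF BB BF knows the stream is UTF-8.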
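The same check can be done without a hex editor. The sketch below assumes nothing beyond the standard library; `has_utf8_bom` is a name invented for this example. It also shows where the stray \ufeff comes from: the plain `utf-8` codec decodes EF BB BF into the character U+FEFF and keeps it, while `utf-8-sig` drops it.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


# Encoding with 'utf-8-sig' prepends the BOM to the output bytes.
raw = "试试编码".encode("utf-8-sig")

print(" ".join(f"{b:02X}" for b in raw[:3]))  # first three bytes in hex
print(has_utf8_bom(raw))
print(repr(raw.decode("utf-8")))      # BOM survives as a leading '\ufeff'
print(repr(raw.decode("utf-8-sig")))  # BOM is stripped
```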
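From this analysis, the problem was the file's encoding. Opening the file in Notepad on Windows and saving it again with "Save As" solved the problem.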
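If re-saving by hand is inconvenient, the BOM can also be removed with a script. This is only a sketch of one possible approach, not the method used above: `strip_utf8_bom` is a hypothetical helper, and the demo runs on a throwaway file named `demo.po` rather than on a real translation file.

```python
import codecs
import os
import tempfile


def strip_utf8_bom(path) -> bool:
    """Remove a leading UTF-8 BOM from the file at `path`, if present.

    Returns True when a BOM was found and removed, False otherwise.
    Hypothetical helper; it rewrites the file in place.
    """
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    with open(path, "wb") as f:
        f.write(data[len(codecs.BOM_UTF8):])
    return True


# Demo on a temporary file so nothing real is modified.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.po")
    with open(demo, "w", encoding="utf-8-sig") as f:  # writes a BOM first
        f.write('msgid "hello"\n')
    print(strip_utf8_bom(demo))  # a BOM was present on the first call
    print(strip_utf8_bom(demo))  # nothing left to strip on the second call
```

Working on raw bytes keeps the rest of the file untouched, including its line endings.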
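# coding=utf-8
# Case 1: a UTF-8 file that starts with a BOM, read with the plain utf-8 codec.
with open("aa.po", "r", encoding="utf-8") as f:
    file = f.read()
file1 = file.split(",")
print(file1)  # the BOM is kept as a leading '\ufeff' character
# Re-encode and decode with utf-8-sig to strip the leading BOM.
file2 = file.encode("utf-8").decode("utf-8-sig")
print(file2)
Output:
['\ufeff试试编码']
试试编码
Process finished with exit code 0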
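# coding=utf-8
# Case 2: a UTF-8 file without a BOM.
with open("aautf8.txt", "r", encoding="utf-8") as f:
    file = f.read()
file1 = file.split(",")
print(file1)  # no '\ufeff' this time
# utf-8-sig leaves the text unchanged when there is no BOM.
file2 = file.encode("utf-8").decode("utf-8-sig")
print(file2)
Output:
['试试编码']
试试编码
Process finished with exit code 0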
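The two runs above suggest a simpler option: open the file with `encoding="utf-8-sig"` in the first place, which should accept UTF-8 files both with and without a BOM. The sketch below uses temporary files and a made-up helper name, `read_utf8`, to compare the two cases.

```python
import os
import tempfile


def read_utf8(path) -> str:
    """Read a UTF-8 text file, dropping a leading BOM if one is present."""
    with open(path, "r", encoding="utf-8-sig") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    with_bom = os.path.join(tmp, "with_bom.txt")
    without_bom = os.path.join(tmp, "without_bom.txt")

    # Write the same text twice: once with a BOM, once without.
    with open(with_bom, "wb") as f:
        f.write("试试编码".encode("utf-8-sig"))
    with open(without_bom, "wb") as f:
        f.write("试试编码".encode("utf-8"))

    text_a = read_utf8(with_bom)
    text_b = read_utf8(without_bom)
    print(repr(text_a), repr(text_b), text_a == text_b)
```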
# coding=utf-8
# Case 3: a file saved as ANSI, read with the utf-8 codec. Decoding fails.
with open("aaansi.txt", "r", encoding="utf-8") as f:
    file = f.read()
file1 = file.split(",")
print(file1)
file2 = file.encode("utf-8").decode("utf-8-sig")
print(file2)
Output:
Traceback (most recent call last):
  File "D:/odoo141229/调试/filebm.py", line 5, in <module>
    file = f.read()
  File "D:odsoftpython37libcodecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xca in position 0: invalid continuation byte
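One way to cope with a file like this is to try UTF-8 first and fall back to a legacy code page. The sketch below assumes the "ANSI" file was saved as GBK, which is what ANSI usually means on Chinese-language Windows. The 0xca in the error is consistent with that, since 试 is CA D4 in GBK, but it remains an assumption. `decode_with_fallback` is a hypothetical helper, not part of any library.

```python
def decode_with_fallback(data: bytes, encodings=("utf-8-sig", "gbk")):
    """Try each candidate encoding in order; return (text, encoding_used).

    The candidate list is an assumption for this example. A permissive
    legacy codec can decode unrelated bytes into wrong characters, so put
    strict encodings such as UTF-8 first.
    """
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


utf8_bytes = "试试编码".encode("utf-8-sig")  # UTF-8 with BOM
ansi_bytes = "试试编码".encode("gbk")        # stands in for the ANSI file

print(decode_with_fallback(utf8_bytes))
print(decode_with_fallback(ansi_bytes))
```

For a real file, read it in binary mode (`open(path, "rb")`) and pass the bytes to the helper.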