python - "Unicode error" when reading a file


This is my first post on here, so I hope it isn't in the wrong topic or something. I've run into an unusual problem with a Python app I'm writing.

Basically, I'm trying to read a text file and insert part of it into a tkinter Text widget. The text file contains the usual "\n" line breaks, but when I run the code I get a bizarre error that I haven't been able to cook up a workaround for:

(Btw, sorry for the lousy set-up here... I'm not sure how to work the new code-entering system; it seems to "play by its own rules" and have its own syntax, so I just copied/pasted below.)

    Exception in Tkinter callback
    Traceback (most recent call last):
      File "c:\python33\lib\idlelib\run.py", line 107, in main
        seq, request = rpc.request_queue.get(block=True, timeout=0.05)
      File "c:\python33\lib\queue.py", line 175, in get
        raise Empty
    queue.Empty

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "c:\python33\lib\tkinter\__init__.py", line 1442, in __call__
        return self.func(*args)
      File "c:\users\owner\desktop\python projects\the ultimate joke book.py", line 89, in search
        results.create()
      File "c:\users\owner\desktop\python projects\the ultimate joke book.py", line 31, in create
        joke = linecache.getline('jokes/jokelist.txt',x)
      File "c:\python33\lib\linecache.py", line 15, in getline
        lines = getlines(filename, module_globals)
      File "c:\python33\lib\linecache.py", line 41, in getlines
        return updatecache(filename, module_globals)
      File "c:\python33\lib\linecache.py", line 127, in updatecache
        lines = fp.readlines()
      File "c:\python33\lib\codecs.py", line 300, in decode
        (result, consumed) = self._buffer_decode(data, self.errors, final)
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbf in position 627: invalid start byte

So the function that caused the problem -- "linecache.getline", used in a for loop -- works fine when there is no "\" in the text, but for whatever reason it doesn't like "\" and starts spittin' errors. : /

So tonight I've spent an hour on the "docs" (http://docs.python.org/3/howto/unicode.html), reading up on the history and basic concept of Unicode. It's loaded with assumed knowledge, and while it was informative and helpful on a concept-only level, it didn't seem to offer much in terms of practical information and potential solutions.

The only solution I can come up with to defeat this annoying little bug is to use "/n" instead and programmatically split the strings into an array (or "list", as they seem to be called in Python), then use a loop to break them into more than one line... but that sounds like a lot of unnecessary steps if there is a common workaround in existence. I'd appreciate any insights on how to solve this particularly mysterious problem.

Thanks.

The data that the UTF-8 decoder has been given is not UTF-8. That's why you get the error. You need to give us the code that fails and some data examples to explain what is happening.
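To see the same failure in isolation, here is a minimal sketch. The sample bytes and the `try_decode` helper are made up for illustration; they are not taken from the asker's file. A lone 0xBF byte is a UTF-8 continuation byte, so it can never begin a character, which is what "invalid start byte" means.

```python
# Hypothetical sample: the bytes a CP-1252 editor would save for an
# inverted question mark, "Qu", an e-acute, and " tal?".
raw = b"\xbfQu\xe9 tal?"


def try_decode(data, encoding):
    """Return the decoded text, or a short failure message."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "failed: {}".format(exc)


# Decoding as UTF-8 fails on the very first byte (0xbf).
as_utf8 = try_decode(raw, "utf-8")

# Decoding as CP-1252 succeeds and yields the Spanish text.
as_cp1252 = try_decode(raw, "cp1252")
```

The bytes are identical in both calls; only the codec used to interpret them differs.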

The character in question is "¿" in Latin-1 and CP-1252. Perhaps it's Spanish text written on a Windows machine? In any case, specify the encoding when opening the file.
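Since `linecache.getline` takes no encoding argument, one way to follow this advice is to read the file yourself with `open(..., encoding=...)`. The sketch below assumes the file really is CP-1252, which is only a guess based on the 0xBF byte. The `get_line` helper and the throwaway demo file are hypothetical stand-ins for the asker's code and `jokes/jokelist.txt`.

```python
import os
import tempfile


def get_line(path, lineno, encoding="cp1252"):
    """Return line `lineno` (1-based) of `path`, or '' if out of range.

    Mirrors what linecache.getline returns, but decodes the file with an
    explicit encoding (assumed here to be CP-1252).
    """
    with open(path, encoding=encoding) as fp:
        lines = fp.readlines()
    if 1 <= lineno <= len(lines):
        return lines[lineno - 1]
    return ""


# Demo with a throwaway file holding hypothetical CP-1252 bytes.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"first joke\n\xbfSecond joke?\n")
sample_path = tmp.name

second = get_line(sample_path, 2)    # line 2, decoded as CP-1252
missing = get_line(sample_path, 99)  # out of range, so ''
os.remove(sample_path)
```

The returned string can then go into the widget in the usual way, e.g. `text_widget.insert("end", joke)`, where `text_widget` stands in for the asker's tkinter Text widget. If the file turns out to be in some other encoding, pass that name instead.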

