Issue1483

classification

Title:	optparse std module dies on non-ASCII unicode data
Type:	behaviour	Severity:	normal
Components:	Library	Versions:	2.5.1
		Milestone:

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:		Nosy List:	dvska, pjenvey
Priority:		Keywords:

Created on 2009-10-01.16:50:56 by dvska, last changed 2010-04-11.17:37:48 by pjenvey.

Files
File name	Uploaded	Description
optparse_index_out_of_range_0_test.py	dvska, 2009-10-01.16:50:55	tiny test

Messages
msg5215	Author: dvska (dvska)	Date: 2009-10-01.16:50:55
Please run the attached file. The result is:

Traceback (most recent call last):
  File "optparse_index_out_of_range_0_test.py", line 19, in <module>
    parser.print_help()
  File "C:\jython25\Lib\optparse.py", line 1657, in print_help
    file.write(self.format_help().encode(encoding, "replace"))
  File "C:\jython25\Lib\optparse.py", line 1637, in format_help
    result.append(self.format_option_help(formatter))
  File "C:\jython25\Lib\optparse.py", line 1617, in format_option_help
    result.append(OptionContainer.format_option_help(self, formatter))
  File "C:\jython25\Lib\optparse.py", line 1066, in format_option_help
    result.append(formatter.format_option(option))
  File "C:\jython25\Lib\optparse.py", line 303, in format_option
    result.append("%*s%s\n" % (indent_first, "", help_lines[0]))
IndexError: index out of range: 0
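The attached test file is not reproduced in this thread, so the following is only a sketch of the kind of script that would walk the same code path. The `--reason` option, the `demo` program name, and the fixed formatter width are assumptions made for illustration; the help text is borrowed from msg5629. It calls `format_help()` instead of `print_help()` to avoid depending on the console encoding.

```python
from optparse import OptionParser, IndentedHelpFormatter

# Non-ASCII help text (Cyrillic), borrowed from msg5629.
HELP_TEXT = (u'\u0443\u043a\u0430\u0437\u0430\u0442\u044c '
             u'\u043f\u0440\u0438\u0447\u0438\u043d\u0443 '
             u'\u0438\u0437\u043c\u0435\u043d\u0435\u043d\u0438\u044f')


def build_parser():
    # Hypothetical parser: the option name and program name are made up.
    # A fixed width keeps the layout independent of the COLUMNS variable.
    formatter = IndentedHelpFormatter(width=78)
    parser = OptionParser(prog='demo', formatter=formatter)
    parser.add_option('-r', '--reason', dest='reason', help=HELP_TEXT)
    return parser


# format_help() goes through format_option(), the frame that raised above.
help_text = build_parser().format_help()
print(HELP_TEXT in help_text)
```

On an unaffected interpreter the help text should appear intact in the formatted output; on the affected Jython build this path would be expected to end in the IndexError shown above.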
msg5629	Author: Philip Jenvey (pjenvey)	Date: 2010-04-04.18:38:03
The problem here is that textwrap.wrap doesn't handle the unicode input correctly:

Python 2.5.4 (r254:67916, Jul 7 2009, 23:51:24)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import textwrap
>>> text = u'\u0443\u043a\u0430\u0437\u0430\u0442\u044c \u043f\u0440\u0438\u0447\u0438\u043d\u0443 \u0438\u0437\u043c\u0435\u043d\u0435\u043d\u0438\u044f'
>>> textwrap.wrap(text, 54)
[u'\u0443\u043a\u0430\u0437\u0430\u0442\u044c \u043f\u0440\u0438\u0447\u0438\u043d\u0443 \u0438\u0437\u043c\u0435\u043d\u0435\u043d\u0438\u044f']

Jython 2.5.1+ (trunk:6995:6999M, Apr 4 2010, 11:01:22)
[Java HotSpot(TM) 64-Bit Server VM (Apple Inc.)] on java1.6.0_17
Type "help", "copyright", "credits" or "license" for more information.
>>> import textwrap
>>> text = u'\u0443\u043a\u0430\u0437\u0430\u0442\u044c \u043f\u0440\u0438\u0447\u0438\u043d\u0443 \u0438\u0437\u043c\u0435\u043d\u0435\u043d\u0438\u044f'
>>> textwrap.wrap(text, 54)
[]

textwrap relies heavily on regular expressions, so I'm going to guess this is a bug in the re module in dealing with unicode input.
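The empty list is what turns into the IndexError in the original traceback: `format_option` indexes `help_lines[0]` without checking that `textwrap.wrap` returned anything. A minimal sketch of that step follows; `format_first_help_line` is a hypothetical helper that mirrors the failing line, not a function in optparse. Whitespace-only input is used to provoke an empty result on an unaffected interpreter.

```python
import textwrap

text = (u'\u0443\u043a\u0430\u0437\u0430\u0442\u044c '
        u'\u043f\u0440\u0438\u0447\u0438\u043d\u0443 '
        u'\u0438\u0437\u043c\u0435\u043d\u0435\u043d\u0438\u044f')


def format_first_help_line(help_text, help_width, indent):
    # Mirrors the failing line in optparse.format_option (hypothetical helper).
    help_lines = textwrap.wrap(help_text, help_width)
    # An empty help_lines makes the [0] lookup raise IndexError.
    return u"%*s%s\n" % (indent, u"", help_lines[0])


first_line = format_first_help_line(text, 54, 24)

# Whitespace-only text wraps to an empty list, giving the same failure mode.
try:
    format_first_help_line(u'   ', 54, 24)
    raised = False
except IndexError:
    raised = True
```

With a working `textwrap.wrap`, `first_line` is the help text indented by 24 columns, and only the whitespace-only call raises.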
msg5660	Author: Philip Jenvey (pjenvey)	Date: 2010-04-11.17:37:48
The problem was actually in the unicode.translate method. This was fixed in r7017, thanks
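The thread does not say how unicode.translate misbehaved or how it connects to textwrap. One plausible link, stated here as an assumption, is that CPython's textwrap replaces whitespace in unicode input by calling `translate` with a table mapping whitespace ordinals to the ordinal of a space. The sketch below builds a table in that style and checks the behaviour one would expect from a correct implementation: characters missing from the table pass through unchanged.

```python
import string

expected = (u'\u0443\u043a\u0430\u0437\u0430\u0442\u044c '
            u'\u043f\u0440\u0438\u0447\u0438\u043d\u0443 '
            u'\u0438\u0437\u043c\u0435\u043d\u0435\u043d\u0438\u044f')

# Same words, but separated by a tab and a newline instead of spaces.
sample = (u'\u0443\u043a\u0430\u0437\u0430\u0442\u044c\t'
          u'\u043f\u0440\u0438\u0447\u0438\u043d\u0443\n'
          u'\u0438\u0437\u043c\u0435\u043d\u0435\u043d\u0438\u044f')

# Table in the style assumed above: whitespace ordinal -> space ordinal.
whitespace_table = dict((ord(c), ord(u' ')) for c in string.whitespace)

# Non-ASCII characters are absent from the table and must be left as-is.
translated = sample.translate(whitespace_table)
print(translated == expected)
```

If `translate` dropped or mangled characters that are absent from the table, the wrapped result could end up empty, which would be consistent with the `[]` reported in msg5629.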

History
Date	User	Action	Args
2010-04-11 17:37:49	pjenvey	set	status: open -> closed resolution: fixed messages: + msg5660
2010-04-04 18:38:03	pjenvey	set	nosy: + pjenvey messages: + msg5629
2009-10-01 16:50:56	dvska	create