Message7987
In Jython, byte-strings (str objects) can contain elements that aren't bytes. The problem is easily reproduced, like this:
$ jython
Jython 2.5.2 (Debian:hg/91332231a448, May 8 2012, 09:50:46)
[OpenJDK 64-Bit Server VM (Sun Microsystems Inc.)] on java1.6.0_27
>>> import java
>>> foo = str(java.lang.String(u"\u1234"))
>>> print foo
?
>>> foo
'\u1234'
I can't say I know what the proper solution to this problem would be, but it seems strange that byte-strings should be able to contain non-byte elements.
It also seems like a bug in itself that the repr() of such an object does not reproduce the same object when eval'ed:
>>> eval(repr(foo))
'\\u1234'
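For comparison, here is a minimal sketch of the round-trip property I would expect, checked against a well-formed byte-string in ordinary Python. The helper name `roundtrips` is made up for illustration; it is not part of Jython or CPython.

```python
def roundtrips(value):
    """Return True if eval(repr(value)) reproduces an equal object."""
    return eval(repr(value)) == value

# A well-formed byte-string containing every possible byte value (0..255).
well_formed = bytes(bytearray(range(256)))

print(roundtrips(well_formed))
```

With the `foo` object from the session above, the same check would presumably fail, since eval(repr(foo)) gives back a six-character string rather than the original one-element string.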
It is also worth noting that such strings are poison even to codecs such as Latin-1, which should be able to decode any byte-string without choking:
>>> unicode(foo, "latin1")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'latin-1' codec can't decode byte 0x34 in position 0: ordinal not in range(256)
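To illustrate why Latin-1 should never fail on a real byte-string, here is a small sketch in plain Python: the codec maps byte N to code point N, so every value from 0 to 255 decodes. The variable names are only for illustration.

```python
# Every byte value 0..255, as a byte-string.
all_bytes = bytes(bytearray(range(256)))

# Latin-1 maps byte N to code point N, so decoding cannot fail
# for any input made only of real bytes.
decoded = all_bytes.decode("latin-1")

# Each decoded character has the same ordinal as the byte it came from.
assert [ord(ch) for ch in decoded] == list(range(256))

# Encoding back gives the original byte-string again.
assert decoded.encode("latin-1") == all_bytes
```

The error in the traceback can therefore only happen because the string holds an element outside the 0..255 range.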
Perhaps str() should raise an exception when such objects would be created?
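A rough sketch of the kind of check I have in mind, written as plain Python rather than as a patch to Jython. The function `to_byte_string` is hypothetical, and using Latin-1 for the values that do fit is my own assumption, not a statement of what str() does or should do.

```python
def to_byte_string(text):
    """Convert text to a byte-string, refusing non-byte elements.

    Hypothetical helper: raises ValueError instead of silently
    producing a byte-string with elements above 255.
    """
    for index, ch in enumerate(text):
        if ord(ch) > 255:
            raise ValueError(
                "element %r at position %d does not fit in a byte"
                % (ch, index))
    # Assumption: every remaining element maps directly to one byte.
    return text.encode("latin-1")


try:
    to_byte_string(u"\u1234")
except ValueError as exc:
    print("rejected: %s" % exc)
```

Whether the right exception is ValueError or UnicodeEncodeError is a separate question; the point is only that the conversion fails loudly instead of producing a broken object.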
Date | User | Action | Args
2013-04-06 03:02:20 | Dolda2000 | set | recipients: + Dolda2000
2013-04-06 03:02:20 | Dolda2000 | set | messageid: <1365217340.57.0.797634890825.issue2037@psf.upfronthosting.co.za>
2013-04-06 03:02:20 | Dolda2000 | link | issue2037 messages
2013-04-06 03:02:19 | Dolda2000 | create |