Issue2444

classification
Title: Issue on type coercion between bytes/char/string/int (Bytes class)
Type: behaviour
Severity: major
Components:
Versions: Jython 2.7
Milestone:

process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: lsenta, zyasoft
Priority:
Keywords:

Created on 2015-12-17.11:49:46 by lsenta, last changed 2015-12-17.16:16:17 by zyasoft.

Messages
msg10544 Author: Laurent Senta (lsenta) Date: 2015-12-17.11:49:44
First of all, thanks for maintaining this great project!

On Jython 2.7.0:

I see odd behaviour when using the Bytes class from hbase-common-1.0.0-cdh5.4.5.jar.

>>> qualifier
array('b', [98, 121, 116, 101, 45, 115, 105, 122, 101])
>>> Bytes.toString(qualifier)
u'[B@30f88a41'
>>> Bytes.toString(qualifier, 0, len(qualifier))
u'byte-size'
>>> Bytes.toString(str(qualifier))
u"array('b', [98, 121, 116, 101, 45, 115, 105, 122, 101])"
>>> Bytes.toString(repr(qualifier))
u"array('b', [98, 121, 116, 101, 45, 115, 105, 122, 101])"

The correct result is 'byte-size'; the problem is the '[B@30f88a41' result from the single-argument call.

Documentation: http://archive.cloudera.com/cdh5/cdh/5/hbase-1.0.0-cdh5.4.5/apidocs/index.html

Another thing: if I call
>>> Bytes.toBytes('v')

the toBytes(String) implementation is called.
However, the documentation at https://wiki.python.org/jython/UserGuide#id13
seems to say that 'v' should be coerced into a char and thus call the toBytes(int) implementation.

TBH I rely on 'v' being coerced to a string, so that's not really an issue for me,
but I wonder when coercion into char happens and whether the doc should be updated.


Thanks for taking a look.
msg10545 Author: Jim Baker (zyasoft) Date: 2015-12-17.16:16:16
Most likely the problem with Bytes.toString(byte[]) is caused by #1002, since Bytes.toInt(byte[]), Bytes.toHex(byte[]), etc. work:

>>> import array
>>> qualifier = array.array('b', [98, 121, 116, 101, 45, 115, 105, 122, 101])
>>> qualifier
array('b', [98, 121, 116, 101, 45, 115, 105, 122, 101])
>>> Bytes.toHex(qualifier)
u'627974652d73697a65'

which we can verify:

>>> [hex(b) for b in qualifier]
['0x62', '0x79', '0x74', '0x65', '0x2d', '0x73', '0x69', '0x7a', '0x65']
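
The same check can be done without going through the Java side at all. A minimal sketch in plain Python, assuming only the standard array module; the name hex_string is illustrative and not part of HBase or Jython:

```python
import array

qualifier = array.array('b', [98, 121, 116, 101, 45, 115, 105, 122, 101])

# Mask each signed byte to 0..255 and format it as two lowercase hex digits.
hex_string = "".join("%02x" % (b & 0xFF) for b in qualifier)
```

The resulting string should match the value returned by Bytes.toHex(qualifier) above.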

Workaround for now:

>>> qualifier.tostring() # note case!
'byte-size'
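
If you would rather not depend on the array method name (recent CPython 3 releases spell it tobytes() and no longer provide tostring()), the bytes can also be decoded by hand. A sketch, assuming the qualifier holds UTF-8 text; signed_bytes_to_text is a hypothetical helper, not part of Jython or HBase:

```python
import array

def signed_bytes_to_text(buf, encoding="utf-8"):
    """Decode a sequence of signed bytes (-128..127) into text."""
    # Java bytes are signed, so mask each value to 0..255 before decoding.
    return bytearray(b & 0xFF for b in buf).decode(encoding)

qualifier = array.array('b', [98, 121, 116, 101, 45, 115, 105, 122, 101])
text = signed_bytes_to_text(qualifier)
```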

Regarding the choice between the overloads Bytes.toBytes(java.lang.String) and Bytes.toBytes(char): I don't think the char coercion works when both overloads are available. Perhaps we should make that possible, but that seems like a possible breaking change. So the docs on the wiki should be updated.

However, you can control the specific overloaded method that is selected like so:

>>> from java.lang import Short
>>> Bytes.toBytes(Short(ord("v")))
array('b', [0, 118])
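
The [0, 118] result is consistent with a two-byte big-endian encoding of the value 118. A sketch that reproduces those bytes with the standard struct module; it only mirrors the output shown above and is not HBase's implementation:

```python
import struct

# ">h" packs a signed 16-bit integer in big-endian byte order.
packed = struct.pack(">h", ord("v"))

# Convert to a list of byte values for comparison with the Jython output.
byte_values = list(bytearray(packed))
```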

You could also try passing a java.lang.Character, but this specific library has no overload for Character. Nor is there an implicit conversion from char to short in Jython, although that would be problematic anyway, given that char is unsigned and short is signed.

>>> from java.lang import Character
>>> Bytes.toBytes(Character("v"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: toBytes(): 1st arg can't be coerced to boolean, long, int, double, short, java.nio.ByteBuffer, float, java.math.BigDecimal, String
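
To illustrate the signedness problem: a Java char covers 0 to 65535, while a short covers -32768 to 32767, so the upper half of the char range has no short equivalent. A small sketch in plain Python; fits_signed_short is a hypothetical helper used only for this illustration:

```python
def fits_signed_short(value):
    """Return True if value is representable as a signed 16-bit integer."""
    return -0x8000 <= value <= 0x7FFF

# ord("v") is 118, which fits comfortably in a short.
ascii_ok = fits_signed_short(ord("v"))

# 0xFFFF is the largest char value and lies outside the short range.
high_char_ok = fits_signed_short(0xFFFF)
```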

History
Date                 User     Action  Args
2015-12-17 16:16:17  zyasoft  set     nosy: + zyasoft; messages: + msg10545
2015-12-17 11:49:46  lsenta   create