Issue2429

classification
Title: cStringIO does not work with mutable objects implementing the buffer protocol
Type: Severity: normal
Components: Versions:
Milestone: Jython 2.7.2
process
Status: open Resolution: accepted
Dependencies: Superseder:
Assigned To: zyasoft Nosy List: pekka.klarck, zyasoft
Priority: high Keywords: patch

Created on 2015-11-12.23:53:10 by pekka.klarck, last changed 2016-02-15.18:52:21 by zyasoft.

Files
File name Uploaded Description
cStringIO-copy-arrays.diff zyasoft, 2016-02-15.18:49:55
Messages
msg10467 Author: Pekka Klärck (pekka.klarck) Date: 2015-11-12.23:53:08
To reproduce:

1) First start XML-RPC server:

Jython 2.7.0 (default:9987c746f838, Apr 29 2015, 02:25:11) 
[Java HotSpot(TM) 64-Bit Server VM (Oracle Corporation)] on java1.8.0_66
Type "help", "copyright", "credits" or "license" for more information.
>>> from SimpleXMLRPCServer import SimpleXMLRPCServer
>>> server = SimpleXMLRPCServer(('127.0.0.1', 12345))
>>> def func(arg):
...   return repr(arg.data)
... 
>>> server.register_function(func)
>>> server.serve_forever()

2) Use the server with Binary wrapping bytes (i.e. str) and bytearray:

Jython 2.7.0 (default:9987c746f838, Apr 29 2015, 02:25:11) 
[Java HotSpot(TM) 64-Bit Server VM (Oracle Corporation)] on java1.8.0_66
Type "help", "copyright", "credits" or "license" for more information.
>>> import xmlrpclib
>>> proxy = xmlrpclib.ServerProxy('http://127.0.0.1:12345')
>>> proxy.func(xmlrpclib.Binary(b'foo'))    # works fine
"'foo'"
>>> proxy.func(xmlrpclib.Binary(bytearray(b'foo')))    # bang!!
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/peke/Prog/jython2.7.0/Lib/xmlrpclib.py", line 1224, in __call__
    return self.__send(self.__name, args)
  File "/home/peke/Prog/jython2.7.0/Lib/xmlrpclib.py", line 1571, in _ServerProxy__request
    request = dumps(params, methodname, encoding=self.__encoding,
  File "/home/peke/Prog/jython2.7.0/Lib/xmlrpclib.py", line 1085, in dumps
    data = m.dumps(params)
  File "/home/peke/Prog/jython2.7.0/Lib/xmlrpclib.py", line 632, in dumps
    dump(v, write)
  File "/home/peke/Prog/jython2.7.0/Lib/xmlrpclib.py", line 654, in _Marshaller__dump
    f(self, value, write)
  File "/home/peke/Prog/jython2.7.0/Lib/xmlrpclib.py", line 752, in dump_instance
    value.encode(self)
  File "/home/peke/Prog/jython2.7.0/Lib/xmlrpclib.py", line 506, in encode
    base64.encode(StringIO.StringIO(self.data), out)
TypeError: StringIO(): 1st arg can't be coerced to String, org.python.core.PyArray
msg10468 Author: Pekka Klärck (pekka.klarck) Date: 2015-11-12.23:55:27
A workaround is to convert the bytearray to bytes before wrapping it, e.g. `xmlrpclib.Binary(bytes(bytearray(b'foo')))`.
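
The workaround can be sketched as a small helper. This is only an illustration: `to_binary` is a hypothetical name, and the fallback import of `xmlrpc.client` is an assumption added so the snippet also runs on Python 3. It marshals the request locally with `xmlrpclib.dumps`, so no server is needed to try it.

```python
# Assumption: fall back to xmlrpc.client so the sketch also runs on Python 3.
try:
    import xmlrpclib
except ImportError:
    import xmlrpc.client as xmlrpclib


def to_binary(data):
    """Hypothetical helper: copy bytes-like data into an immutable
    bytes object before wrapping it in Binary."""
    return xmlrpclib.Binary(bytes(data))


payload = to_binary(bytearray(b'foo'))

# Marshal the call locally; this is the step that raised TypeError
# in the traceback above when Binary wrapped a bytearray directly.
request = xmlrpclib.dumps((payload,), 'func')
```

The copy means later changes to the original bytearray are not reflected in the request, which is fine for a one-shot RPC call.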
msg10720 Author: Jim Baker (zyasoft) Date: 2016-02-04.01:51:19
Changed the title to reflect the underlying root cause.
msg10742 Author: Jim Baker (zyasoft) Date: 2016-02-15.18:49:55
We are going to have to defer this to 2.7.2. The basic problem is as follows:

CPython implements the Python types cStringIO.StringI and cStringIO.StringO, with cStringIO.StringIO being a factory function that selects the appropriate type. Jython, in contrast, currently has one common class, StringIO, and allows for mixing input and output by using an underlying StringBuilder. In the most common case, output ("StringO"), there is no distinction in semantics.

But for the "StringI" case, as can be seen with bytearray, we need to implement StringI's support for taking an argument that implements the buffer protocol and for reflecting mutations of the underlying data. Here is what happens in CPython:

$ python
Python 2.7.10 (v2.7.10:15c95b7d81dc, May 23 2015, 09:33:12)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from cStringIO import StringIO
>>> x = bytearray("foobar")
>>> s = StringIO(x)
>>> s.getvalue()
'foobar'
>>> x[0] = "F"
>>> s.getvalue()
'Foobar'
>>> s
<cStringIO.StringI object at 0x1006ba4f8>

In addition, StringI only supports reads, which makes sense since it's tied to the underlying object's backing data via the buffer protocol:

>>> s.write("baz")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'cStringIO.StringI' object has no attribute 'write'

In working out this issue, I did come up with a preliminary patch that adds partial bytearray support by copying the bytearray data. A copy does not track later mutations of the original object, however, so we need to implement buffer protocol semantics instead.
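
The difference between copying and buffer protocol semantics can be shown in plain Python. This is a rough sketch, not the patch itself: `snapshot_of` and `live_view_of` are hypothetical names, and `memoryview` stands in for whatever buffer access the real implementation would use.

```python
def snapshot_of(data):
    """Copy approach: take the bytes once; later changes are not seen."""
    return bytes(data)


def live_view_of(data):
    """Buffer protocol approach: share the caller's storage."""
    return memoryview(data)


x = bytearray(b"foobar")
copied = snapshot_of(x)
view = live_view_of(x)

x[0] = ord("F")  # mutate the original after both wrappers exist

print(repr(copied))          # contents as they were at copy time
print(repr(view.tobytes()))  # contents as they are now
```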

In 2.7.2, we can do this properly: completely rewrite the cStringIO module (a small amount of work, since it is only about 450 lines) and implement the Python types StringI and StringO with correct Python semantics, using Jython's standard expose approach. However, it's too late in the 2.7.1 release cycle to do this work now.
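
As a rough illustration of the target semantics only (the real work would be in Jython's Java implementation), the split could look like the following pure-Python sketch. The class bodies are assumptions made for this example and are heavily simplified; only the names StringI, StringO and the StringIO factory come from CPython.

```python
class StringI(object):
    """Simplified read-only reader over the caller's buffer.
    It deliberately has no write method."""

    def __init__(self, source):
        # Share the caller's storage instead of copying it.
        self._view = memoryview(source)
        self._pos = 0

    def getvalue(self):
        return self._view.tobytes()

    def read(self, n=-1):
        data = self._view.tobytes()
        if n is None or n < 0:
            end = len(data)
        else:
            end = min(len(data), self._pos + n)
        chunk = data[self._pos:end]
        self._pos = end
        return chunk


class StringO(object):
    """Simplified writer that owns its storage (reading is omitted here)."""

    def __init__(self):
        self._buf = bytearray()

    def write(self, s):
        self._buf.extend(s)

    def getvalue(self):
        return bytes(self._buf)


def StringIO(source=None):
    """Factory: no argument selects StringO, an argument selects StringI."""
    if source is None:
        return StringO()
    return StringI(source)


x = bytearray(b"foobar")
s = StringIO(x)
x[0] = ord("F")
print(repr(s.getvalue()))  # reflects the mutation of x
```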
History
Date User Action Args
2016-02-15 18:52:21 zyasoft set title: cStringIO does not work with objects implementing the buffer protocol -> cStringIO does not work with mutable objects implementing the buffer protocol
2016-02-15 18:49:57 zyasoft set files: + cStringIO-copy-arrays.diff
keywords: + patch
title: cStringIO does not work with bytearray objects -> cStringIO does not work with objects implementing the buffer protocol
messages: + msg10742
milestone: Jython 2.7.1 -> Jython 2.7.2
2016-02-09 01:09:56 zyasoft set priority: high
assignee: zyasoft
resolution: accepted
milestone: Jython 2.7.1
2016-02-04 01:51:20 zyasoft set nosy: + zyasoft
messages: + msg10720
title: Sending bytearray wrapped to xmlrpc.Binary over XML-RPC fails -> cStringIO does not work with bytearray objects
2015-11-12 23:55:27 pekka.klarck set messages: + msg10468
2015-11-12 23:53:10 pekka.klarck create