Issue1754

classification
Title: modjy does not provide appropriate wsgi.input file-like object
Type: Severity: normal
Components: Library Versions: 2.5.2rc
Milestone:
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: amak Nosy List: amak, fwierzbicki, jeff250, pjenvey, sedibr, ssweens
Priority: high Keywords: patch

Created on 2011-06-03.01:26:06 by jeff250, last changed 2012-11-05.23:46:10 by amak.

Files
File name Uploaded Description Edit Remove
test.py jeff250, 2011-06-03.06:36:25 Demonstrative case
issue1754.diff pjenvey, 2012-06-28.19:01:26
unnamed amak, 2012-11-05.00:08:27
Messages
msg6538 (view) Author: Jeffrey Knockel (jeff250) Date: 2011-06-03.01:26:06
By thinly wrapping a Java InputStream for wsgi.input, the wsgi.input's file.read(size) method can unblock and return "early."  Java's InputStream.read() is allowed to return less than the requested # of bytes under non-exceptional/EOF conditions, whereas wsgi.input requires the file.read() method to block until the requested # of bytes are available (except for exceptional/EOF conditions).

From https://code.djangoproject.com/ticket/13756:
----------
PEP 3333 says that wsgi.input is:
> An input stream (file-like object) from which the HTTP request body
> bytes can be read. (The server or gateway may perform reads on-demand
> as requested by the application, or it may pre- read the client's
> request body and buffer it in-memory or on disk, or use any other
> technique for providing such an input stream, according to its
> preference.)

It adds that:

> The semantics of each method are as documented in the Python Library
> Reference

In the Python Library Reference, the semantic of [file-like object].read() is:
> Read at most size bytes from the file (less if the read hits EOF
> before obtaining size bytes).

This means that:
  * WSGI does require a wsgi.input file object that blocks on reads until a buffer of exactly the requested length is returned, or EOF / Content-Length is reached,
  * the bug is in modjy, which does not provide an appropriate file-like object right now.
----------
msg6539 (view) Author: Jeffrey Knockel (jeff250) Date: 2011-06-03.06:36:25
I'm attaching a small demonstrative case that builds a Python file-like object using the same call that modjy does.  The call is passed an InputStream that never returns more than one byte per read().
msg6543 (view) Author: Alan Kennedy (amak) Date: 2011-06-05.12:59:38
I am away on vacation, will look at this when I get back.

A related issue is

https://code.djangoproject.com/ticket/13756

Although I see you've already commented on that.
msg6555 (view) Author: Alan Kennedy (amak) Date: 2011-06-18.14:19:25
I believe that modjy is doing the right thing by falling FileUtil.wrap to wrap the ServletInputStream into a python compatible file.

If the resulting file should not return untl all of the requested bytes are available, then that is a matter for the PyFile object returned from the wrap call.

Need input from pjenvey on this: he is the designer and implementer of the IO subsystem.

If Philip says that modjy has to change its treatment of input, then I will change modjy.
msg6820 (view) Author: Alan Kennedy (amak) Date: 2012-03-19.17:41:33
Philip, any input on this one?
msg7095 (view) Author: Philip Jenvey (pjenvey) Date: 2012-05-15.21:34:47
InputStream.read actually *does* block.

Our file wrapper reads this InputStream via an nio Channel wrapper. It looks like this wrapper will only make one read (and thus not block) if the wrapped InputStream.available() returns 0.

The wrapper reads in 8192 byte sized chunks. I suppose this means the bug would be avoided entirely if Content-Length was smaller than that?

Python 2 *and* 3's socket.makefile definitely block. Since we're more based on Python 3's impl here, the fix should be based on its behavior.

So Python 3's SocketIO (the basis for Python 3's socket.makefile) seems to be the place that does the actual blocking read (in its readinto method). That method apparently specially handles blocking and non-blocking modes. We need to be consider that along with the strange behavior of the nio Channel wrapper for our fix
msg7207 (view) Author: Philip Jenvey (pjenvey) Date: 2012-06-12.22:41:17
Actually I'm wrong about SocketIO being involved here, because we're passed a plain InputStream (so StreamIO is used).

Here's someone else describing the Channel wrapper behavior as 'weird': http://stackoverflow.com/questions/1894727/javas-readablebytechannelimpl-inconsistent-behaviour
msg7212 (view) Author: Jeffrey Knockel (jeff250) Date: 2012-06-13.07:33:19
We wrap with StreamIO, but this wraps with Channels.newChannel(), whose documentation states:

"The resulting channel will not be buffered; it will simply redirect its I/O operations to the given stream. Closing the channel will in turn cause the stream to be closed."

(This is stronger than ReadableByteChannel's general contract, which states that read() "might not fill the buffer, and in fact it might not read any bytes at all.")

Thus, if we pass an InputStream that can return "early" with fewer than the requested bytes (which is allowed by InputStream's read() contract), then the resulting channel from Channels.newChannel() will return "early" too.
msg7272 (view) Author: Philip Jenvey (pjenvey) Date: 2012-06-28.19:01:26
Here's a potential fix. We're replacing the Channels.newChannel wrapper with our own that will do more than one blocking read

The commented out read method doesn't quite do it, we need potentially multiple blocking reads to suffice the entire requested size in some cases

It might be worthwhile to make the initial TRANSFER_SIZE value equal to 'len' there.

I don't know why the the original wrapper is synchronizing the read. Buffers aren't thread safe. The Channels.newChannel wrapper is strange =\
msg7273 (view) Author: Philip Jenvey (pjenvey) Date: 2012-06-28.19:04:29
And the patch obviously needs some cleanup
msg7274 (view) Author: Frank Wierzbicki (fwierzbicki) Date: 2012-06-28.20:02:46
Thanks for the update Philip!
msg7311 (view) Author: Frank Wierzbicki (fwierzbicki) Date: 2012-07-05.16:49:52
I have pushed a potential fix for this - I'll be testing it soon with a colleague at Adconion where this bug is a problem (on Tomcat). If it fixes the Tomcat issue I'll close this - though please double check me if you are nosy on this bug :)
msg7345 (view) Author: Alan Kennedy (amak) Date: 2012-08-02.23:56:29
I see this bug is marked as fixed in the NEWS file.

I will try to add a unit test for it over the weekend.
msg7346 (view) Author: Frank Wierzbicki (fwierzbicki) Date: 2012-08-03.16:26:11
Alan: I integrated test.py, but tomcat still had problems. I have to admit I just used tomcat directly for the final testing and did not attempt to isolate a unit test.
msg7390 (view) Author: Sergio Almeida Dias (sedibr) Date: 2012-08-13.14:33:34
We have tested the issue today and it seems to solve the problem that prevent our program to receive data from POST messages bigger than 2k (approx).

We have tested with version 2.5.3rc1, following recommendation of people from django-jython forum. Our target environment includes: Jython 2.5.3rc1 + Django (1.3) + django-jython (1.3) + Jetty 7

See issue 75 for more details on testing conditions:
http://code.google.com/p/django-jython/issues/detail?id=75
msg7392 (view) Author: Frank Wierzbicki (fwierzbicki) Date: 2012-08-13.16:00:56
Sergio: thanks for the info - I'll leave this issue open for another month or so to see if I or someone else finds the motivation to make a unit test for it.
msg7461 (view) Author: Alan Kennedy (amak) Date: 2012-09-16.16:37:00
In the light of further issues caused by fixing this one, i.e.

Can only read 8192 bytes
http://bugs.jython.org/issue1967

I think the optimal solution is to fix this in modjy, which knows that it is dealing with ServletInputSream, and can special case behaviour accordingly.
msg7501 (view) Author: Alan Kennedy (amak) Date: 2012-11-04.14:41:08
I have checked in a completely rewritten implementation of the modjy wsgi.input object here

2.5: http://hg.python.org/jython/rev/4c3155645812
tip: http://hg.python.org/jython/rev/7ebb51401f54

I now consider this issue fixed, with a completely different fix to that originally proposed.

The original fix gave rise to too many problems in other areas, and should probably now be undone.
msg7505 (view) Author: Frank Wierzbicki (fwierzbicki) Date: 2012-11-04.18:57:28
Alan: great! This, with the reversion of the problematic fix, should be enough to trigger a 2.5.4 release - I'll be sure to test this at work where this specific problem bites us. If all goes well, do you have other things you'd like to see in a 2.5.4?
msg7506 (view) Author: Alan Kennedy (amak) Date: 2012-11-05.00:08:28
Hi Frank.

Yes, it's good to see these issues resolved.

There are a few othe issues that could possibly be included, e.g. locale or
console stuff. But those issues can wait for a 2.5.5, IMHO.

Whereas not releasing these fixes will cause problems for users, and
probably result in them filing more bugs. So I vote for: 2.5.4, the sooner
the better.
On Nov 4, 2012 6:57 PM, "Frank Wierzbicki" <report@bugs.jython.org> wrote:

>
> Frank Wierzbicki added the comment:
>
> Alan: great! This, with the reversion of the problematic fix, should be
> enough to trigger a 2.5.4 release - I'll be sure to test this at work where
> this specific problem bites us. If all goes well, do you have other things
> you'd like to see in a 2.5.4?
>
> _______________________________________
> Jython tracker <report@bugs.jython.org>
> <http://bugs.jython.org/issue1754>
> _______________________________________
>
msg7508 (view) Author: Alan Kennedy (amak) Date: 2012-11-05.23:46:09
BTW, although I am confident that this fix will be robust, since it based on code that has been running in the field for several years, I am still interested to read reports of success or failure.
History
Date User Action Args
2012-11-05 23:46:10amaksetmessages: + msg7508
2012-11-05 00:08:28amaksetfiles: + unnamed
messages: + msg7506
2012-11-04 18:57:29fwierzbickisetmessages: + msg7505
2012-11-04 14:41:10amaksetstatus: open -> closed
resolution: fixed
messages: + msg7501
2012-09-16 16:37:01amaksetassignee: fwierzbicki -> amak
messages: + msg7461
2012-08-13 16:00:56fwierzbickisetmessages: + msg7392
2012-08-13 16:00:27fwierzbickisetmessages: - msg7391
2012-08-13 15:42:41fwierzbickisetmessages: + msg7391
2012-08-13 14:33:35sedibrsetnosy: + sedibr
messages: + msg7390
2012-08-03 16:26:12fwierzbickisetmessages: + msg7346
2012-08-02 23:56:30amaksetmessages: + msg7345
2012-07-05 16:49:53fwierzbickisetmessages: + msg7311
2012-06-29 20:18:43fwierzbickisetassignee: pjenvey -> fwierzbicki
2012-06-28 20:02:46fwierzbickisetmessages: + msg7274
2012-06-28 19:04:29pjenveysetmessages: + msg7273
2012-06-28 19:01:27pjenveysetfiles: + issue1754.diff
keywords: + patch
messages: + msg7272
2012-06-13 07:33:20jeff250setmessages: + msg7212
2012-06-12 22:41:18pjenveysetmessages: + msg7207
2012-05-15 21:34:47pjenveysetmessages: + msg7095
2012-05-10 02:56:59pjenveysetpriority: normal -> high
assignee: amak -> pjenvey
nosy: + fwierzbicki
2012-03-19 17:41:33amaksetpriority: normal
messages: + msg6820
2011-09-26 23:13:11ssweenssetnosy: + ssweens
2011-06-18 14:19:25amaksetmessages: + msg6555
2011-06-18 14:15:41amaksetnosy: + pjenvey
2011-06-05 12:59:38amaksetassignee: amak
messages: + msg6543
nosy: + amak
2011-06-03 06:36:25jeff250setfiles: + test.py
messages: + msg6539
2011-06-03 01:26:06jeff250create