Title: Jython non-blocking socket send() does not conform to Python's behavior.
Type: behaviour Severity: normal
Components: Versions: Jython 2.7
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: behackett, nickmbailey, ryan.springer, zyasoft
Priority: Keywords:

Created on 2016-07-05.15:48:27 by ryan.springer, last changed 2017-08-25.03:02:47 by behackett.

File name                 Uploaded                            Description
nonblocking-example.tar   ryan.springer, 2016-07-08.15:09:19  Example to demonstrate this issue.
msg10873 (view) Author: Ryan Springer (ryan.springer) Date: 2016-07-05.15:50:43
We have noticed a problem that sometimes occurs when using non-blocking sockets with Jython.  Our application calls send() on a socket and then checks the number of bytes returned to see whether the data has been sent.  If the number of bytes returned matches the number of bytes passed to send(), then close() is called on the socket.  This logic works without problems when using CPython.  However, in Jython, if Netty has not finished delivering the data, then the close() will cause Netty to throw away whatever bytes have not been delivered yet.  Our application is unaware that this situation has occurred, and the client of the application receives a truncated response.

We can only reproduce this problem when the application and our client are communicating over the slower network conditions of a VPN.  Normally Netty delivers the data before our application calls close() on the socket.
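
For reference, the send-then-close pattern described above might look roughly like the sketch below when written against CPython's non-blocking semantics, where send() may accept only part of the buffer.  The helper name send_fully and the timeout parameter are hypothetical and are not taken from the application or from the attached example.

```python
import errno
import select
import socket


def send_fully(sock, data, timeout=5.0):
    """Send all of `data` on a non-blocking socket, honoring partial sends.

    Hypothetical helper for illustration only.
    """
    total = 0
    while total < len(data):
        try:
            # send() reports how many bytes were accepted, possibly fewer
            # than were offered.
            total += sock.send(data[total:])
        except socket.error as exc:
            if exc.errno not in (errno.EAGAIN, errno.EWOULDBLOCK):
                raise
            # Nothing could be written right now: wait until the socket
            # is writable again, then retry.
            _, writable, _ = select.select([], [sock], [], timeout)
            if not writable:
                raise socket.timeout("socket not writable within timeout")
    return total
```

A caller would invoke close() only after send_fully() returns len(data).  That check is only meaningful if send() reports the number of bytes actually accepted, which is the behavior this issue asks for.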

There is a FIXME in the send() method in Lib/ that reads:

  # FIXME are we sure we are going to be able to send this much data, especially async?
  return len(data)

I wanted to make sure that I understood the behavior of send() when used with non-blocking sockets, so I reviewed the Python documentation and the Posix specification for send() and write().  Two details in the Posix specification seem related to this situation:

1) If no flags are passed to send(), it is equivalent to write().  Jython is currently ignoring the flags that are passed to send().

    If the socket argument refers to a socket and the flags argument is 0, the send() function is equivalent to write().


2) write() will return the number of bytes written, even in non-blocking mode:

    If the O_NONBLOCK flag is set, write() shall not block the thread. If some data can be written without blocking the thread, write() shall write what it can and return the number of bytes written. Otherwise, it shall return -1 and set errno to EAGAIN.


This is the Posix non-blocking socket behavior.  For Python non-blocking sockets, the behavior is the same.  send() will return the number of bytes actually sent.

See: - The description of send() says "Returns the number of bytes sent. " 
also see:

I did not see a way to easily implement the Posix behavior using the Netty API.  It might be possible to maintain some state concerning sockets that have not delivered all of their data so that a socket close() could internally delay calling close on the Netty channel until the data was fully delivered.
msg10874 (view) Author: Ryan Springer (ryan.springer) Date: 2016-07-08.15:09:18
nonblocking-example.tar contains a simple example that should demonstrate this problem if 250ms or more of network latency is present.  is a simplistic server and is a client.  These scripts should be run on separate physical or virtual machines.  An included README file gives instructions for setup and for simulating network delay on Linux using the "tc" command.  If the problem is triggered, then the downloaded file will be truncated.
msg10880 (view) Author: Jim Baker (zyasoft) Date: 2016-07-26.05:35:06
I thought about this behavior, but since this was not addressed by tests, or the docs I looked at, I simply let it go with the FIXME. Thanks for looking into this!

I had seen similar VPN issues with sockets in the past (on old versions of Windows at least), but had not considered this could be due to socket closing. It may have been a similar issue to what we are seeing here - Java sometimes reproduces lowest-common Windows behavior in other corner cases.
msg10881 (view) Author: Jim Baker (zyasoft) Date: 2016-07-26.06:37:44
Most likely the solution here is to prevent early closes by validating data has been flushed, via flush notifications.
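
A toy model of that idea, deferring the real close until every outstanding write has been reported as flushed, could look like the following.  The class and its method names are hypothetical; it does not use the Netty API and is not the patch that was eventually applied.

```python
class DeferredCloseSocket(object):
    """Toy model: close() takes effect only after pending writes flush."""

    def __init__(self):
        self.pending_writes = 0      # writes handed off but not yet flushed
        self.close_requested = False
        self.closed = False

    def send(self, data):
        if self.close_requested:
            raise ValueError("send() after close()")
        self.pending_writes += 1
        return len(data)

    def on_flush_complete(self):
        # Stand-in for the transport's flush notification callback.
        self.pending_writes -= 1
        self._maybe_close()

    def close(self):
        self.close_requested = True
        self._maybe_close()

    def _maybe_close(self):
        # Only close once the caller asked for it and nothing is in flight.
        if self.close_requested and self.pending_writes == 0:
            self.closed = True
```

With this shape, a close() issued right after send() stays pending until the last flush notification arrives, so undelivered bytes are not discarded.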

My reading of C socket semantics is that send simply means the data was copied over to the kernel for actual network send; this seems no different than it being copied over to Netty. But perhaps we can upgrade to Netty 4.1 (perhaps easy) and use this API to get a more reliable estimate than the blind attempt to send all bytes into Netty that is done now.
msg10882 (view) Author: Nick (nickmbailey) Date: 2016-07-29.21:54:00
Just to make sure I'm understanding the approach correctly: the idea is that we call bytesBeforeUnwritable to see how many bytes Netty will be able to send immediately, call writeAndFlush with that many bytes, and return that length to the caller immediately, expecting them to call send() again if not all bytes were sent. Right?
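
The approach in that question can be modeled in plain Python as below.  FakeChannel is a hypothetical stand-in for a Netty channel: bytes_before_unwritable() and write_and_flush() only imitate the roles of bytesBeforeUnwritable and writeAndFlush, and raising EAGAIN when no bytes can be accepted is an assumption made here to mirror CPython, not a statement about what the PR does.

```python
import errno
import socket


class FakeChannel(object):
    """Hypothetical stand-in for a Netty channel's outbound buffer."""

    def __init__(self, high_water_mark):
        self.high_water_mark = high_water_mark
        self.pending = 0       # bytes queued but not yet on the network
        self.written = []

    def bytes_before_unwritable(self):
        return max(self.high_water_mark - self.pending, 0)

    def write_and_flush(self, chunk):
        self.pending += len(chunk)
        self.written.append(chunk)

    def drain(self, n):
        # Simulate the network delivering n queued bytes.
        self.pending = max(self.pending - n, 0)


def send(channel, data):
    """Write only what the channel can take now; return that byte count."""
    budget = channel.bytes_before_unwritable()
    if budget == 0:
        # Assumption: behave like CPython when nothing can be written.
        raise socket.error(errno.EAGAIN, "Resource temporarily unavailable")
    chunk = data[:budget]
    channel.write_and_flush(chunk)
    return len(chunk)
```

The caller is then responsible for calling send() again with the remaining bytes, exactly as with a partial send under CPython.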
msg10889 (view) Author: Nick (nickmbailey) Date: 2016-08-01.21:58:03
I've got a pr up against master implementing the logic described above:
msg10890 (view) Author: Nick (nickmbailey) Date: 2016-08-01.22:07:07
Also, the netty upgrade and the fix to send() are split into separate commits for (hopefully) easier reviewing.
msg10896 (view) Author: Jim Baker (zyasoft) Date: 2016-08-17.04:41:48
Fixed as of, using the patch provided by Nick in the PR.
msg11543 (view) Author: Bernie Hackett (behackett) Date: 2017-08-25.00:22:16
Since socket.sendall is just aliased to socket.send, this change breaks socket.sendall, which is supposed to block until all data has been sent or an error occurs.

I would imagine this will break a lot of libraries and applications that expect to not have to call socket.sendall repeatedly until all data has been sent.
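
For comparison, the documented sendall contract on a blocking socket can be sketched as a loop over send().  This is a minimal illustration, not the Jython implementation; ChunkySocket is a hypothetical test double that accepts at most max_chunk bytes per call.

```python
def sendall(sock, data):
    """Keep calling send() until every byte has been accepted."""
    offset = 0
    while offset < len(data):
        offset += sock.send(data[offset:])
    # Like socket.sendall, return None on success.
    return None


class ChunkySocket(object):
    """Hypothetical socket double that accepts a few bytes per send()."""

    def __init__(self, max_chunk):
        self.max_chunk = max_chunk
        self.received = b""

    def send(self, data):
        accepted = data[:self.max_chunk]
        self.received += accepted
        return len(accepted)
```

Once send() can return a short count, an alias from sendall to send no longer meets this contract, which is the breakage described above.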
msg11544 (view) Author: Jim Baker (zyasoft) Date: 2017-08-25.00:31:23
@Bernie, this is a good point, but let's open as a new bug, and then figure out a good test case. When I wrote the implementation of send, and aliased sendall to it, it was not clear to me that Netty is not doing this chunking for us, but it's something that can be readily investigated.
msg11545 (view) Author: Bernie Hackett (behackett) Date: 2017-08-25.01:06:19
@Jim, I'll try to come up with a standalone reproduction script (shouldn't be difficult) and open another ticket. I discovered this when PyMongo's test suite locked up when run under Jython 2.7.1. sendall definitely no longer sends all under certain circumstances.
msg11547 (view) Author: Bernie Hackett (behackett) Date: 2017-08-25.03:02:47
Date                 User           Action  Args
2017-08-25 03:02:47  behackett      set     messages: + msg11547
2017-08-25 01:06:19  behackett      set     messages: + msg11545
2017-08-25 00:31:24  zyasoft        set     messages: + msg11544
2017-08-25 00:22:17  behackett      set     nosy: + behackett; messages: + msg11543
2016-09-06 00:19:37  zyasoft        set     status: pending -> closed
2016-08-17 04:41:48  zyasoft        set     status: open -> pending; resolution: fixed; messages: + msg10896
2016-08-01 22:07:07  nickmbailey    set     messages: + msg10890
2016-08-01 21:58:03  nickmbailey    set     messages: + msg10889
2016-07-29 21:54:01  nickmbailey    set     nosy: + nickmbailey; messages: + msg10882
2016-07-26 06:37:44  zyasoft        set     messages: + msg10881
2016-07-26 05:35:07  zyasoft        set     messages: + msg10880
2016-07-25 23:12:30  zyasoft        set     nosy: + zyasoft
2016-07-08 15:09:19  ryan.springer  set     files: + nonblocking-example.tar; messages: + msg10874
2016-07-05 15:50:45  ryan.springer  set     messages: + msg10873
2016-07-05 15:48:27  ryan.springer  create