Issue2588

classification
Title: test_io failure in concurrent access to a buffered file
Type: behaviour
Severity: normal
Components: Library
Versions: Jython 2.7
Milestone:
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: jeff.allen, stefan.richthofer
Priority:
Keywords: test failure causes

Created on 2017-05-06.15:00:19 by jeff.allen, last changed 2017-05-09.12:31:27 by stefan.richthofer.

Messages
msg11357 Author: Jeff Allen (jeff.allen) Date: 2017-05-06.15:00:17
A couple of tests in test_io have started failing for me. This is the case since I localised my PC to Chinese, but encoding seems to have nothing to do with the failure.

The test opens one file and creates 20 threads that compete to write to it. There is no explicit synchronisation. The test fails if, at the end, the file does not contain exactly one message from each thread. The comments in the test reference http://bugs.python.org/issue6750 . Obviously, this is asking for trouble ;)
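
For readers without the test source to hand, the pattern described above can be sketched roughly as follows. This is an illustration written from the description, not a copy of test_io.py; the names write_from_threads and demo are hypothetical, and the temporary file stands in for the test suite's own scratch file.

```python
import io
import os
import tempfile
import threading
import time


def write_from_threads(path, n_threads=20):
    """Let n_threads each write one line to a shared file, with no locking."""
    start = threading.Event()
    with io.open(path, "w", buffering=1) as f:
        def run(n):
            start.wait()                  # hold every writer until released
            f.write(u"Thread%03d\n" % n)  # unsynchronised write to shared file

        threads = [threading.Thread(target=run, args=(n,))
                   for n in range(n_threads)]
        for t in threads:
            t.start()
        time.sleep(0.02)                  # give the threads time to reach wait()
        start.set()                       # release all writers together
        for t in threads:
            t.join()
    with io.open(path) as f:
        content = f.read()
    # Number of times each thread's message appears; the test wants all 1.
    return [content.count(u"Thread%03d\n" % n) for n in range(n_threads)]


def demo():
    """Run the experiment on a temporary file and return the counts."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        return write_from_threads(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

A count of 0 for some thread corresponds to the "AssertionError: 0 != 1" shown in the failure output in this message.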

If I reduce the number of threads below about 10 it passes, so maybe it only ever passed because something was faster, slower or "stickier" than now, effectively serialising access.

The problem arises in CPython because writing releases the GIL in the middle of managing the buffered data. CPython's solution relies on dancing around the moment when the GIL is given up. In the tracker, this appears to be the triumph of pragmatism over purism. The first answer was that it was "not designed to be thread-safe at all".

Of course, we don't have a GIL in Jython, hence I'm actually wondering how this test ever passed, and whether we should expect it to. I'm going to insert a skip for now, referring to this issue.

I'm seeing this in the trunk repository (alongside the errors that *are* to do with encoding):

======================================================================
FAIL: test_threads_write (__main__.CTextIOWrapperTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\test\test_io.py", line 2460, in test_threads_write
    self.assertEqual(content.count("Thread%03d\n" % n), 1)
AssertionError: 0 != 1

======================================================================
FAIL: test_threads_write (__main__.PyTextIOWrapperTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\test\test_io.py", line 2460, in test_threads_write
    self.assertEqual(content.count("Thread%03d\n" % n), 1)
AssertionError: 0 != 1

----------------------------------------------------------------------

In my fork, when I've dealt with the encoding problems, I still have these threading failures.
msg11359 Author: Stefan Richthofer (stefan.richthofer) Date: 2017-05-07.09:16:20
Hey Jeff,
would be good if you could share the code where you set up all the threads etc. so it will be more convenient to reproduce this issue.
msg11367 Author: Jeff Allen (jeff.allen) Date: 2017-05-09.06:29:42
@stefan I see how you might have interpreted my last paragraph. I'm referring to this test:
https://hg.python.org/jython/file/tip/Lib/test/test_io.py#l2441
which gets run twice, once with _pyio and once with _jyio (called C here, for "compiled", obviously not referring to any particular language :) So you have the code.

I spotted it while messing with encodings, but I can reproduce it in the trunk, which has none of those changes, and with ASCII paths. It appeared with the localisation of the PC. (However, the failure is there even when I set codepage 1252 at the prompt.) In that state:
>>> import sys
>>> sys.getfilesystemencoding()
>>> sys.stdout.encoding
'cp1252'
>>> from java.lang import System
>>> System.getProperty("file.encoding")
u'GBK'

I get some failures from encoding text files too, but I've fixed those in the encoding repo (not pushed). In this localisation, I end up using the _java.py codec, which perhaps accounts for the timing change in the threading test that exposes the bug.

I raise the issue partly on the principle that I shouldn't insert a skip without doing so -- to park it for later. But also I want to raise the question "should we expect this to work"? I wasn't convinced CPython came to the right conclusion. It seemed to be based on the fact that it was an easy fix in a single-threaded interpreter. If you intended to have a highly-concurrent interpreter, and i/o to match, might you think differently?
msg11368 Author: Stefan Richthofer (stefan.richthofer) Date: 2017-05-09.12:31:27
Sorry, I overlooked that you referred to code from a specific test. Somehow I read the 20-thread experiment as your own investigation. On a second read it's clear and I wonder how I could have misinterpreted it earlier.

IMO Jython should do some explicit synchronization here and we cannot expect CPython's "solution" to work. However, it looks like this would mean putting a 'synchronized' or a lock into f.write(), and maybe even cross-synchronizing it with read. I'm not sure how measurable the impact would be for the common use cases, which don't use extensive multithreading.
(AFAIK, locks from java.util.concurrent have much lower impact than synchronized, so this might be acceptable.)
I suppose there must be some state-of-the-art way to do thread-safe file I/O in Java.
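
To make the locking idea concrete, here is a minimal Python-level sketch of serialising writes with a lock. It is only an illustration under stated assumptions: LockedWriter and write_concurrently are hypothetical names, the lock sits in a wrapper rather than inside Jython's Java implementation of f.write(), and an in-memory StringIO stands in for the file.

```python
import io
import threading


class LockedWriter(object):
    """Hypothetical wrapper that serialises write() and flush() calls."""

    def __init__(self, f):
        self._f = f
        self._lock = threading.Lock()

    def write(self, text):
        # Only one thread at a time may touch the underlying buffer.
        with self._lock:
            return self._f.write(text)

    def flush(self):
        with self._lock:
            self._f.flush()


def write_concurrently(n_threads=20):
    """Have n_threads write one line each through a LockedWriter."""
    buf = io.StringIO()
    out = LockedWriter(buf)
    threads = [threading.Thread(target=out.write,
                                args=(u"Thread%03d\n" % n,))
               for n in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return buf.getvalue()


if __name__ == "__main__":
    content = write_concurrently()
    print(all(content.count(u"Thread%03d\n" % n) == 1 for n in range(20)))
```

The order of the lines is still arbitrary; the lock only ensures that each write completes before the next begins. Whether the per-call cost is acceptable for single-threaded use is the open question raised above, and this sketch does not measure it.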
History
Date                 User               Action  Args
2017-05-09 12:31:27  stefan.richthofer  set     messages: + msg11368
2017-05-09 06:29:43  jeff.allen         set     messages: + msg11367
2017-05-07 09:16:21  stefan.richthofer  set     nosy: + stefan.richthofer; messages: + msg11359
2017-05-06 15:00:19  jeff.allen         create