Issue1793

classification
Title: relative file.seek fails in read/write mode
Type: behaviour Severity: major
Components: Core Versions: Jython 2.7
Milestone:
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: fwierzbicki, santa4nt, tuck182, zyasoft
Priority: urgent Keywords: patch

Created on 2011-08-24.23:08:20 by tuck182, last changed 2015-03-29.05:37:26 by zyasoft.

Files
File name Uploaded Description Edit Remove
test-seek.py tuck182, 2011-08-24.23:08:20 test script
test_issue1011.patch santa4nt, 2015-03-20.23:34:35
issue1011.patch santa4nt, 2015-03-21.02:55:32 A possible fix to bring observable behavior up to par with CPython.
issue1011.patch santa4nt, 2015-03-21.18:05:07 Better fix, addressing regression issues.
Messages
msg6613 (view) Author: Matt Tucker (tuck182) Date: 2011-08-24.23:08:20
Doing a file.seek(N, 1) fails for files opened in read/write mode ("r+" or "rb+"). This first manifested for me with mutagen corrupting m4a files when writing tags.


Output from the attached script (on two different systems, Linux and OS-X) gives:

--- OS-X ---
$ python ./script/test-seek.py 
Python 2.6.1
Darwin Kernel Version 10.8.0: Tue Jun  7 16:32:41 PDT 2011; root:xnu-1504.15.3~1/RELEASE_X86_64
No errors

$ jython ./script/test-seek.py 
Jython 2.5.2
Java HotSpot(TM) 64-Bit Server VM, 20.1-b02-384, Apple Inc.
Traceback (most recent call last):
  File "./script/test-seek.py", line 31, in <module>
    assert f.tell() == x, "before read: pos should be %d but was %d" % (x, f.tell())
AssertionError: before read: pos should be 253 but was 254


--- Linux ---
$ python script/test-seek.py 
Python 2.7.1+
#42-Ubuntu SMP Mon Apr 11 03:31:50 UTC 2011
No errors

$ java -jar target/lib/jython-standalone-2.5.2.jar script/test-seek.py 
Jython 2.5.2
OpenJDK Server VM, 20.0-b11, Sun Microsystems Inc.
Traceback (most recent call last):
  File "script/test-seek.py", line 31, in <module>
    assert f.tell() == x, "before read: pos should be %d but was %d" % (x, f.tell())
AssertionError: before read: pos should be 253 but was 254
msg8716 (view) Author: Jim Baker (zyasoft) Date: 2014-06-19.05:47:50
Target beta 4
msg9691 (view) Author: Santoso Wijaya (santa4nt) Date: 2015-03-20.23:34:35
Attaching a unittest test case (that will fail) to reproduce this in regrtest.
msg9692 (view) Author: Santoso Wijaya (santa4nt) Date: 2015-03-21.02:15:39
Huh. Looks like this is caused by optimization in the way we buffer file read operations. fd.read(1) does not strictly read only one byte from the underlying java.nio.channels.FileChannel object.

Jython's fd.read() (where fd is returned by file() built-in function) is dispatched to PyFile#file_read method. Eventually, the read operation is done by calling FileChannel#read(ByteBuffer[] dsts) scattered IO read where the first byte buffer is the one returned to the caller (in this case of size 1), but there exists a secondary byte buffer that will be filled with the rest of the file.

So in the case of the test case above, when we're at, say, position 253 and call fd.read(1), we'll get the 254th byte as read, but the underlying file channel has been read all the way to the end. Hence, fd.seek(-1, SEEK_CUR) having the surprising result.
msg9693 (view) Author: Santoso Wijaya (santa4nt) Date: 2015-03-21.02:55:32
One way I can think of to address this is to introduce a method overloading BufferedIOBase#readinto() which can disable the buffering behavior described above. And then, have #read(int size) use the unbuffered method. I figured this might be appropriate since the caller specifies how many bytes to read explicitly, so we shouldn't assume to much and read beyond that.
msg9696 (view) Author: Jim Baker (zyasoft) Date: 2015-03-21.06:31:47
Santoso, the approach in general looks good, but it's currently failing on a sequence of reads in test_builtin.BuiltinTest.test_open, likely due to some buffering behavior.
msg9697 (view) Author: Santoso Wijaya (santa4nt) Date: 2015-03-21.16:58:17
I see. Thanks for giving it a first look. I'll see what I can do.

On Fri, Mar 20, 2015, 11:31 PM Jim Baker <report@bugs.jython.org> wrote:

>
> Jim Baker added the comment:
>
> Santoso, the approach in general looks good, but it's currently failing on
> a sequence of reads in test_builtin.BuiltinTest.test_open, likely due to
> some buffering behavior.
>
> _______________________________________
> Jython tracker <report@bugs.jython.org>
> <http://bugs.jython.org/issue1793>
> _______________________________________
>
msg9698 (view) Author: Santoso Wijaya (santa4nt) Date: 2015-03-21.18:05:07
Attaching a patched patch.
msg9699 (view) Author: Jim Baker (zyasoft) Date: 2015-03-21.20:18:19
Santoso, thanks, this now looks great and passes the regrtest! Please commit so it can be part of the release candidate, which Frank is just about to build.

I'm very glad we could have this last fix in!
msg9700 (view) Author: Santoso Wijaya (santa4nt) Date: 2015-03-21.21:19:40
Committed with changeset b597f478d5ab.
msg9701 (view) Author: Jim Baker (zyasoft) Date: 2015-03-21.21:36:17
Although the changeset https://hg.python.org/jython/rev/b597f478d5ab was linked to #1011, it's fixed regardless. Marking accordingly.
msg9702 (view) Author: Santoso Wijaya (santa4nt) Date: 2015-03-21.21:41:29
I have no idea how I got #1011. I'm going to push a trivial change changing test_issue1011 method name to test_issue1793.
History
Date User Action Args
2015-03-29 05:37:26zyasoftsetstatus: pending -> closed
2015-03-21 21:41:29santa4ntsetmessages: + msg9702
2015-03-21 21:36:17zyasoftsetpriority: high -> urgent
status: open -> pending
resolution: accepted -> fixed
messages: + msg9701
2015-03-21 21:19:40santa4ntsetmessages: + msg9700
2015-03-21 20:18:19zyasoftsetmessages: + msg9699
2015-03-21 18:05:07santa4ntsetfiles: + issue1011.patch
messages: + msg9698
2015-03-21 17:18:18santa4ntsetfiles: - unnamed
2015-03-21 16:58:18santa4ntsetfiles: + unnamed
messages: + msg9697
2015-03-21 06:31:47zyasoftsetmessages: + msg9696
2015-03-21 02:55:33santa4ntsetfiles: + issue1011.patch
messages: + msg9693
2015-03-21 02:15:40santa4ntsetmessages: + msg9692
2015-03-20 23:34:35santa4ntsetfiles: + test_issue1011.patch
keywords: + patch
messages: + msg9691
2015-03-20 23:09:06santa4ntsetnosy: + santa4nt
versions: + Jython 2.7, - Jython 2.5
2014-06-19 05:47:50zyasoftsetpriority: normal -> high
resolution: accepted
messages: + msg8716
nosy: + zyasoft
2013-02-25 19:22:55fwierzbickisetpriority: normal
nosy: + fwierzbicki
versions: + Jython 2.5, - 2.5.2
2011-08-24 23:08:20tuck182create