Issue2358

classification
Title: Using "read all" ops on /proc files on Linux produces empty strings
Type:
Severity: normal
Components: Core
Versions: Jython 2.7, Jython 2.5
Milestone: Jython 2.7.1
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: zyasoft
Nosy List: Arfrever, dstromberg, jdemoor, zyasoft
Priority:
Keywords:

Created on 2015-05-21.23:13:13 by dstromberg, last changed 2016-01-06.16:00:02 by zyasoft.

Messages
msg10074 Author: Dan Stromberg (dstromberg) Date: 2015-05-21.23:13:13
The file /proc/<pid>/status has content (it's not just 0 length), is readable from CPython and PyPy using open(), and is readable in Jython 2.7.0 using os.open(), but Jython's open() followed by read() returns an empty string:

$ /usr/local/jython-2.7.0/bin/jython
Jython 2.7.0 (default:9987c746f838, Apr 29 2015, 02:25:11)
[OpenJDK 64-Bit Server VM (Oracle Corporation)] on java1.7.0_65
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> pid = os.getpid()
>>> file_ = open('/proc/{}/status'.format(pid), 'r')
>>> file_.read()
''
>>> file_.close()
>>>

Using strace, I see Jython open the file as #15 and then:
2595  dup2(16, 15)                      = 15
2595  close(15)                         = 0

...and close it without any read being done. I didn't notice any mmap'ing going on either. I'm not sure what it was doing with the dup2'd file descriptor.

Thanks for Jython!
msg10076 Author: Julien Demoor (jdemoor) Date: 2015-05-24.10:52:42
This bug existed in Jython 2.5 as well. FWIW, here's a workaround.

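# Jython's open() cannot read the /proc filesystem
import os

def _readfile(self, fn):
    # Jython-specific: os.open() returns an object exposing asInputStream()
    f = os.open(fn, os.O_RDONLY)
    try:
        stream = f.asInputStream()
        out = []
        while True:
            b = stream.read()  # java.io.InputStream.read() returns -1 at EOF
            if b == -1:
                break
            out.append(chr(b))
        return ''.join(out)
    finally:
        f.close()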
msg10077 Author: Jim Baker (zyasoft) Date: 2015-05-25.22:36:50
Can reproduce (on Linux of course).

Zero length is a potential clue here; perhaps we are doing a stat (http://stackoverflow.com/a/7078360/423006) at some point and this procedure short circuits the read.
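A quick way to check the zero-length clue is to compare the size that stat reports with the number of bytes a chunked read actually returns. This is only a sketch; the helper name reported_vs_actual is hypothetical and is not part of Jython or the standard library.

```python
import os

def reported_vs_actual(path, chunk_size=4096):
    """Return (size reported by stat, number of bytes actually read)."""
    reported = os.stat(path).st_size
    actual = 0
    f = open(path, 'rb')
    try:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty result means EOF
                break
            actual += len(chunk)
    finally:
        f.close()
    return reported, actual

# Only meaningful on Linux; skipped where /proc is not available.
if os.path.exists('/proc/self/status'):
    print(reported_vs_actual('/proc/self/status'))
```

For a regular file the two numbers should match; for most /proc files on Linux the reported size is typically 0 while the byte count is not.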
msg10554 Author: Jim Baker (zyasoft) Date: 2015-12-27.17:37:16
The underlying problem is seen in org.python.core.io.FileIO#readAll in this code snippet:

    toRead = Math.max(0, fileChannel.size() - fileChannel.position());

But per Linux's design, nearly all files in the /proc filesystem have a file size of 0 (see http://www.tldp.org/LDP/Linux-Filesystem-Hierarchy/html/proc.html). For such a file, toRead is therefore 0, so readAll just returns an empty ByteBuffer, which is converted to an empty string.
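The failure mode can be modelled in a few lines of Python. This is an illustration only, not the Jython source: ZeroSizeFile and read_all_by_size are hypothetical names, with size() and position() standing in for the FileChannel methods used in the snippet above.

```python
import io

class ZeroSizeFile(object):
    """Hypothetical stand-in for a /proc file: reports size 0 but holds data."""

    def __init__(self, data):
        self._stream = io.BytesIO(data)

    def size(self):
        return 0  # what a /proc file typically reports

    def position(self):
        return self._stream.tell()

    def read(self, n):
        return self._stream.read(n)

def read_all_by_size(f):
    # Mirrors the size-based computation: trust the reported size.
    to_read = max(0, f.size() - f.position())
    return f.read(to_read)  # reads 0 bytes when the size is reported as 0

f = ZeroSizeFile(b'Name:\tjava\n')
print(repr(read_all_by_size(f)))  # nothing is read, although data exists
print(repr(f.read(100)))          # an explicit read still returns the data
```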

Note that the file is still readable; let's define a function like so (in a module readall.py, to match the import used further down):

def readall(f):
    chunks = []
    while True:
        chunk = f.read(100)  # completely arbitrary!
        if not chunk:
            break
        else:
            chunks.append(chunk)
    return ''.join(chunks)

Then use it with the example provided:

>>> import os
>>> pid = os.getpid()
>>> from readall import readall
>>> f = open('/proc/{}/status'.format(pid), 'r')
>>> readall(f)
'Name:\tjava\nState:\tS (sleeping)\nTgid:\t11604\nNgid:\t0\nPid:\t11604\nPPid:\t11600\nTracerPid:\t0\nUid:\t1000\t1000\t1000\t1000\nGid:\t1000\t1000\t1000\t1000\nFDSize:\t256\nGroups:\t4 24 27 30 46 109 124 144 1000 \nNStgid:\t11604\nNSpid:\t11604\nNSpgid:\t11600\nNSsid:\t24130\nVmPeak:\t 1968928 kB\nVmSize:\t 1903392 kB\nVmLck:\t       0 kB\nVmPin:\t       0 kB\nVmHWM:\t  124500 kB\nVmRSS:\t  123520 kB\nVmData:\t 1838800 kB\nVmStk:\t     136 kB\nVmExe:\t       4 kB\nVmLib:\t   15732 kB\nVmPTE:\t     516 kB\nVmPMD:\t      24 kB\nVmSwap:\t       0 kB\nThreads:\t18\nSigQ:\t2/31534\nSigPnd:\t0000000000000000\nShdPnd:\t0000000000000000\nSigBlk:\t0000000000000000\nSigIgn:\t0000000000000000\nSigCgt:\t2000000181005ccf\nCapInh:\t0000000000000000\nCapPrm:\t0000000000000000\nCapEff:\t0000000000000000\nCapBnd:\t0000003fffffffff\nSeccomp:\t0\nCpus_allowed:\t3f\nCpus_allowed_list:\t0-5\nMems_allowed:\t00000000,00000001\nMems_allowed_list:\t0\nvoluntary_ctxt_switches:\t1\nnonvoluntary_ctxt_switches:\t2\n'

So that's an easier workaround.

Two possibilities for fixing this bug:

1. Determine whether we are working with such a file. It may be sufficient to compute the abspath of the file and check that it's under /proc and that we are on Linux, although that seems like a lot of special casing. Maybe there's a better predicate for being special?

2. Alternatively, we can just read all such files in chunks.

Such chunking will result in some more ByteBuffer allocation overhead, but this is likely a micro-optimization concern.
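One way to picture the second option, sketched in Python rather than in the Java of FileIO: treat the reported size only as a hint for the first read, and keep reading until EOF. This is an assumption about how such a fix could look, not a description of the actual commit; read_all, size_hint and chunk_size are hypothetical names.

```python
import io

def read_all(raw, size_hint=0, chunk_size=8192):
    """Read until EOF; size_hint only sizes the first read request."""
    chunks = []
    to_read = size_hint if size_hint > 0 else chunk_size
    while True:
        chunk = raw.read(to_read)
        if not chunk:  # empty result means EOF, whatever the size said
            break
        chunks.append(chunk)
        to_read = chunk_size  # later reads use the fixed chunk size
    return b''.join(chunks)

data = b'x' * 20000
# A size hint of 0 (as for /proc files) still yields the full content.
print(len(read_all(io.BytesIO(data), size_hint=0)))
```

With an accurate hint, a regular file is still read in one large request plus a final empty read, so the extra allocation is limited to files whose reported size is wrong.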
msg10555 Author: Jim Baker (zyasoft) Date: 2015-12-28.17:31:19
Fixed as of https://hg.python.org/jython/rev/c8245364e407
History
Date                 User        Action  Args
2016-01-06 16:00:02  zyasoft     set     status: pending -> closed
2015-12-28 17:31:20  zyasoft     set     status: open -> pending
                                         assignee: zyasoft
                                         resolution: accepted -> fixed
                                         messages: + msg10555
                                         milestone: Jython 2.7.2 -> Jython 2.7.1
2015-12-27 17:40:04  zyasoft     set     title: Jython fails to open readable file under /proc on Linux -> Using "read all" ops on /proc files on Linux produces empty strings
2015-12-27 17:37:18  zyasoft     set     messages: + msg10554
2015-10-29 22:39:53  zyasoft     set     milestone: Jython 2.7.1 -> Jython 2.7.2
2015-06-08 00:38:25  Arfrever    set     nosy: + Arfrever
2015-05-25 22:36:55  zyasoft     set     resolution: accepted
2015-05-25 22:36:50  zyasoft     set     nosy: + zyasoft
                                         messages: + msg10077
                                         milestone: Jython 2.7.1
2015-05-24 10:52:43  jdemoor     set     nosy: + jdemoor
                                         messages: + msg10076
                                         versions: + Jython 2.5
2015-05-21 23:13:13  dstromberg  create