Message11856

Author jeff.allen
Recipients jeff.allen
Date 2018-03-26.22:10:59
Message-id <1522102260.58.0.467229070634.issue2659@psf.upfronthosting.co.za>
In-reply-to
Content
Related to #2656: WARNING: Illegal reflective access by org.python.core.PySystemState (file:/C:/Jython/2.7.2a1/jython.jar) to method java.io.Console.encoding()

Unlike most of the other illegal accesses found by the test suite, this one will pop up from more-or-less any interactive use of Jython, hence the separate ticket. Also, we may wish to discuss the solution separately.


Problem:

Bytes written to sys.stdout/err emerge on the real (OS/shell) console untranslated, eventually via System.out/err, and the reverse is true on the way in. So when the data is not ASCII text, Python needs to know the encoding, and expects to be told it via sys.stdout.encoding (etc.).

In the case of the JLine console, which replaces System.in/out/err, we take the bytes written by Python and *decode* them to characters, so JLine can encode them again on the other side of its character editing. You can never have too many codecs.
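To illustrate the round trip described above, here is a minimal sketch in Python. The sample text and the choice of cp850 and cp1252 are assumptions for illustration only, not taken from the Jython source.

```python
# -*- coding: utf-8 -*-
# Illustration only: the encodings below are assumed, not read from any console.
text = u"caf\u00e9"
console_encoding = "cp850"

# What Python hands to sys.stdout once the text has been encoded.
data = text.encode(console_encoding)

# A line editor sitting in between decodes to characters for editing ...
chars = data.decode(console_encoding)
# ... and encodes again on the way out to the real console.
out = chars.encode(console_encoding)

# With the right encoding at each stage the bytes survive unchanged.
round_trip_ok = (out == data)

# With a wrong guess at the console encoding the characters are corrupted.
wrong_chars = data.decode("cp1252")
guess_matters = (wrong_chars != text)
```

The round trip is only lossless when every stage agrees on the encoding, which is why the source of that one value matters so much.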

When nothing else tells us the console encoding, we obtain it by a reflective call to the private java.io.Console.encoding(), which Java 9 doesn't like and threatens to disallow. If even that fails, we use the property file.encoding; however, this is dubious and generally misleading on Windows.


Solution proposed:

    do without the call that upsets Java 9;
    take a default supplied by the launcher (i.e. CPython), obtained by interrogating sys.stdout;
    stop paying attention to file.encoding;
    maybe use UTF-8 as a fixed last resort (or should it be None, meaning ASCII?).
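
As a rough sketch of the launcher side of this proposal: the function name console_encoding_option is hypothetical, and passing the value as a -D system property is an assumption about how the launcher might hand it to the JVM. Only the property name python.console.defaultencoding comes from the proposal itself.

```python
import sys


def console_encoding_option(stream=None):
    """Return a JVM option carrying the launcher's view of the console
    encoding, or None if the stream does not report one.

    Hypothetical helper: passing the value as a -D option is an assumption.
    """
    if stream is None:
        stream = sys.stdout
    # sys.stdout.encoding may be missing or None, e.g. when output is
    # redirected under some Python versions.
    encoding = getattr(stream, "encoding", None)
    if not encoding:
        return None
    return "-Dpython.console.defaultencoding=" + encoding
```

Returning None when the stream reports nothing leaves the decision to the fixed last resort inside Jython rather than guessing in the launcher.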

I believe this makes the order of precedence (high to low):

    python.console.encoding (from the "post properties" supplied during initialisation)
    python.console.encoding (from system properties e.g. command line)
    python.console.encoding (from registry)
    PYTHONIOENCODING environment variable
    python.console.defaultencoding (from the launcher i.e. CPython) (NEW)
    UTF-8 (or None, meaning ASCII?) (NEW)

The last resort fixed encoding will only have effect if you don't use the launcher.

We can't simply specify python.console.encoding from the launcher because then this inference would take precedence over the registry and PYTHONIOENCODING.
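
The order of precedence above can be sketched as a simple first-match lookup. This is an illustration, not the PySystemState code: the function name, its parameters and the use of plain dictionaries are all assumptions. Dropping an optional ":errors" suffix from PYTHONIOENCODING follows the format CPython documents, and is likewise an assumption here.

```python
KEY = "python.console.encoding"


def resolve_console_encoding(post_props, system_props, registry, environ,
                             launcher_default=None, last_resort="UTF-8"):
    """Return the first console encoding found, highest precedence first.

    Hypothetical sketch: each mapping stands in for one configuration source.
    """
    # PYTHONIOENCODING may be "encoding:errors"; keep only the encoding part.
    env_value = (environ.get("PYTHONIOENCODING") or "").split(":")[0]

    candidates = (
        post_props.get(KEY),    # "post properties" given at initialisation
        system_props.get(KEY),  # system properties, e.g. command line
        registry.get(KEY),      # registry
        env_value,              # PYTHONIOENCODING environment variable
        launcher_default,       # python.console.defaultencoding (NEW)
        last_resort,            # fixed last resort (NEW)
    )
    for candidate in candidates:
        if candidate:
            return candidate
    return None


# The launcher's inference loses to the registry, as intended ...
from_registry = resolve_console_encoding(
    {}, {}, {KEY: "cp1252"}, {}, launcher_default="cp850")

# ... whereas supplying it as python.console.encoding would override it.
overridden = resolve_console_encoding(
    {}, {KEY: "cp850"}, {KEY: "cp1252"}, {})
```

The two calls at the end show why the launcher needs its own lower-priority property: the same inferred value either defers to the registry or overrides it, depending only on the name under which it is supplied.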