Message4904

Author	NNardelli
Recipients	NNardelli, dgriff, fwierzbicki, mikegremi, pedronis
Date	2009-07-15.15:27:04
Message-id	<1247671625.83.0.821999655497.issue550200@psf.upfronthosting.co.za>
In-reply-to

Content

Hello everybody

I did some tests with Jython and Java on an EBCDIC platform.

The environment:
================
* OMVS on one of the latest z/OS versions.
* "java -version" returns:
java version "1.6.0"
Java(TM) SE Runtime Environment (build pmz3160sr3-20081108_01(SR3))
IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31
jvmmz3160-20081107_25433 (JIT enabled, AOT enabled)
J9VM - 20081105_025433_bHdSMr
JIT  - r9_20081031_1330      
GC   - 20081027_AB)          
JCL  - 20081106_01           
* Jython version: Jython 2.5b3
* Jython is copied to jy.jar, my own Java programs to javalibs.jar
* Invoking Jython like this:
java -cp javalibs.jar -jar jy.jar mypyprog.py >out_pyth.txt 2>err_pyth.txt

The test:
=========
Attached are a Python and a Java program. The first calls the second at
various times, and both output strings in various charset encodings (a
mix of ASCII and EBCDIC) to stdout. Python/Jython uses sys.stdout.write()
while Java uses System.out.println().

The mixed output is in file outbug_binary.txt. Files outbug_expected.txt
and outbug_actual.txt provide a more human-readable (in ASCII) view of
what the expected and actual program outputs are. The result of the
comparison is visualized in outbug.pdf: the output strings are
"re-ordered"!

First remarks and conclusions:
1) Java's println() sends the string to stdout each time, with the right
EOL.

2) In the case of Python, write() sends the output to stdout only when
it encounters a character x'0A' (ASCII encoding of EOL '\n'). EBCDIC
does not use '\n', but the character NEL (or NL, Next Line), which is
x'15' in EBCDIC and '\u0085' in Unicode.
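
To make the byte values in 2) concrete, here is a small check based on the codec tables bundled with CPython. It is only a sketch: it assumes Jython's cp037 codec uses the same mapping, which has not been verified on z/OS.

```python
import sys

# Line terminators discussed in 2), encoded with CPython's bundled codecs.
# Assumption: Jython's "cp037" codec maps these characters the same way.
NEL = u"\u0085"   # Unicode NEXT LINE
LF = u"\n"        # Unicode LINE FEED

eol_bytes = {
    "NEL in cp037": NEL.encode("cp037"),
    "LF in cp037": LF.encode("cp037"),
    "LF in ascii": LF.encode("ascii"),
}

for label in sorted(eol_bytes):
    sys.stdout.write("%-14s %r\n" % (label, eol_bytes[label]))
```

In these tables NEL becomes x'15', while '\n' becomes x'25' under cp037 and x'0A' only under ASCII, so an x'0A' in the output suggests the stream is being treated as ASCII.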

3) If we use Python's print() instead of write(), then it appends a
x'0A' to whatever string, and there is no re-ordering problem. But x'0A'
is not the EBCDIC EOL, so we get mixed ASCII and EBCDIC where we do not
wish/expect it.
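
A possible workaround for the re-ordering that does not depend on any EOL character is to flush explicitly after every write. This is a minimal sketch, not a tested fix: `write_now` is a hypothetical helper name, and its behaviour on z/OS has not been checked.

```python
import sys

def write_now(text, stream=None):
    """Write text and flush immediately, so that output order does not
    depend on the stream seeing a particular end-of-line character."""
    if stream is None:
        stream = sys.stdout
    stream.write(text)
    stream.flush()

# Example: this string contains no x'0A', yet it is pushed out at once.
write_now("before calling Java ")
```

The cost is one flush per call, which should be acceptable for a test program that interleaves a few lines with Java output.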

4) My Python's current encodings are:
Jython stdout encoding= US-ASCII                   from sys.stdout.encoding
Jython stdin  encoding= US-ASCII                   from sys.stdin.encoding
System default encoding: ascii                     from sys.getdefaultencoding()
Locale, default: ('en_US', 'cp037')                from locale.getdefaultlocale()
Locale, current: (None, None)                      from locale.getlocale()
Locale, preferred encoding: cp037                  from locale.getpreferredencoding()

This is the case even though I specified the following in my .profile:
export LANG=En_US.IBM037    
export LC_CTYPE=En_US.IBM037
export LC_ALL=En_US.IBM037  

It seems that Python/Jython does not get the correct default encodings
from the terminal.
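
For anyone who wants to reproduce the listing in 4) on another system, the values can be collected with a few standard-library calls. This is a sketch; the function name `encoding_report` is made up for illustration, and locale.getdefaultlocale() is left out because recent CPython versions deprecate it.

```python
import locale
import sys

def encoding_report():
    """Collect the encoding settings listed in 4) into a dict."""
    return {
        "stdout": getattr(sys.stdout, "encoding", None),
        "stdin": getattr(sys.stdin, "encoding", None),
        "default": sys.getdefaultencoding(),
        "locale": locale.getlocale(),
        "preferred": locale.getpreferredencoding(),
    }

report = encoding_report()
for key in sorted(report):
    sys.stdout.write("%-10s %r\n" % (key, report[key]))
```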

5) sys.stdout.encoding is read-only. I believe this is wrong:
programmers should be able to set stdout's encoding on the fly.

6) Alternating encodings by replacing sys.stdout with:
* codecs.getwriter("cp037")(sys.__stdout__)
* codecs.getwriter("ascii")(sys.__stdout__)
does not work: of course, we then do not need to encode() and decode()
manually, but for write(), the character triggering a flush() of the
buffer is always x'0A'. IMHO, this is a bug: the EOL character
should be the one specified by each encoding.
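
The behaviour asked for in 6) can be sketched as a small wrapper that flushes on an EOL character chosen per encoding. This is an illustration only, not how Jython or the codecs module works: the class name `EolFlushingWriter` and its parameters are hypothetical, and the demonstration writes to an in-memory buffer rather than the real stdout.

```python
import io

class EolFlushingWriter(object):
    """Encode text for a byte stream and flush whenever the text contains
    the end-of-line character chosen for that encoding."""

    def __init__(self, raw, encoding, eol):
        self.raw = raw            # binary stream to write encoded bytes to
        self.encoding = encoding  # e.g. "cp037" or "ascii"
        self.eol = eol            # e.g. u"\u0085" for EBCDIC, u"\n" for ASCII

    def write(self, text):
        self.raw.write(text.encode(self.encoding))
        if self.eol in text:
            self.raw.flush()

# Demonstration: two writers with different encodings share one byte stream.
buf = io.BytesIO()
ebcdic = EolFlushingWriter(buf, "cp037", u"\u0085")
ascii_out = EolFlushingWriter(buf, "ascii", u"\n")
ebcdic.write(u"HELLO" + ebcdic.eol)
ascii_out.write(u"hello" + ascii_out.eol)
```

With such a wrapper, each writer carries its own EOL, so the caller appends `writer.eol` instead of hard-coding '\n'.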

7) Using print() in 6/ instead of sys.stdout.write(string+EOL) does not
work either, since, as in 3/, print() appends x'0A' in EBCDIC mode, not
x'15'.

8) I tried Jython in interactive mode in OMVS on a z/OS terminal (TN3270).
It starts, but input does not work at all; it looks like Jython does not
like EBCDIC on stdin.
This is probably the same encoding problem as for stdout.


=======
So, what do you think of all this?
I hope I have provided enough information to help you debug.
If not, simply mail me; I can run some tests on the system for you.

History
Date	User	Action	Args
2009-07-15 15:27:05	NNardelli	set	messageid: <1247671625.83.0.821999655497.issue550200@psf.upfronthosting.co.za>
2009-07-15 15:27:05	NNardelli	set	recipients: + NNardelli, pedronis, fwierzbicki, dgriff, mikegremi
2009-07-15 15:27:05	NNardelli	link	issue550200 messages
2009-07-15 15:27:04	NNardelli	create