Issue550200

classification
Title: Jython does not work on ebcdic platforms
Type: Severity: normal
Components: Core Versions: Jython 2.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: fwierzbicki Nosy List: NNardelli, dgriff, fwierzbicki, mikegremi, pedronis, zyasoft
Priority: normal Keywords:

Created on 2002-04-29.15:03:00 by dgriff, last changed 2014-06-14.00:49:27 by zyasoft.

Files
File name Uploaded Description
jython_ebcdic_bugs.zip NNardelli, 2009-07-15.15:27:04 6 files
unnamed dgriff, 2009-09-11.13:58:03
Enc_print.java NNardelli, 2009-09-14.14:11:05 Java class used by trois_flushbug.py
Messages
msg653 (view) Author: David Griffiths (dgriff) Date: 2002-04-29.15:03:00
I copied jython.jar to our z/OS machine and typed
java -classpath jython.jar org.python.util.jython
and this is what happened:

Jython 2.1 on java1.3.1 (JIT: jitc)
Traceback (innermost last):
  (no code object) at line 0
  File "<string>", line 1
        ¬ZZ
        ^
SyntaxError: Lexical error at line 1, column 1.  
Encountered: "\u0016" (22), after : ""

Luckily there is a simple fix. Just change all 
references to the deprecated StringBufferInputStream 
to use ByteArrayInputStream instead. This is in 
parser.java and Py.java. There are only four such 
uses. Here is an example:

//node = parse(new StringBufferInputStream(string),
node = parse(new ByteArrayInputStream(string.getBytes()),


After I made that change it worked (at least I now 
have a command line prompt and basic commands seem to 
work, I haven't done any serious testing yet).
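
For anyone trying to reproduce this without a z/OS machine, the following is a minimal Python sketch of the failure, not Jython source. It assumes that cp037 can stand in for the JVM's EBCDIC default encoding, that the parser decodes its input stream with that default encoding, and it models StringBufferInputStream (which keeps only the low eight bits of each character) with a simple mask. All helper names are hypothetical.

```python
# Sketch only: cp037 stands in for the JVM's EBCDIC default encoding.
DEFAULT_ENCODING = "cp037"  # assumption; chosen because Python ships this codec


def low_byte_stream(text):
    """Model StringBufferInputStream: keep the low 8 bits of each char."""
    return bytes(ord(ch) & 0xFF for ch in text)


def default_encoded_stream(text):
    """Model ByteArrayInputStream(string.getBytes()): use the default encoding."""
    return text.encode(DEFAULT_ENCODING)


def read_as_default(data):
    """Model a reader that decodes the stream with the default encoding."""
    return data.decode(DEFAULT_ENCODING)


source = "2"
broken = read_as_default(low_byte_stream(source))
fixed = read_as_default(default_encoded_stream(source))

print(repr(broken))  # the ASCII byte 0x32 read as EBCDIC is a control character
print(repr(fixed))   # the round trip through the default encoding is lossless
```

Under these assumptions the first path turns a one-character source string into U+0016, the same character the lexer complains about in the traceback, while the second path returns the string unchanged.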

Cheers,

Dave
msg654 (view) Author: Samuele Pedroni (pedronis) Date: 2002-09-03.14:42:37
If I understand correctly, you got the traceback even
before the prompt appeared, and with the workaround
things work, meaning you can execute statements at the
prompt.

From that and the context I would guess that
the default encoding of the JVM in use is ebcdic,

and that on the other hand from System.in you get something
ascii flavored.

Are these guesses correct?
msg655 (view) Author: David Griffiths (dgriff) Date: 2002-09-03.15:02:15
Hi, yes all those guesses are correct.

I know it sounds weird that System.in is ascii but it's a 
legacy of the times before Sun went over to Reader classes.
msg657 (view) Author: Samuele Pedroni (pedronis) Date: 2002-09-03.16:18:21
oops sorry,

Yes, I know System.in is an InputStream, which means it is
byte-oriented. I don't know if it's really mandated to be ascii.
Both the JLS 1st Ed and the current Java API doc are vague
to the point of being useless :) on that, speaking only of data
from the keyboard, although I suppose a lot of code assumes this.
Anyway, that's not the point.

For the theoretical value of it, in a parallel universe where
both the default encoding and the flavor of the bytes from
System.in were ebcdic, the fix would not work.

Now one can make Jython as it ships work without the fix 
on your system with 

jython -E ConsoleEnc

ConsoleEnc = ascii, ...

or by setting python.console.encoding to ConsoleEnc in the
registry, ConsoleEnc being the encoding of System.in.
That is for the console (see jython -h).

I assume byte-oriented input on files gives ebcdic data,
(is that correct?)

Now you have still found a problem, namely that
Strings passed to the parser are assumed to be
byte sequences in the default encoding (or some
encoding), represented by zero-extending each byte
to a char, so that they are also Strings.

But sometimes they are just java strings that would be 
mangled if interpreted in that way (in particular if the
default encoding is not an ascii superset), e.g.
the exec("2") in InteractiveConsole that is the source
of the traceback you see. 

In the parallel universe jython -E cp037 would fail too,
exactly for this problem.

So the two cases for strings passed to the parser should
be distinguished, if possible (also considering what the
behavior should be for strings passed from python/java
code), and dealt with differently and separately.
msg658 (view) Author: Samuele Pedroni (pedronis) Date: 2002-09-04.19:59:17
Concretely,

exec("2") shows that CompilerFlags is not the right place
for the encoding,

parse(InputStream) should grow a separate encoding
argument,

the versions taking a String should just assume that it is
unicode,

and decoding sequences of bytes held as Strings to unicode
with respect to an encoding should be moved up the call chain.

At least that's how I see things.
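
The split proposed above can be sketched roughly as follows. This is an illustration in plain Python rather than the Jython parser API: parse_text and parse_stream are hypothetical names, ast.parse stands in for the real parser, and cp037 is assumed as the EBCDIC encoding because Python ships that codec.

```python
import ast
import io


def parse_text(source):
    """Text entry point: the argument is already Unicode, so never re-decode it."""
    return ast.parse(source)


def parse_stream(stream, encoding):
    """Byte entry point: the caller states the encoding explicitly."""
    return parse_text(stream.read().decode(encoding))


# An EBCDIC-encoded script and an in-memory string parse to the same tree.
ebcdic_script = io.BytesIO("x = 2\n".encode("cp037"))
tree_from_bytes = parse_stream(ebcdic_script, "cp037")
tree_from_text = parse_text("x = 2\n")

print(ast.dump(tree_from_bytes) == ast.dump(tree_from_text))
```

The point of the design is that decoding happens exactly once, at the boundary where bytes enter, so an internal call such as exec("2") never passes through the default encoding at all.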
msg659 (view) Author: Michael Greifeneder (mikegremi) Date: 2003-11-05.10:05:45
I also have to comment out the line with exec("2")
in the method org.python.util.InteractiveConsole.interact(String):
// Dummy exec in order to speed up response on first command
//exec("2");

I don't know why this is needed, but on OS/390 it produces a failure.

Mike
msg3744 (view) Author: Frank Wierzbicki (fwierzbicki) Date: 2008-11-03.22:06:37
Does anyone have a z/OS box handy to try this on?  I suspect trunk or
the 2.5 beta should work since the original poster identified the
deprecated StringBufferInputStream as the culprit, and that is long gone.
msg3746 (view) Author: David Griffiths (dgriff) Date: 2008-11-03.23:12:19
I'm still - for my sins - supporting Java on z/OS, so I'll give it a
bash tomorrow.
msg3749 (view) Author: David Griffiths (dgriff) Date: 2008-11-04.13:07:43
Ok, it works, kind of. But only if you give it an ascii script. And the
output is also in ascii. This is with Java version "J2RE 1.5.0 IBM z/OS
build jclmz31dev-20081030".

BTW the installation worked fine in headless mode. I just did a
standalone install.
msg3750 (view) Author: Frank Wierzbicki (fwierzbicki) Date: 2008-11-04.14:39:38
dgriff: wow I wasn't really expecting a response, much less from the
original poster :).  btw the encoding flag has changed from -E to -C but
I haven't really given it a good try yet.  I will see what encodings I
have locally and see what happens when I change my console - though I
won't be able to get to it for a little while.  I've assigned this
bug to myself.  I'll let you know when to give it another try if you are
willing, thanks!
msg4904 (view) Author: Nardelli (NNardelli) Date: 2009-07-15.15:27:04
Hello everybody

I did some tests with Jython and Java on an EBCDIC platform.

The environment:
================
* OMVS on one of the latest z/OS versions.
* "java -version" returns:
java version "1.6.0"
Java(TM) SE Runtime Environment (build pmz3160sr3-20081108_01(SR3))
IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31
jvmmz3160-20081107_25433 (JIT enabled, AOT enabled)
J9VM - 20081105_025433_bHdSMr
JIT  - r9_20081031_1330      
GC   - 20081027_AB)          
JCL  - 20081106_01           
* Jython version: Jython 2.5b3
* Jython is copied to jy.jar, own java programs to javalibs.jar
* Invoking Jython like this: java -cp javalibs.jar -jar jy.jar mypyprog.py >out_pyth.txt 2>err_pyth.txt

The test:
=========
Attached are a Python and a Java program. First calls second at various
times, and both output strings in various charset encodings (a mix of
ASCII and EBCDIC) to stdout. Python/Jython uses sys.stdout.write() while
Java uses System.out.println()

The mixed output is in file outbug_binary.txt. Files outbug_expected.txt
and outbug_actual.txt provide a more human-readable (in ASCII) view of
what the expected and actual program outputs are. The result of the
comparison is visualized in outbug.pdf: the output strings are
"re-ordered"!

First remarks and conclusions:
1) Java's println() sends the string to stdout each time, with the right
EOL.

2) In the case of Python, write() sends the output to stdout only when
it encounters a character x'0A' (ASCII encoding of EOL '\n'). EBCDIC
does not use '\n', but character NEL (or NL, Next Line), in EBCDIC
x'15', and UNICODE '\u0085'.

3) If we use Python's print() instead of write(), then it appends an
x'0A' to whatever string, and there is no re-ordering problem. But x'0A'
is not EBCDIC, so we have mixed ASCII and EBCDIC where we do not
wish/expect it.

4) My Python's current encodings are:
Jython stdout encoding= US-ASCII                   from sys.stdout.encoding
Jython stdin  encoding= US-ASCII                   from sys.stdin.encoding
System default encoding: ascii                     from sys.getdefaultencoding()
Locale, default: ('en_US', 'cp037')                from locale.getdefaultlocale()
Locale, current: (None, None)                      from locale.getlocale()
Locale, preferred encoding: cp037                  from locale.getpreferredencoding()

Even though I specified the following in my .profile:
export LANG=En_US.IBM037    
export LC_CTYPE=En_US.IBM037
export LC_ALL=En_US.IBM037  

Seems like Python/Jython does not get the correct default encodings from
the terminal.

5) sys.stdout.encoding is read-only. I believe this is wrong,
programmers should be able to set stdout's encoding on the fly.

6) Alternating encodings by replacing sys.stdout with:
* codecs.getwriter("cp037")(sys.__stdout__)
* codecs.getwriter("ascii")(sys.__stdout__)
does not work: of course, we do not need to encode() and decode()
manually, but for write(), the character triggering a flush() of the
buffer is always x'0A'. IMHO, this is a bug: the EOL character
should be the one specified by each encoding.

7) Using print() in 6) instead of sys.stdout.write(string+EOL) does not
work, since as in 3), print() appends x'0A' in EBCDIC mode, not x'15'.

8) Tried Jython in interactive mode in OMVS on a z/OS terminal (TN3270).
It starts, but input does not work at all; it looks like Jython does not
like EBCDIC on stdin.
Probably the same encoding problems as for stdout.
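
To illustrate the behaviour asked for in points 2) and 6), here is a small Python sketch of a writer that flushes on the end-of-line character of its own encoding instead of always on x'0A'. It is not how Jython's sys.stdout is implemented; EolFlushingWriter and CountingStream are hypothetical names, and the sketch assumes NEL (U+0085) is the wanted EOL for cp037.

```python
import io


class EolFlushingWriter:
    """Encode text and flush whenever the configured end-of-line appears."""

    def __init__(self, raw, encoding, eol):
        self.raw = raw            # underlying binary stream
        self.encoding = encoding  # e.g. "cp037" or "ascii"
        self.eol = eol            # the EOL character chosen for this encoding

    def write(self, text):
        self.raw.write(text.encode(self.encoding))
        if self.eol in text:
            self.raw.flush()


class CountingStream(io.BytesIO):
    """In-memory stream that counts flush() calls, for demonstration only."""

    def __init__(self):
        self.flushes = 0
        super().__init__()

    def flush(self):
        self.flushes += 1
        super().flush()


NEL = "\x85"  # Unicode NEXT LINE, which cp037 encodes as x'15'

stream = CountingStream()
out = EolFlushingWriter(stream, "cp037", NEL)
out.write("no newline yet")      # stays buffered, no flush
out.write("end of line" + NEL)   # contains the EBCDIC EOL, so it flushes

print(stream.flushes)
print(stream.getvalue()[-1:] == b"\x15")
```

With a writer like this, the flush trigger travels with the encoding, so switching between an ASCII and an EBCDIC writer would not leave EBCDIC lines stuck in the buffer.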


=======
So, what do you think of all this?
I hope I provided enough information to help you debug.
If not, simply mail me, I can do some tests on the system for you.
msg4912 (view) Author: Nardelli (NNardelli) Date: 2009-07-16.11:49:02
A small clarification. Using the following Java code:

  // Requires: import java.nio.charset.Charset; import java.util.SortedMap;
  // Report the JVM default charset, then list every charset the JVM knows.
  Charset CS_curr = Charset.defaultCharset();
  System.out.println( "++ Current Charset locale    name: " + CS_curr.displayName() );
  System.out.println( "++ Current Charset canonical name: " + CS_curr.name() );
  SortedMap<String, Charset> CS_list = Charset.availableCharsets();
  System.out.println( "++ Charsets available on this system: " );
  int i=1;
  for ( String CS_name: CS_list.keySet() ) {
    System.out.println( "    #" + i + "   " + CS_name ) ;
    i=i+1;
  }

I get:
++ Current Charset locale    name: IBM1047
++ Current Charset canonical name: IBM1047
++ Charsets available on this system:     
    #1   Big5                             
    #2   Big5-HKSCS                       
    #3   CESU-8                           
    #4   EUC-JP                           
    #5   EUC-KR                           
    #6   GB18030                          
    #7   GB2312                           
    #8   GBK                              
    #9   hp-roman8                        
    #10   IBM-Thai                        
    #11   IBM00858                        
    #12   IBM00924                        
    #13   IBM01140                        
    #14   IBM01141                        
    #15   IBM01142                        
    #16   IBM01143                        
    #17   IBM01144                        
    #18   IBM01145                        
    #19   IBM01146                        
    #20   IBM01147                        
    #21   IBM01148                        
    #22   IBM01149                        
    #23   IBM037                          
    #24   IBM1026                         
    #25   IBM1047                         
    #26   IBM273                          
    #27   IBM277                          
(... list continues, total of 217 elements ...)

It looks like Java does not take $LANG into account, but uses IBM1047, the
EBCDIC variant defined by the MVS C compiler. Jython does not seem to know
this variant.
Is it possible that:
1) Jython, through Java, asks for the charset encoding
2) Java answers something like "IBM1047"
3) Jython checks for this string, but does not recognize it (not defined
in Python/Jython)
4) Jython then uses the default for stdin and stdout: ASCII
???

In this case, one elegant way to solve the problem would simply be to
check if Java knows the current charset encoding, and if yes, then rely
on Java for the .encode() and .decode() methods.
Therefore, in a few lines of code, you could support all encodings that
Java knows, in addition to the ones which are defined inside of Jython.
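
The lookup-then-fallback sequence guessed at in steps 1) to 4) can be sketched in plain Python as below. This is only a model of the suspected behaviour, not Jython's actual startup code: pick_stream_encoding is a hypothetical name, and codecs.lookup stands in for whatever check Jython performs. In Jython the except branch is where a query to Java's own charset registry could be tried before giving up.

```python
import codecs


def pick_stream_encoding(reported_name, fallback="ascii"):
    """Return a usable codec name for the charset the platform reports.

    Falls back only when no codec is registered under that name.
    """
    try:
        return codecs.lookup(reported_name).name
    except LookupError:
        # Suspected failure point: the platform name is not a known codec.
        return fallback


print(pick_stream_encoding("IBM037"))
print(pick_stream_encoding("no-such-charset"))
```

A name that has a registered alias resolves to its codec, while an unrecognized name silently becomes ASCII, which would match the US-ASCII stdin and stdout encodings reported earlier.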
msg4931 (view) Author: Nardelli (NNardelli) Date: 2009-07-20.14:34:05
Another clarification: the .py files were written in ASCII format on a PC,
and were FTP-ed to the mainframe.
So, by default, Jython reads ASCII files, and not platform-specific ones.
msg5135 (view) Author: Frank Wierzbicki (fwierzbicki) Date: 2009-09-11.13:37:31
I've upped the priority to normal, since NNardelli's analysis gives some
real hope of getting this figured out.  NNardelli -- I'm not likely to
be able to look at this for about 2 weeks due to some other commitments,
but when I do, it would be great if I could get some kind of access to
Z/OS to try stuff out - any ideas on how I could get Z/OS access?
msg5136 (view) Author: David Griffiths (dgriff) Date: 2009-09-11.13:58:04
Hi Frank, can't give you access but I can run tests for you if you like.

Cheers,

Dave

msg5137 (view) Author: David Griffiths (dgriff) Date: 2009-09-11.14:00:33
Yikes, should have selected plain text, sorry about that.

Dave

msg5138 (view) Author: Frank Wierzbicki (fwierzbicki) Date: 2009-09-11.15:04:32
David, thanks -- I'll keep that in mind if I don't get access some other
way.
msg5155 (view) Author: Nardelli (NNardelli) Date: 2009-09-14.14:11:05
Frank noticed that I forgot a java class. It's the one used by the
Jython program when calling :
from enc_print import Enc_print

It's attached as Enc_print.java now.
msg5156 (view) Author: Nardelli (NNardelli) Date: 2009-09-14.14:12:39
I can run tests on a z/OS for you, if you like.
msg7395 (view) Author: Frank Wierzbicki (fwierzbicki) Date: 2012-08-13.18:33:14
Since I don't have access to z/OS this one remains tough. I'd be willing to review patches from folks on z/OS - Nardelli has a nice suggestion for a workaround if someone wants to pursue it.
msg7400 (view) Author: David Griffiths (dgriff) Date: 2012-08-14.11:45:46
If I try the very first test reported under this bug:

java -classpath jython-standalone-2.5.3.jar org.python.util.jython

then I get first in ebcdic:

Jython 2.5.3 (2.5:c56500f08d34+, Aug 13 2012, 14:54:35)
[IBM J9 VM (IBM Corporation)] on java1.5.0
Type "help", "copyright", "credits" or "license" for more information.

followed by this in ascii:

LookupError: unknown encoding 'ibm-1047'

Still happy to run tests if you like!

Cheers,

Dave
msg7402 (view) Author: Frank Wierzbicki (fwierzbicki) Date: 2012-08-14.16:49:06
David: it's cool that you are still listening after all these years - I think the Oracle acquisition of Sun killed my spike on really trying to fix this. I'll have to take another look - if I had an example that at least simulated the problem on non z/OS I could give it a try. I don't remember if Nardelli's examples provide this or not.
msg8561 (view) Author: Jim Baker (zyasoft) Date: 2014-05-22.02:41:38
With the work on #1066, we might also be able to take advantage of Java's cp1047 support for EBCDIC, see http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4891216

Related to this: https://mail.python.org/pipermail/python-dev/2007-October/074991.html
msg8641 (view) Author: Jim Baker (zyasoft) Date: 2014-06-14.00:49:27
IBM-1047 encodings should now be available as of http://hg.python.org/jython/rev/6c718e5e9ae9. See #1066 for more details.
History
Date User Action Args
2014-06-14 00:49:27 zyasoft set messages: + msg8641
2014-05-22 02:41:39 zyasoft set nosy: + zyasoft; messages: + msg8561; title: Jython doesn't work on ebcdic platforms -> Jython does not work on ebcdic platforms
2013-02-25 18:51:32 fwierzbicki set versions: + Jython 2.5, - 2.5.1
2012-08-14 16:49:57 fwierzbicki set assignee: fwierzbicki
2012-08-14 16:49:07 fwierzbicki set messages: + msg7402
2012-08-14 11:45:47 dgriff set messages: + msg7400
2012-08-13 18:33:14 fwierzbicki set messages: + msg7395
2012-01-03 00:12:49 fwierzbicki set assignee: fwierzbicki -> (no value)
2009-09-14 14:12:39 NNardelli set messages: + msg5156
2009-09-14 14:11:05 NNardelli set files: + Enc_print.java; messages: + msg5155
2009-09-11 15:04:32 fwierzbicki set messages: + msg5138
2009-09-11 14:00:33 dgriff set messages: + msg5137
2009-09-11 13:58:04 dgriff set files: + unnamed; messages: + msg5136
2009-09-11 13:37:32 fwierzbicki set priority: low -> normal; messages: + msg5135
2009-07-20 14:34:05 NNardelli set messages: + msg4931
2009-07-16 11:49:03 NNardelli set messages: + msg4912
2009-07-15 15:27:05 NNardelli set files: + jython_ebcdic_bugs.zip; nosy: + NNardelli; messages: + msg4904
2009-03-30 16:41:01 fwierzbicki set versions: + 2.5.1
2008-11-04 14:39:39 fwierzbicki set assignee: fwierzbicki; messages: + msg3750
2008-11-04 13:07:44 dgriff set messages: + msg3749
2008-11-03 23:12:19 dgriff set messages: + msg3746
2008-11-03 22:06:37 fwierzbicki set nosy: + fwierzbicki; messages: + msg3744
2002-04-29 15:03:00 dgriff create