Message657

Author pedronis
Date 2002-09-03.16:18:21
Content

oops sorry,

Yes, I know System.in is an InputStream, which means it is
byte-oriented. I don't know whether it is really mandated to be
ASCII. Both the JLS 1st Ed. and the current Java API docs are
vague to the point of being useless :) on that; they just speak
of data from the keyboard, although I suppose a lot of code
assumes this. Anyway, that's not the point.

For the theoretical value of it: in a parallel universe where
both the default encoding and the flavor of the bytes from
System.in were EBCDIC, the fix would not work.

Now one can make Jython as it ships work without the fix 
on your system with 

jython -E ConsoleEnc

ConsoleEnc = ascii, ...

or by setting python.console.encoding to ConsoleEnc in the
registry, ConsoleEnc being the encoding of System.in.
That is for the console (see jython -h).
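
As a rough illustration of why the console encoding has to match the bytes that actually arrive, here is a small Python sketch. It is not Jython's console code: the helper name decode_console_bytes is hypothetical, and cp037 is used only as a stand-in for an EBCDIC console.

```python
def decode_console_bytes(raw, console_enc):
    """Decode raw bytes from a byte-oriented console stream (hypothetical helper)."""
    return raw.decode(console_enc)

# The same source text "2" as two different consoles would deliver it.
ascii_bytes = "2".encode("ascii")    # a single byte, 0x32
ebcdic_bytes = "2".encode("cp037")   # a single byte, 0xF2

# Decoding with the matching console encoding recovers the text in both cases.
text_from_ascii = decode_console_bytes(ascii_bytes, "ascii")
text_from_ebcdic = decode_console_bytes(ebcdic_bytes, "cp037")

# Decoding EBCDIC bytes as ASCII fails, because 0xF2 is outside the ASCII range.
try:
    decode_console_bytes(ebcdic_bytes, "ascii")
    misread_succeeded = True
except UnicodeDecodeError:
    misread_succeeded = False
```

The point is only that the encoding given with -E (or in the registry) must describe the bytes coming from System.in, whatever the platform default happens to be.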

I assume byte-oriented input from files gives EBCDIC data
(is that correct?).

Now, you have still found a problem, namely that
Strings passed to the parser are assumed to be
byte sequences in the default encoding (or some
encoding), represented as Strings by zero-extending
each byte to a char.

But sometimes they are just Java strings that would be
mangled if interpreted in that way (in particular if the
default encoding is not an ASCII superset), e.g.
the exec("2") in InteractiveConsole, which is the source
of the traceback you see.
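
The two situations can be modeled in a few lines of Python. This is a sketch under assumptions, not the parser's real code: zero_extend and parser_view are hypothetical names, latin-1 is used because it maps each byte to the char with the same value (which is what zero-extension amounts to), and cp037 again stands in for an EBCDIC default encoding.

```python
def zero_extend(raw):
    """Turn each byte into the char with the same numeric value."""
    return raw.decode("latin-1")

def parser_view(s, default_enc):
    """Model of the assumption: s holds zero-extended bytes in default_enc."""
    return s.encode("latin-1").decode(default_enc)

# Case 1: bytes read from an EBCDIC source, then zero-extended.
# The assumption holds and the text is recovered.
from_bytes = zero_extend("2".encode("cp037"))
recovered = parser_view(from_bytes, "cp037")

# Case 2: a genuine string, like the "2" passed to exec.
# Treating its chars as cp037 bytes produces something other than "2".
mangled = parser_view("2", "cp037")

# With an ASCII-superset default the same mistake goes unnoticed.
unnoticed = parser_view("2", "ascii")
```

Case 2 is the mangling described above. The last line shows why the problem stays hidden whenever the default encoding is an ASCII superset.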

In the parallel universe, jython -E cp037 would fail too,
for exactly this reason.

So the two cases for strings passed to the parser should
be distinguished, if possible (also considering what the
behavior should be for strings passed from Python/Java
code), and dealt with differently and separately.
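
One possible shape for that distinction, sketched in Python purely for illustration: the caller states which kind of string it is handing over. The function to_source_text and its holds_zero_extended_bytes flag are hypothetical and do not exist in Jython.

```python
def to_source_text(s, holds_zero_extended_bytes, default_enc):
    """Return real source text for the parser (hypothetical entry point)."""
    if holds_zero_extended_bytes:
        # Each char is really a byte: recover the bytes, then decode them.
        return s.encode("latin-1").decode(default_enc)
    # Already a genuine string: pass it through untouched.
    return s

# Zero-extended EBCDIC byte for "2", as byte-oriented input might deliver it.
from_stream = to_source_text("\xf2", True, "cp037")

# Genuine string, as code calling exec("2") would pass it.
from_code = to_source_text("2", False, "cp037")
```

Both calls give the same source text, whatever the default encoding is. Which path strings coming from Python or Java code should take is the open question raised above.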
History
Date                 User   Action  Args
2008-02-20 17:17:05  admin  link    issue550200 messages
2008-02-20 17:17:05  admin  create