Message4912

Author NNardelli
Recipients NNardelli, dgriff, fwierzbicki, mikegremi, pedronis
Date 2009-07-16.11:49:02
Message-id <1247744943.29.0.145334511976.issue550200@psf.upfronthosting.co.za>
Content
A small clarification. Using the following Java code:

  // Requires: import java.nio.charset.Charset; import java.util.SortedMap;
  // Print the JVM default charset, then every charset the JVM supports.
  Charset CS_curr = Charset.defaultCharset();
  System.out.println( "++ Current Charset locale    name: " + CS_curr.displayName() );
  System.out.println( "++ Current Charset canonical name: " + CS_curr.name() );
  SortedMap<String, Charset> CS_list = Charset.availableCharsets();
  System.out.println( "++ Charsets available on this system: " );
  int i = 1;
  for ( String CS_name : CS_list.keySet() ) {
    System.out.println( "    #" + i + "   " + CS_name );
    i = i + 1;
  }

I get:
++ Current Charset locale    name: IBM1047
++ Current Charset canonical name: IBM1047
++ Charsets available on this system:     
    #1   Big5                             
    #2   Big5-HKSCS                       
    #3   CESU-8                           
    #4   EUC-JP                           
    #5   EUC-KR                           
    #6   GB18030                          
    #7   GB2312                           
    #8   GBK                              
    #9   hp-roman8                        
    #10   IBM-Thai                        
    #11   IBM00858                        
    #12   IBM00924                        
    #13   IBM01140                        
    #14   IBM01141                        
    #15   IBM01142                        
    #16   IBM01143                        
    #17   IBM01144                        
    #18   IBM01145                        
    #19   IBM01146                        
    #20   IBM01147                        
    #21   IBM01148                        
    #22   IBM01149                        
    #23   IBM037                          
    #24   IBM1026                         
    #25   IBM1047                         
    #26   IBM273                          
    #27   IBM277                          
(... list continues, total of 217 elements ...)

It looks like Java does not take its default charset from $LANG, but
instead uses IBM1047, the EBCDIC variant defined by the MVS C compiler.
Jython does not seem to know this variant.
Is it possible that the following happens?
1) Jython, through Java, asks for the charset encoding.
2) Java answers with something like "IBM1047".
3) Jython checks this string but does not recognize it (it is not defined
in Python/Jython).
4) Jython then falls back to the default for stdin and stdout: ASCII.

If so, one elegant way to solve the problem would be to check whether
Java knows the current charset encoding and, if it does, to rely on Java
for the .encode() and .decode() methods.
That way, a few lines of code would let Jython support every encoding
that Java knows, in addition to the ones defined inside Jython.
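
To make the suggestion concrete, here is a minimal sketch of such a fallback on the Java side. The class JavaCharsetBridge and its method names are hypothetical and are not part of Jython. It assumes the Python-side codec lookup has already failed and only the charset name is available. It uses only the standard java.nio.charset API.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper, not part of Jython: delegates to java.nio.charset
// for encodings that have no codec defined on the Python side.
final class JavaCharsetBridge {

    // Returns the Java charset for the given name, or null if the JVM
    // does not support it or the name is not a legal charset name.
    static Charset lookup(String name) {
        try {
            return Charset.isSupported(name) ? Charset.forName(name) : null;
        } catch (IllegalCharsetNameException e) {
            return null;
        }
    }

    // Encodes text with the named Java charset; null if the charset is unknown.
    static byte[] encode(String text, String charsetName) {
        Charset cs = lookup(charsetName);
        return cs == null ? null : text.getBytes(cs);
    }

    // Decodes bytes with the named Java charset; null if the charset is unknown.
    static String decode(byte[] data, String charsetName) {
        Charset cs = lookup(charsetName);
        return cs == null ? null : new String(data, cs);
    }
}
```

With something like this, a call such as JavaCharsetBridge.encode(text, "IBM1047") would work on any JVM that lists IBM1047 among its available charsets, as the one above does. A null result would tell the caller to keep the current ASCII default.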