Issue925333

classification
Title: Cyrillic string
Type: Severity: normal
Components: Core Versions:
Milestone:
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: fwierzbicki Nosy List: fwierzbicki
Priority: high Keywords:

Created on 2004-03-29.14:47:41 by anonymous, last changed 2006-11-27.03:06:40 by fwierzbicki.

Messages
msg896 Author: Nobody/Anonymous (nobody) Date: 2004-03-29.14:47:41
Dear developers,

I'm trying to use Jython (version 2.1) to execute a Python program
from Java. I have a problem with Cyrillic strings.

I run the program as follows:

import org.python.core.PyString;
import org.python.util.PythonInterpreter;

public class TestPy {
  public static void main(String[] args) {
    System.out.println("Start");

    PythonInterpreter interp = new PythonInterpreter();
    // str1 is defined inside Python source passed to exec(): it comes out wrong.
    interp.exec("str1='Это строка1. This is string1  - it is bad'");
    // str2 is passed in as a PyString object: it comes out right.
    interp.set("str2", new PyString("Это строка2. This is string2  - it is OK"));
    interp.exec("print str1\nprint str2");

    System.out.println("Stop");
  }
}

I get the following result:

Start
-B> AB@>:01. This is string1  - it is bad
Это строка2. This is string2  - it is OK
Stop

str1 has the wrong value in Python.
str2 has the right value in Python.

I have done some research and found out what causes the error.
It happens because, in the classes
\org\python\core\parser.java
\org\python\core\Py.java

Jython uses the class
java.io.StringBufferInputStream(String s)

This class is deprecated and does not properly convert characters into bytes.
I replaced it with the following
java.io.ByteArrayInputStream(s.getBytes())
and I got the correct result.
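
The difference described above can be reproduced outside Jython. The following is only a minimal sketch, not the actual Jython patch: the class name CyrillicStreamDemo and its helper methods are made up for illustration, and UTF-8 is an assumed encoding chosen so the result does not depend on the platform default that a bare s.getBytes() would use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringBufferInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names here are illustrative, not part of Jython.
final class CyrillicStreamDemo {

    // Drain a stream one byte at a time into a byte array.
    static byte[] readAll(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int b;
            while ((b = in.read()) != -1) {
                out.write(b);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // StringBufferInputStream keeps only the low eight bits of each char,
    // so the high byte of every Cyrillic character is silently dropped.
    @SuppressWarnings("deprecation")
    static String viaStringBufferInputStream(String s) {
        byte[] bytes = readAll(new StringBufferInputStream(s));
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Encoding the string to bytes first (UTF-8 is an assumption here)
    // preserves every character, so decoding gives back the original text.
    static String viaByteArrayInputStream(String s) {
        byte[] bytes = readAll(new ByteArrayInputStream(s.getBytes(StandardCharsets.UTF_8)));
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Это строка" written with escapes so the source file encoding does not matter.
        String text = "\u042d\u0442\u043e \u0441\u0442\u0440\u043e\u043a\u0430";
        // Booleans are printed instead of the strings to avoid console encoding issues.
        System.out.println(viaStringBufferInputStream(text).equals("-B> AB@>:0")); // prints true
        System.out.println(viaByteArrayInputStream(text).equals(text));            // prints true
    }
}
```

The first check shows the same kind of damage as in str1: each Cyrillic character collapses to the ASCII character that shares its low byte. Whether a real fix should use the platform default encoding or a fixed one is a separate design choice that this sketch does not settle.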

Jython 2.2 alpha 0 has the same problem.

Cyrillic users of Jython (including myself) would really appreciate
it if the class
java.io.StringBufferInputStream(String s)
were replaced with
java.io.ByteArrayInputStream(s.getBytes())

This would allow non-Latin strings to be processed correctly.

Sincerely, Pavel
paul@uib.cherkassy.net
msg897 Author: Frank Wierzbicki (fwierzbicki) Date: 2005-10-31.19:28:43
I'm assigning this to myself as a reminder to replace
StringBufferInputStream with ByteArrayInputStream in Py and
parser.
msg898 Author: Frank Wierzbicki (fwierzbicki) Date: 2006-11-27.03:06:40
The proposed fix is already in.
History
Date User Action Args
2004-03-29 14:47:41 anonymous create