Issue1046

classification
Title: Problems with re and unicode
Type:
Severity: normal
Components: Core
Versions: 2.5alpha1
Milestone:
process
Status: closed
Resolution:
Dependencies:
Superseder:
Assigned To: zyasoft
Nosy List: leosoto, zyasoft
Priority:
Keywords:

Created on 2008-06-05.20:30:15 by leosoto, last changed 2008-07-14.16:28:35 by zyasoft.

Messages
msg3225 Author: Leonardo Soto (leosoto) Date: 2008-06-05.20:30:14
It seems that regular expressions containing Unicode characters are not
well supported by the re engine:

Jython 2.3a0 on java1.6.0_06
Type "copyright", "credits" or "license" for more information.
>>> import re
>>> re.compile(u"([\u0080-\uffff])").sub(lambda x:x, "foo")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
	at org.python.modules.sre.SRE_STATE.SRE_CHARSET(SRE_STATE.java:402)
	at org.python.modules.sre.SRE_STATE.SRE_MATCH(SRE_STATE.java:616)
	at org.python.modules.sre.SRE_STATE.SRE_SEARCH(SRE_STATE.java:1171)
	at org.python.modules.sre.PatternObject.subx(PatternObject.java:130)
	at org.python.modules.sre.PatternObject.sub(PatternObject.java:80)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)

java.lang.ArrayIndexOutOfBoundsException:
java.lang.ArrayIndexOutOfBoundsException: 161
>>>
msg3324 Author: Jim Baker (zyasoft) Date: 2008-07-14.16:28:35
Wide char sets are fixed
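
For reference, a minimal sketch of the behaviour one would expect from the reported pattern once wide character sets work, assuming CPython semantics for the re module. The helper name replace_wide and the sample strings are illustrative only and are not taken from the Jython test suite.

```python
import re

# Same character class as in the report: every code point from U+0080 to U+FFFF.
WIDE = re.compile(u"([\u0080-\uffff])")


def replace_wide(text, repl=u"?"):
    """Replace each character matched by WIDE with repl (hypothetical helper)."""
    return WIDE.sub(lambda match: repl, text)


# The call from the report: "foo" has no character in the class, so the
# callback is never invoked and the input comes back unchanged.
unchanged = WIDE.sub(lambda x: x, u"foo")

# A string that does contain a character in the class (U+00E9).
replaced = replace_wide(u"caf\u00e9")
```

Under these assumptions the first call should return the input string as-is rather than raising, and the second should replace only the final character.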
History
Date User Action Args
2008-07-14 16:28:35 zyasoft set status: open -> closed
versions: + 2.5alpha1
nosy: + zyasoft
messages: + msg3324
assignee: zyasoft
components: + Core
2008-06-05 20:30:15 leosoto create