Message3225

Author leosoto
Recipients leosoto
Date 2008-06-05.20:30:14
Message-id <1212697815.99.0.164640561695.issue1046@psf.upfronthosting.co.za>
In-reply-to
Content
It seems that regular expressions containing Unicode characters are not
well supported by the re engine:

Jython 2.3a0 on java1.6.0_06
Type "copyright", "credits" or "license" for more information.
>>> import re
>>> re.compile(u"([\u0080-\uffff])").sub(lambda x:x, "foo")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
	at org.python.modules.sre.SRE_STATE.SRE_CHARSET(SRE_STATE.java:402)
	at org.python.modules.sre.SRE_STATE.SRE_MATCH(SRE_STATE.java:616)
	at org.python.modules.sre.SRE_STATE.SRE_SEARCH(SRE_STATE.java:1171)
	at org.python.modules.sre.PatternObject.subx(PatternObject.java:130)
	at org.python.modules.sre.PatternObject.sub(PatternObject.java:80)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)

java.lang.ArrayIndexOutOfBoundsException:
java.lang.ArrayIndexOutOfBoundsException: 161
>>>
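
For comparison, here is a minimal sketch of the behaviour one would expect from the same pattern, assuming CPython's re semantics. The `tag` helper and the sample strings are hypothetical and only for illustration; unlike the identity lambda in the session above, `tag` returns a string, so it is also valid when a match does occur.

```python
import re

# Same character class as in the report: any code point from U+0080 to U+FFFF.
pattern = re.compile(u"([\u0080-\uffff])")

def tag(match):
    # Hypothetical replacement function: show the matched character's code point.
    return u"<%04x>" % ord(match.group(1))

# Pure-ASCII input contains nothing in the class, so it should come back unchanged.
ascii_result = pattern.sub(tag, u"foo")

# Input with one character inside the range (U+00E9) should have it replaced.
non_ascii_result = pattern.sub(tag, u"caf\u00e9")

print(ascii_result)      # foo
print(non_ascii_result)  # caf<00e9>
```

With input "foo" the replacement function is never called, so the call in the session above should simply return "foo" rather than raise.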