Message9190

Author thatch
Recipients thatch
Date 2014-11-05.06:03:44
Message-id <1415167424.89.0.853662213387.issue2226@psf.upfronthosting.co.za>
In-reply-to
Content
Found that SRE_STATE.java handles SRE_CATEGORY_UNI_SPACE by simply delegating to Character.isWhitespace, which has a note specifically about 0x00a0 (it is excluded because it is a non-breaking space).

The Javadoc at http://docs.oracle.com/javase/7/docs/api/java/lang/Character.html#isWhitespace(char) calls out the few exceptions.  isSpaceChar looks better in the docs, but it does not cover the control characters decimal 9, 10, 11, 12, 13, 28, 29, 30, 31.

CPython also handles 133 (0x0085) specially.  Its Unicode category is Cc (same as 10, 13, etc.), so neither Java method treats it as a space.
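To make the difference concrete, here is a rough sketch of what a combined check could look like. The class name SpaceCheck and the helper isPySpace are hypothetical names for illustration only, not anything in SRE_STATE.java, and the union below only approximates CPython's whitespace set (the exact set depends on the Unicode version each runtime ships with).

```java
public class SpaceCheck {
    // Hypothetical helper: approximates CPython's notion of Unicode whitespace
    // by taking the union of the two Java predicates plus U+0085.
    static boolean isPySpace(int cp) {
        return Character.isWhitespace(cp)   // tab, LF, VT, FF, CR, 0x1C-0x1F, separators except no-break spaces
            || Character.isSpaceChar(cp)    // Zs/Zl/Zp, including no-break spaces such as 0x00A0
            || cp == 0x85;                  // NEL, category Cc, matched by neither method above
    }

    public static void main(String[] args) {
        // A few code points where the three predicates can disagree.
        int[] samples = {0x09, 0x0D, 0x1C, 0x20, 0x85, 0xA0};
        for (int cp : samples) {
            System.out.printf("U+%04X isWhitespace=%b isSpaceChar=%b isPySpace=%b%n",
                    cp, Character.isWhitespace(cp), Character.isSpaceChar(cp), isPySpace(cp));
        }
    }
}
```

Whether the real fix should be a union like this or an explicit table copied from CPython is an open question; the sketch is only meant to show where the two Java methods fall short.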

Would you accept a patch that makes Jython agree with the CPython implementation here?
History
Date User Action Args
2014-11-05 06:03:44  thatch  set  messageid: <1415167424.89.0.853662213387.issue2226@psf.upfronthosting.co.za>
2014-11-05 06:03:44  thatch  set  recipients: + thatch
2014-11-05 06:03:44  thatch  link  issue2226 messages
2014-11-05 06:03:44  thatch  create