Message12516
On Jython, no-break space (u'\xa0'), figure space (u'\u2007') and narrow no-break space (u'\u202F') are not considered to be space characters by `isspace()`. The other space characters listed at https://www.compart.com/en/unicode/category/Zs are.
This also affects string methods like `strip()` and `split()`, but the `re` module doesn't seem to be affected.
Jython 2.7.0 (default:9987c746f838, Apr 29 2015, 02:25:11)
[Java HotSpot(TM) 64-Bit Server VM (Oracle Corporation)] on java1.8.0_201
Type "help", "copyright", "credits" or "license" for more information.
>>> for ordinal in '0020 00A0 1680 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 200A 202F 205F 3000'.split():
...     char = unichr(int(ordinal, 16))
...     if not char.isspace():
...         print '%s is not space' % ordinal
...
00A0 is not space
2007 is not space
202F is not space
>>>
>>> u'\xa0...\u1680'.strip()
u'\xa0...'
>>> u'.\xa0.'.split()
[u'.\xa0.']
>>> import re
>>> re.split(r'\s+', u'.\xa0.', flags=re.UNICODE)
[u'.', u'.']
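Since `re` with `re.UNICODE` appears to handle these characters correctly in the session above, one possible workaround is to route splitting and stripping through `re`. This is only a minimal sketch: the helper names `unicode_split` and `unicode_strip` are hypothetical and not part of Jython or the standard library, and only the `re.split` case with u'\xa0' was actually checked on Jython above.

```python
import re

# Hypothetical helpers; they assume that `\s` with re.UNICODE matches
# all Unicode space characters, which the session above suggests for u'\xa0'.
_WHITESPACE = re.compile(r'\s+', re.UNICODE)


def unicode_split(text):
    """Split on runs of Unicode whitespace, like text.split() with no arguments."""
    # re.split leaves empty strings at the ends when the text starts or ends
    # with whitespace; dropping them mimics str.split().
    return [part for part in _WHITESPACE.split(text) if part]


def unicode_strip(text):
    """Remove leading and trailing Unicode whitespace, like text.strip()."""
    return re.sub(r'\A\s+|\s+\Z', '', text, flags=re.UNICODE)
```

With these helpers, `unicode_split(u'.\xa0.')` and `unicode_strip(u'\xa0...\u1680')` would be the replacements for the `split()` and `strip()` calls shown in the session.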
Date | User | Action | Args
2019-05-13 14:55:32 | pekka.klarck | set | recipients: + pekka.klarck
2019-05-13 14:55:32 | pekka.klarck | set | messageid: <1557759332.16.0.898950359564.issue2772@roundup.psfhosted.org>
2019-05-13 14:55:32 | pekka.klarck | link | issue2772 messages
2019-05-13 14:55:31 | pekka.klarck | create |