Issue2548
Created on 2017-02-02.16:17:21 by stefan.richthofer, last changed 2017-02-27.04:49:45 by zyasoft.
msg11064 |
Author: Stefan Richthofer (stefan.richthofer) |
Date: 2017-02-02.16:17:21 |
|
CPython 2.7 allows writing something like
s = u'\N{DOUBLE-STRUCK ITALIC SMALL D}'
In principle Jython supports that character via
s = u'\u2146'
it is just the \N{...} escape notation that is not supported.
This issue is a blocker for SymPy support, see #1777.
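For reference, a minimal sketch of the equivalence described above. It assumes an interpreter whose Unicode name database knows U+2146 (current CPython 2.7 and 3.x do); `unicodedata.lookup` is used here only as the standard-library runtime counterpart of the `\N{...}` literal escape. On an affected Jython build, the `\N{...}` literal is the part expected to fail.

```python
import unicodedata

# Resolve the character by its Unicode name at runtime.
by_name = unicodedata.lookup('DOUBLE-STRUCK ITALIC SMALL D')

# The same character written with a numeric escape.
by_code = u'\u2146'

# The same character written with the named escape from this report.
literal = u'\N{DOUBLE-STRUCK ITALIC SMALL D}'

# All three spellings should denote the same single code point.
same = (by_name == by_code == literal)
print(same)
```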
|
msg11067 |
Author: Stefan Richthofer (stefan.richthofer) |
Date: 2017-02-02.17:38:11 |
|
Okay, it looks like ucnhash.dat wasn't updated for quite some time (was it ever?). Misc/make_ucnhashdat.py still has "UnicodeData-3.0.0.txt" hard-coded. After some investigation I found that UnicodeData-3.0.0.txt was released in 2001. The newest release as of this writing is 9.0.
So this issue seems to be just a matter of updating ucnhash.dat. Changing title accordingly.
|
msg11068 |
Author: Aaron Meurer (asmeurer) |
Date: 2017-02-02.18:40:19 |
|
Feel free to nosy me on any SymPy blockers.
|
msg11070 |
Author: Stefan Richthofer (stefan.richthofer) |
Date: 2017-02-02.20:07:55 |
|
Aaron: Alright, my pleasure.
Creating ucnhash.dat for the current Unicode 9.0 turns out to be more challenging than I expected. The script responsible for this, Misc/make_ucnhashdat.py, seems to be ancient. It writes several values as 16-bit integers, but with the current UnicodeData.txt these values exceed 65535, e.g.:
Raw size = 184608
    writeUcnhashDat()
  File "/data/workspace/linux/Jython/stewori/jython/Misc/make_ucnhashdat.py", line 340, in writeUcnhashDat
    raw.writeto(outf)
  File "/data/workspace/linux/Jython/stewori/jython/Misc/make_ucnhashdat.py", line 188, in writeto
    file.write(struct.pack("!H", self.size()))
struct.error: 'H' format requires 0 <= number <= 65535
So I'll have to switch some of these fields to 32-bit numbers, which will also require changes in the parser, ucnhash.java.
Stay tuned...
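The overflow in the traceback can be reproduced in isolation. The sketch below uses only the standard `struct` module and the raw size reported above; it is not taken from make_ucnhashdat.py, and the helper name `pack_size` is made up for illustration.

```python
import struct

RAW_SIZE = 184608  # the "Raw size" value reported by the script above


def pack_size(fmt, value):
    """Hypothetical helper: return the packed bytes, or None if value does not fit fmt."""
    try:
        return struct.pack(fmt, value)
    except struct.error:
        return None


# "!H" is an unsigned 16-bit big-endian field: maximum 65535, so this fails.
as_16bit = pack_size("!H", RAW_SIZE)

# "!I" is an unsigned 32-bit big-endian field: maximum 4294967295, so this fits.
as_32bit = pack_size("!I", RAW_SIZE)

print(as_16bit is None)
print(len(as_32bit))
```

Widening a field this way changes the file layout, since whatever reads the file has to consume four bytes instead of two for that field. That is presumably why the parser needs to change along with the script.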
|
msg11072 |
Author: Stefan Richthofer (stefan.richthofer) |
Date: 2017-02-03.18:28:50 |
|
Fixed as of https://github.com/jythontools/jython/commit/ebb7f49b47290fe20b1d72991b7a2d37f256fd92.
|
|
Date | User | Action | Args |
2017-02-27 04:49:45 | zyasoft | set | status: pending -> closed |
2017-02-03 18:28:50 | stefan.richthofer | set | status: open -> pending; resolution: fixed; messages: + msg11072 |
2017-02-02 20:07:55 | stefan.richthofer | set | messages: + msg11070 |
2017-02-02 18:40:19 | asmeurer | set | nosy: + asmeurer; messages: + msg11068 |
2017-02-02 17:38:11 | stefan.richthofer | set | messages: + msg11067; title: Unicode notation u'\N{charachter name}' not supported. -> Unicode u'\N{name}' frequently broken, because ucnhash.dat outdated |
2017-02-02 16:17:21 | stefan.richthofer | create | |
|