Issue2716
Created on 2018-11-10.09:12:22 by k870611, last changed 2019-03-07.23:34:07 by jeff.allen.
msg12176 (view) |
Author: k870611 (k870611) |
Date: 2018-11-10.09:12:21 |
|
For example, in Python, with
a = "hello" and b = "hello", both "a is b" and "a == b" return True.
But in Jython 2.7.0, "a is b" returns False while "a == b" returns True.
|
msg12177 (view) |
Author: Stefan Richthofer (stefan.richthofer) |
Date: 2018-11-10.11:45:57 |
|
Looks like CPython has a different string interning policy. This behavior is fairly internal, and I guess your code had better not depend on it. Not sure if we should consider this a bug. How does Jython 2.7.1 behave?
|
msg12178 (view) |
Author: k870611 (k870611) |
Date: 2018-11-10.14:28:00 |
|
Jython 2.7.1 has the same issue. Is it like Java String comparison ("==" vs .equals())?
|
msg12179 (view) |
Author: Stefan Richthofer (stefan.richthofer) |
Date: 2018-11-10.21:00:32 |
|
Python's "is" corresponds to Java's == and Python's == corresponds to Java's equals. So you see that the same string literal created twice should not be the same object. CPython applies some memory optimization here called string interning. Jython uses interning as well, but currently not on literals. AFAIK it uses it on dictionary keys only.
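As a minimal sketch of the distinction described above (the variable names are illustrative only, and the strings are built at runtime so that no literal sharing is involved):

```python
# Two equal strings assembled at runtime, so no shared literal is involved.
a = "".join(["hel", "lo"])
b = "".join(["hel", "lo"])

# Equality compares contents (like Java's equals) and is reliable on
# every Python implementation.
print(a == b)

# Identity compares object references (like Java's ==). Whether two equal
# strings happen to be one object is an implementation detail, so portable
# code should not branch on this result.
print(a is b)

# Identity is only guaranteed when both names refer to the same object.
c = a
print(c is a)
```

Only the first and last comparisons have a result that can be relied on across CPython and Jython.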
|
msg12180 (view) |
Author: k870611 (k870611) |
Date: 2018-11-11.07:23:56 |
|
:) I hope the memory optimization can be extended so that Jython behaves exactly the same as CPython; otherwise it is hard to work across the two implementations (Jython and CPython).
|
msg12198 (view) |
Author: Andy Merton (amoebam) |
Date: 2018-12-06.23:18:53 |
|
no, jython should not try to match arbitrary undocumented implementation details of cpython. a brief look at some of cpython's behavior should make it clear why.
for example, "hello" may be interned, but what if we include a non-alphanumeric character?
>>> a = "hello!"
>>> b = "hello!"
>>> a is b
False
except it turns out this is a case where a semicolon is not the same thing as a newline in cpython, because if we use a semicolon we get the opposite result:
>>> a = "hello!"; b = "hello!"
>>> a is b
True
now, obviously the length of the string matters too, because non-alphanumeric strings of length 1 do get interned:
>>> a = "!"
>>> b = "!"
>>> a is b
True
is it just simple literals? no! you also have to consider constant folding, for which you want to produce just the right amount of optimization; you have to replace x + y or x * y with a simple literal, but of course you mustn't combine the two and replace x + y * z with a literal!
>>> a = "aaaaa"
>>> b = "aa" + "aaa"
>>> c = "a" * 5
>>> d = "a" + "a" * 4
>>> a is b, a is c, a is d
(True, True, False)
and obviously * should only work if the result is <= 20 characters, even though a 21-character string will still be interned if you write it out in full ...
>>> a20 = "aaaaaaaaaaaaaaaaaaaa"
>>> a21 = "aaaaaaaaaaaaaaaaaaaaa"
>>> b20 = "a" * 20
>>> b21 = "a" * 21
>>> c21 = "aaaaaaaaaaaaaaaaaaaaa"
>>> a20 is b20, a21 is b21, a21 is c21
(True, False, True)
... but, of course, these last two examples produce different results in cpython 3 ...
in short: it would be foolish to try to make jython match cpython here. code that relies on implementation details like these is not correct python, and jython should not try to make it work.
pass your strings through the intern() function instead, if you really need to use "is" instead of "==".
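a rough sketch of that suggestion, assuming the strings are plain str objects. `canonical` is a hypothetical helper name, and the import fallback is only there because intern() is a builtin in python 2 / jython 2.7 but lives in sys in python 3:

```python
try:
    from sys import intern  # Python 3 location
except ImportError:
    pass  # Python 2 / Jython 2.7: intern() is already a builtin


def canonical(s):
    """Return the interned (canonical) copy of a str (hypothetical helper)."""
    return intern(s)


# Build the strings at runtime so no compile-time literal sharing applies.
a = canonical("".join(["hello", "!"]))
b = canonical("".join(["hello", "!"]))

# Both names now refer to the single interned object, so an identity
# comparison is meaningful here.
print(a is b)
print(a == b)
```

this only helps if every string on both sides of the "is" has gone through intern(); comparing an interned string against one that has not been interned is back to relying on implementation details.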
|
msg12199 (view) |
Author: Stefan Richthofer (stefan.richthofer) |
Date: 2018-12-07.09:05:32 |
|
Andy, thanks for clarifying the situation. I wasn't aware that it's that "bad". Given this view, this issue is certainly a "won't fix".
|
|
Date | User | Action | Args |
2019-03-07 23:34:07 | jeff.allen | set | status: pending -> closed |
2018-12-07 09:05:33 | stefan.richthofer | set | priority: low; status: open -> pending; resolution: wont fix; messages: + msg12199 |
2018-12-06 23:18:55 | amoebam | set | nosy: + amoebam; messages: + msg12198 |
2018-11-11 07:23:57 | k870611 | set | messages: + msg12180 |
2018-11-10 21:00:32 | stefan.richthofer | set | messages: + msg12179 |
2018-11-10 14:28:00 | k870611 | set | messages: + msg12178 |
2018-11-10 11:45:57 | stefan.richthofer | set | nosy: + stefan.richthofer; messages: + msg12177 |
2018-11-10 09:12:22 | k870611 | create | |
|