Title: Integer formatting emits two minus signs with -2^31
Type: behaviour Severity: normal
Components: Core Versions: Jython 2.7
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: jamesmudd, jeff.allen, stefan.richthofer
Priority: normal Keywords: easy

Created on 2018-04-29.15:05:58 by jeff.allen, last changed 2018-05-17.22:02:27 by jeff.allen.

File name Uploaded Description Edit Remove jamesmudd, 2018-05-15.22:23:03
msg11925 (view) Author: Jeff Allen (jeff.allen) Date: 2018-04-29.15:05:57
First observed by Helmut Niemann (thanks):

x = int(-(1<<31))
print ('%d' % x) # emits --2147483648

One also observe it as:

>>> int(-(1<<31)).__format__('d')

It's important that x be an int, not a long, which is in fact a clue to the work-around. Clearly this is a thinking error somewhere in int.__format__, so we should add a regression test in case it recurs.
msg11982 (view) Author: James Mudd (jamesmudd) Date: 2018-05-15.22:23:03
I have had a quick look at this and it's kind of an edge case were i'm not sure if we should fix it.

The issue occurs in: org.python.core.stringlib.IntegerFormatter.format_d(int)

Where the minus sign is handled separate to the value itself. The value is then negated and passed to Java Integer.toString(-value). This is suposed to result in a positive sting representation. However in the edge case of passing in Integer.MIN_VALUE it results in "-2147483648". This can be demonstrated just in java see attached example.

The core of the problem is Integer.MIN_VALUE=-2^31 whereas Integer.MAX_VALUE=2^31-1 ( so -Integer.MIN_VALUE doesn't fit in an int.
msg11986 (view) Author: Stefan Richthofer (stefan.richthofer) Date: 2018-05-16.19:17:06
We should definitely fix it. It is invalid behavior, can cause arbitrary issues, unforeseeable damage, etc. It must be fixed.

That said, an easy fix should be to replace
number = Integer.toString(-value);
number = Long.toString(-((long) value));

This impacts performance of non-edge-case calls. I don't think that format_d is a performance critical function, but - for what it's worth -
we can do it like

number = value == Integer.MIN_VALUE ? "2147483648" : Integer.toString(-value);
msg11987 (view) Author: James Mudd (jamesmudd) Date: 2018-05-16.22:08:23
I'm happy for it to be fixed also. Think I would prefer

number = Long.toString(-((long) value));

It's a bit cleaner and like Stefan I don't think performance is a issue here.
msg11988 (view) Author: James Mudd (jamesmudd) Date: 2018-05-17.16:46:28
Pull request to fix this one
msg11989 (view) Author: Stefan Richthofer (stefan.richthofer) Date: 2018-05-17.19:11:27
I thought about it and would slightly prefer the second version.
- it is still more efficient
- it somewhat clearly documents the nature of an edge case

Your PR looks good, especially thanks for writing a test!
If you don't mind I would just switch the one-liner to the second version on merging. So you won't have to do anything.
Is that okay?
msg11990 (view) Author: Stefan Richthofer (stefan.richthofer) Date: 2018-05-17.19:21:49
Oh, and it occurred to me that we should assert that not a corresponding format_x function of another primitive type is affected by the same kind of issue.
msg11993 (view) Author: James Mudd (jamesmudd) Date: 2018-05-17.20:17:16
Ok I don't mind the one liner either your right its faster. Style wise I would prefer an if e.g

            if(value == Integer.MIN_VALUE) {
                number = Long.toString(-((long)value));
            } else {
                number = Integer.toString(value);

Just had a quick look and I think this bug is present in other places too. Long should be ok as that will go though IntegerFormatter.format_d(BigInteger) but the bug is present in the other int formatting i.e to hex, octal, binary, char. These need tests and fixing to.

Actually no I think the nicest solution might be to call the BigInteger equivalent methods if value == Integer.MIN_VALUE. What do you think of this idea?

I will work on a pull request to fix all the cases I know of.
msg11994 (view) Author: James Mudd (jamesmudd) Date: 2018-05-17.21:26:34
I have updated the pull request

I have added more tests covering the other cases where this could fail. The fix I have implemented is to delegate all the int methods to the BigInteger version counterparts. This makes for quite a simplification as well as fixing this bug.

It is debatable if this will reduce performance, I haven't benchmarked it. I suspect it will be very close, the new way can make use of BigInteger optimisations.
msg11995 (view) Author: Jeff Allen (jeff.allen) Date: 2018-05-17.22:02:27
Thanks both for picking this up.

If we use a clause guarded by value == Integer.MIN_VALUE, in that special case, the rational thing is to return a string constant. Why do math? However, promoting everything to BigInteger is ok I think.
Date User Action Args
2018-05-17 22:02:27jeff.allensetmessages: + msg11995
2018-05-17 21:26:34jamesmuddsetmessages: + msg11994
2018-05-17 20:17:17jamesmuddsetmessages: + msg11993
2018-05-17 19:21:50stefan.richthofersetmessages: + msg11990
2018-05-17 19:11:27stefan.richthofersetmessages: + msg11989
2018-05-17 16:46:29jamesmuddsetmessages: + msg11988
2018-05-16 22:08:23jamesmuddsetmessages: + msg11987
2018-05-16 19:17:06stefan.richthofersetnosy: + stefan.richthofer
messages: + msg11986
2018-05-15 22:23:04jamesmuddsetfiles: +
nosy: + jamesmudd
messages: + msg11982
2018-04-29 15:05:58jeff.allencreate