Issue1571
Created on 2010-03-15.01:22:49 by pr3d4t0r, last changed 2010-08-15.17:14:01 by zyasoft.
msg5565 |
Author: Eugene Ciurana (pr3d4t0r) |
Date: 2010-03-15.01:22:48 |
|
There is a huge performance hit when issuing a million or more calls to int(string), because it uses the BigInteger parser instead of Long's or Integer's. The result is 2x to 10x slower than the same code running on the equivalent version of CPython.
|
msg5567 |
Author: Jim Baker (zyasoft) |
Date: 2010-03-15.19:45:25 |
|
The relevant code is PyInteger#int_new, which calls PyString#atoi; if that throws an OverflowException, it then tries to build a PyLong instead.
We can simply catch the NumberFormatException from java.lang.Integer#parseInt instead. This should reduce overhead accordingly.
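A minimal sketch of that idea in plain Java, not the actual Jython patch: try the cheap Integer.parseInt first and fall back to BigInteger only when it throws. The class name IntParseSketch and the method parse are hypothetical; the real logic lives in PyInteger#int_new and PyString#atoi and also has to handle bases, whitespace, and Python-specific syntax, all of which this ignores.

```java
import java.math.BigInteger;

// Hypothetical illustration only; not Jython's PyString/PyInteger code.
final class IntParseSketch {

    // Returns an Integer when the text fits in 32 bits, otherwise a BigInteger.
    // Throws NumberFormatException if the text is not a decimal integer at all.
    static Number parse(String text) {
        try {
            // Fast path: no BigInteger allocation for the common case.
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            // parseInt throws the same exception for overflow and for
            // malformed input, so let BigInteger decide which it was.
            return new BigInteger(text);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("12345").getClass().getSimpleName());
        System.out.println(parse("123456789012345678901234567890").getClass().getSimpleName());
    }
}
```

The trade-off in this sketch is that the slow path pays for a thrown exception, which is presumably acceptable only if values that overflow an int are rare in practice.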
|
msg5584 |
Author: Jim Baker (zyasoft) |
Date: 2010-03-21.22:08:30 |
|
Some test results for Jython 2.5.1 show that we have to let HotSpot warm up first, after which it eventually outperforms CPython 2.6.4:
jimbaker:~ jbaker$ released-jython2.5.1/bin/jython -m timeit -n 1000 "int('12345')"
1000 loops, best of 3: 12 usec per loop
jimbaker:~ jbaker$ released-jython2.5.1/bin/jython -m timeit -n 10000 "int('12345')"
10000 loops, best of 3: 3.3 usec per loop
jimbaker:~ jbaker$ released-jython2.5.1/bin/jython -m timeit -n 100000 "int('12345')"
100000 loops, best of 3: 0.53 usec per loop
jimbaker:~ jbaker$ released-jython2.5.1/bin/jython -m timeit -n 1000000 "int('12345')"
1000000 loops, best of 3: 0.416 usec per loop
jimbaker:~ jbaker$ released-jython2.5.1/bin/jython -m timeit -n 10000000 "int('12345')"
10000000 loops, best of 3: 0.402 usec per loop
vs CPython 2.6.4:
jimbaker:~ jbaker$ python -m timeit -n 10000 "int('12345')"
10000 loops, best of 3: 0.731 usec per loop
I doubt there's any caching behavior here on the string itself. A naive variant I wrote is not nearly as fast for a large number of iterations, almost certainly because it doesn't inline as well:
jimbaker:jython jbaker$ dist/bin/jython -m timeit -n 10000 "int('12345')"
10000 loops, best of 3: 8.7 usec per loop
jimbaker:jython jbaker$ dist/bin/jython -m timeit -n 100000 "int('12345')"
100000 loops, best of 3: 8.69 usec per loop
I believe this is because it isn't inlined as aggressively. We can do better by removing all conditional logic from the inner loop, such as the logic around base calculations, but that requires more work for this specialization.
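The warm-up effect in the timings above can be illustrated outside Jython with a rough Java sketch. It times plain Integer.parseInt on the JVM, not Jython's int(), so the absolute numbers are not comparable to the timeit output; it only shows how the per-call cost is expected to drop as the loop count grows and the JIT compiles the hot path. The class WarmupSketch and the method nanosPerCall are hypothetical names, and a hand-rolled loop like this is far less reliable than a real harness such as JMH.

```java
// Hypothetical micro-benchmark sketch; a harness such as JMH is more reliable.
final class WarmupSketch {

    // Parses the same string `loops` times and returns the average
    // number of nanoseconds per call.
    static double nanosPerCall(String text, int loops) {
        long sink = 0; // accumulate results so the JIT cannot drop the loop as dead code
        long start = System.nanoTime();
        for (int i = 0; i < loops; i++) {
            sink += Integer.parseInt(text);
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 42) { // not expected to be true; only keeps `sink` observable
            System.out.println(sink);
        }
        return (double) elapsed / loops;
    }

    public static void main(String[] args) {
        // Same loop counts as the timeit runs: 1,000 up to 10,000,000.
        for (int loops = 1_000; loops <= 10_000_000; loops *= 10) {
            System.out.printf("%d loops: %.1f ns per call%n",
                    loops, nanosPerCall("12345", loops));
        }
    }
}
```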
|
msg5585 |
Author: Philip Jenvey (pjenvey) |
Date: 2010-03-21.22:31:29 |
|
I think part of the call path down to atoi/atol is unnecessary now too. I'm not sure we even need separate atoi/atol methods anymore: I don't think we ever need a Java primitive int or long from a string object; what we always ask for is a Python number object from a string. That is especially true with int and long converging.
In that case we could combine the two functions and remove the OverflowException.
Not to mention that PyString implementing __int__/__long__/__float__ is annoying. We still have the occasional case where you can pass a '1' in as a numeric argument and it's treated as valid.
|
msg5971 |
Author: Jim Baker (zyasoft) |
Date: 2010-08-15.17:14:01 |
|
According to #jython, pjenvey fixed this by shortening the call path, so closing for now.
|
|
Date | User | Action | Args
2010-08-15 17:14:01 | zyasoft | set | status: open -> closed; messages: + msg5971
2010-03-21 22:31:29 | pjenvey | set | nosy: + pjenvey; messages: + msg5585
2010-03-21 22:08:31 | zyasoft | set | messages: + msg5584
2010-03-15 19:45:26 | zyasoft | set | priority: normal; assignee: zyasoft; resolution: accepted; messages: + msg5567; nosy: + zyasoft
2010-03-15 01:22:50 | pr3d4t0r | create |