Issue1612
Created on 2010-05-18.18:33:04 by mcieslik, last changed 2015-01-14.00:48:23 by santa4nt.
msg5769 (view) |
Author: Marcin (mcieslik) |
Date: 2010-05-18.18:33:03 |
|
Creating instances of array.array takes ~300x longer in Jython 2.5.1 than in Python 2.6 or Python 3.1,
e.g. with the following:
from array import array
array('b', large_string)
$ python2.6 profile_array.py
0.0104711055756
$ python3.1 profile_array.py
0.00699281692505
$ jython profile_array.py
3.00600004196
$ jython --version
Jython 2.5.1
|
msg5770 (view) |
Author: (doublep) |
Date: 2010-05-19.11:21:16 |
|
Did you measure total program time?
|
msg5771 (view) |
Author: Marcin (mcieslik) |
Date: 2010-05-19.12:09:26 |
|
The 3 s reported for jython profile_array.py do **NOT** include the JVM start-up time; it is the 'wall-clock' time of the loop only.
This is what is in the attached script:
start = time()
for i in range(10000):
    array('b', large_string)
stop = time()
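A self-contained variant of that loop might look like the sketch below. The attached profile_array.py is not reproduced here: the payload size, the helper name time_array_creation, and the use of timeit.default_timer are assumptions made for illustration. It uses a bytes literal, so it targets Python 2.6+ and 3.x.

```python
from array import array
from timeit import default_timer


def time_array_creation(payload, repeats=10000):
    """Return the wall-clock seconds spent building `repeats` arrays.

    Only the loop is timed, so interpreter (or JVM) start-up is excluded.
    """
    start = default_timer()
    for _ in range(repeats):
        array('b', payload)
    return default_timer() - start


if __name__ == "__main__":
    # Assumed payload: the size of the original large_string is not given.
    large_payload = b"x" * 10000
    print(time_array_creation(large_payload))
```

Running the same file under each interpreter gives numbers that can be compared directly, since start-up cost is outside the timed region.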
|
msg6186 (view) |
Author: Jim Baker (zyasoft) |
Date: 2010-10-17.17:21:48 |
|
The problem here is that we copy the string. In 2.6 this can be avoided by supporting a string to back an array. This can (and should) be part of general support for memoryview.
|
msg6187 (view) |
Author: Jim Baker (zyasoft) |
Date: 2010-10-17.17:24:14 |
|
better title - "Jython ____" is just noise here
|
msg9375 (view) |
Author: Jim Baker (zyasoft) |
Date: 2015-01-12.16:10:34 |
|
The reported performance problem is still seen in 2.7.0 beta 4.
In reviewing CPython 2.7's arraymodule.c, I don't see any support for copy-on-write semantics to do this speedup. Instead it's just a straightforward memcpy in the frombytes function.
|
msg9376 (view) |
Author: Jim Baker (zyasoft) |
Date: 2015-01-12.17:35:18 |
|
So the additional overhead here has a simple root cause: unlike CPython, Jython uses the same method, PyArray.fromStream, to read from an input stream into a given array. The read should be reasonably fast/inlineable (though with more overhead than simply looping through the string), but the write into the array is very slow because it goes through java.lang.reflect.Array, in this case java.lang.reflect.Array#setByte.
Some simple specialization would speed things up considerably, much as was done with CPython.
Changing misleading title! (Copy-on-write would still be interesting, and perhaps more feasible on Jython.)
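The difference between writing one element at a time and handing the whole input to a specialized bulk path can be illustrated in plain Python. This is only an analogy, not Jython's Java internals: fill_elementwise and fill_bulk are hypothetical names, and the element-wise loop merely stands in for a generic per-element setter such as java.lang.reflect.Array#setByte.

```python
from array import array


def fill_elementwise(payload):
    """Build a signed-byte array one element at a time (the slow, generic shape)."""
    out = array('b')
    for byte in bytearray(payload):
        # bytearray yields 0..255; typecode 'b' needs -128..127.
        out.append(byte - 256 if byte > 127 else byte)
    return out


def fill_bulk(payload):
    """Build the same array in one call, letting array.array copy the bytes in bulk."""
    return array('b', payload)
```

Both functions produce the same array; only the number of per-element calls differs, which is where a specialized implementation would be expected to save time.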
|
msg9381 (view) |
Author: Santoso Wijaya (santa4nt) |
Date: 2015-01-14.00:48:21 |
|
@zyasoft Something like the patch I have in mind? I can get a better profile number with this naive "bulk" put() implementation, without the copy-on-write optimization, but the improvement is modest at best.
|
|
Date | User | Action | Args
2015-01-14 00:48:23 | santa4nt | set | files: + issue1612.patch; keywords: + patch; messages: + msg9381
2015-01-13 19:15:58 | santa4nt | set | nosy: + santa4nt; type: behaviour
2015-01-12 17:35:18 | zyasoft | set | messages: + msg9376; title: array.array copies strings instead of using them to back the new array -> array.array should use specialized bulk operations to initialize from an input source, such as a string
2015-01-12 16:10:35 | zyasoft | set | messages: + msg9375
2015-01-12 07:36:56 | zyasoft | set | resolution: remind
2013-02-26 17:33:07 | fwierzbicki | set | nosy: + fwierzbicki
2013-02-25 19:04:22 | fwierzbicki | set | versions: + Jython 2.7, - 2.5.1
2010-10-17 17:24:15 | zyasoft | set | messages: + msg6187; title: Jython copies strings instead of using them to back an array -> array.array copies strings instead of using them to back the new array
2010-10-17 17:21:49 | zyasoft | set | priority: low; nosy: + zyasoft; messages: + msg6186; title: Jython ~300x slower on array.array instance creation -> Jython copies strings instead of using them to back an array
2010-05-23 00:10:01 | akong | set | nosy: + akong
2010-05-19 12:09:26 | mcieslik | set | messages: + msg5771
2010-05-19 11:21:17 | doublep | set | nosy: + doublep; messages: + msg5770
2010-05-18 18:33:04 | mcieslik | create |