Issue1612
Created on 2010-05-18.18:33:04 by mcieslik, last changed 2015-01-14.00:48:23 by santa4nt. 
   | msg5769 (view) | Author: Marcin (mcieslik) | Date: 2010-05-18.18:33:03 |  |  
   | Creating instances of array.array takes ~300x longer in Jython 2.5.1 than in Python 2.6 or Python 3.1,
e.g. with the following:
from array import array
array('b', large_string)
$ python2.6 profile_array.py 
0.0104711055756
$ python3.1 profile_array.py 
0.00699281692505
$ jython profile_array.py 
3.00600004196
$ jython --version
Jython 2.5.1 |  
   | msg5770 (view) | Author:  (doublep) | Date: 2010-05-19.11:21:16 |  |  
   | Did you measure total program time? |  
   | msg5771 (view) | Author: Marcin (mcieslik) | Date: 2010-05-19.12:09:26 |  |  
   | The 3 s reported for jython profile_array.py does **NOT** include the JVM start-up time; it is the wall-clock time of the loop alone.
This is what is in the attached script:
start = time()
for i in range(10000):
    array('b', large_string)
stop = time() |  
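A self-contained version of the timing loop above could look like the following sketch. The attached script is not reproduced in this thread, so the payload size, the helper name `time_array_creation`, and the final print are assumptions made for illustration only.

```python
from array import array
from time import time


def time_array_creation(data, repeats=10000):
    """Return wall-clock seconds spent building `repeats` arrays from `data`."""
    start = time()
    for _ in range(repeats):
        array('b', data)
    return time() - start


# Assumed payload: 10,000 ASCII bytes. encode() yields bytes on Python 3
# and a plain str on Python 2; array('b', ...) accepts either.
large_string = ("x" * 10000).encode("ascii")

elapsed = time_array_creation(large_string)
print(elapsed)
```

Because the timer starts inside the already-running interpreter, the figure excludes interpreter or JVM start-up, which matches the measurement described above.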
   | msg6186 (view) | Author: Jim Baker (zyasoft) | Date: 2010-10-17.17:21:48 |  |  
   | The problem here is that we copy the string. In 2.6 this can be avoided by letting a string back an array directly. This can (and should) be part of general memoryview support. |  
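As a rough illustration of what "backing" without a copy means, the sketch below uses CPython's built-in memoryview over a bytearray. It only demonstrates buffer sharing in general; it does not show that array.array shares a string's storage in any implementation, and all variable names are made up for the example.

```python
source = bytearray(b"abc")
view = memoryview(source)    # shares source's buffer, no copy is made
snapshot = bytes(source)     # independent copy of the current contents

source[0] = ord("z")         # mutate the original buffer in place

# The view reflects the mutation; the earlier copy does not.
view_sees_change = view.tobytes() == bytes(source)
copy_sees_change = snapshot == bytes(source)
```

A shared buffer avoids the per-construction copy, but it also means later writes to the source become visible through the view, which is why copy-on-write comes up later in the thread.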
   | msg6187 (view) | Author: Jim Baker (zyasoft) | Date: 2010-10-17.17:24:14 |  |  
   | better title - "Jython ____" is just noise here |  
   | msg9375 (view) | Author: Jim Baker (zyasoft) | Date: 2015-01-12.16:10:34 |  |  
   | The reported performance problem is still seen in 2.7.0 beta 4.
In reviewing CPython 2.7's arraymodule.c, I don't see any support for copy-on-write semantics that would enable this speedup. Instead it's just a straightforward memcpy in the frombytes function. |  
   | msg9376 (view) | Author: Jim Baker (zyasoft) | Date: 2015-01-12.17:35:18 |  |  
   | So the additional overhead here has a simple root cause: unlike CPython, Jython uses the same method, PyArray.fromStream, to read from an input stream into a given array. Although the read should be reasonably fast/inlineable (though with more overhead than simply looping through the string), the write into the array is very slow because it goes through java.lang.reflect.Array, in this case java.lang.reflect.Array#setByte.
Some simple specialization would speed things up considerably, much as was done in CPython.
Changing misleading title! (Copy-on-write would still be interesting, and perhaps more feasible on Jython.) |  
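The gap between per-element writes and a single bulk initialization can be felt even in plain CPython. The sketch below is only an analogy written in Python: it is not Jython's PyArray code and does not involve java.lang.reflect.Array. The function names `fill_per_element` and `fill_bulk` are hypothetical.

```python
from array import array


def fill_per_element(data):
    """Build the array one element at a time, like a per-element setter."""
    result = array('b')
    for value in bytearray(data):   # iterates as ints in the range 0..255
        # 'b' holds signed bytes, so map 128..255 onto -128..-1.
        result.append(value - 256 if value > 127 else value)
    return result


def fill_bulk(data):
    """Hand the whole buffer to the constructor in a single call."""
    return array('b', data)


payload = bytes(bytearray(range(256)))
same_contents = fill_per_element(payload) == fill_bulk(payload)
```

Both functions produce the same array; timing them with the standard timeit module on a large payload shows how much the per-element path costs compared with one bulk call.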
   | msg9381 (view) | Author: Santoso Wijaya (santa4nt) | Date: 2015-01-14.00:48:21 |  |  
   | @zyasoft Something like the patch I have in mind? I get a better profile number with this naive "bulk" put() implementation, without any copy-on-write optimization, but the improvement is modest at best. |  |
 
| Date | User | Action | Args |
| 2015-01-14 00:48:23 | santa4nt | set | files: + issue1612.patch; keywords: + patch; messages: + msg9381 |
| 2015-01-13 19:15:58 | santa4nt | set | nosy: + santa4nt; type: behaviour |
| 2015-01-12 17:35:18 | zyasoft | set | messages: + msg9376; title: array.array copies strings instead of using them to back the new array -> array.array should use specialized bulk operations to initialize from an input source, such as a string |
| 2015-01-12 16:10:35 | zyasoft | set | messages: + msg9375 |
| 2015-01-12 07:36:56 | zyasoft | set | resolution: remind |
| 2013-02-26 17:33:07 | fwierzbicki | set | nosy: + fwierzbicki |
| 2013-02-25 19:04:22 | fwierzbicki | set | versions: + Jython 2.7, - 2.5.1 |
| 2010-10-17 17:24:15 | zyasoft | set | messages: + msg6187; title: Jython copies strings instead of using them to back an array -> array.array copies strings instead of using them to back the new array |
| 2010-10-17 17:21:49 | zyasoft | set | priority: low; nosy: + zyasoft; messages: + msg6186; title: Jython ~300x slower on array.array instance creation -> Jython copies strings instead of using them to back an array |
| 2010-05-23 00:10:01 | akong | set | nosy: + akong |
| 2010-05-19 12:09:26 | mcieslik | set | messages: + msg5771 |
| 2010-05-19 11:21:17 | doublep | set | nosy: + doublep; messages: + msg5770 |
| 2010-05-18 18:33:04 | mcieslik | create | |