Issue1772134

classification
Title: patch for [ 1758312 ] date and time objects fail to pickle
Type: Severity: normal
Components: None Versions:
Milestone:
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: Nosy List: cgroves, yole
Priority: normal Keywords: patch

Created on 2007-08-11.09:23:43 by yole, last changed 2007-08-12.22:02:05 by cgroves.

Files
File name Uploaded Description Edit Remove
pickle_datetime.patch yole, 2007-08-11.09:23:43
pickle_datetime1.patch yole, 2007-08-11.10:15:12
datetime_unpickle_basestring.patch cgroves, 2007-08-12.06:33:22
Messages
msg2782 (view) Author: Dmitry Jemerov (yole) Date: 2007-08-11.09:23:43
The datetime pickling previously failed for the following reason:
- datetime constructor checks if it was passed a string, and uses special code to deserialize its state from the string (which it generates in its __reduce__ method)
- pickle.py uses decode("string-escape") to unescape data loaded from pickle stream
- string.decode() used to always return PyUnicode, so the type check in the datetime constructor didn't pass.

The patch changes string.decode() to return PyObject, and changes the string-escape codec to return PyString rather than PyUnicode. In addition to datetime pickling, it enables some pickletools tests to pass.
msg2783 (view) Author: Dmitry Jemerov (yole) Date: 2007-08-11.10:15:15
Sorry, forgot to include test enabling into patch. Attaching updated patch.
File Added: pickle_datetime1.patch
msg2784 (view) Author: Charlie Groves (cgroves) Date: 2007-08-12.06:33:22
My understanding is that decode should return unicode objects since it's turning an encoded sequence of bytes into unicode characters.  However, I don't see why datetime should care if it gets a str or a unicode object for unpickling.  So I'd prefer to fix this with the patch I'm attaching.  

Instead of changing all of the decode methods to return a PyObject, I just changed all of the isinstance checks for str to use basestring instead.  str and unicode both descend from it.  Sound good?
File Added: datetime_unpickle_basestring.patch
msg2785 (view) Author: Dmitry Jemerov (yole) Date: 2007-08-12.07:03:38
Charlie,

No, I don't think that patching datetime is the right thing to do, for reasons both general and specific. The general reason is that every time we have to patch the Python stdlib, this represents an incompatibility between CPython and Jython, and as far as I understand, it's better to be as compatible as possible. 

The specific reason is that some codecs do indeed convert byte strings to byte strings and back. http://www.python.org/doc/2.3.5/lib/node130.html says:
"A number of codecs are specific to Python, so their codec names have no meaning outside Python. Some of them don't convert from Unicode strings to byte strings, but instead use the property of the Python codecs machinery that any bijective function with one argument can be considered as an encoding."

string-escape, which is at issue here, is one such codec.
msg2786 (view) Author: Charlie Groves (cgroves) Date: 2007-08-12.22:02:05
I definitely share your concern about modifying things from CPython's stdlib.  Everything in Lib copied over from CPythonLib has to be checked for improvements from its CPython version when jython is upgraded.  That ends up being a pretty significant burden.  In the specific case of datetime, it's not as big a deal.  datetime is implemented in C in CPython, so datetime.py is it's own, jython specific, beast.  We only have to be concerned with breaking compatibility with CPython, not carrying modifications forward.

However, your specific reason still applies and makes the reason for my patch moot.  I applied the patch in r3412.

History
Date User Action Args
2007-08-11 09:23:43yolecreate