Title: datetime.strptime('%b') behaviour inconsistent with CPython on Windows JDK 9+
Type: behaviour Severity: normal
Components: Library Versions: Jython 2.7
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: adamburke
Priority: Keywords:

Created on 2019-01-08.07:46:47 by adamburke, last changed 2019-03-15.10:54:54 by adamburke.

msg12290 (view) Author: Adam Burke (adamburke) Date: 2019-01-08.07:46:46
There seem to be a few datetime.strptime() / strftime() issues already, but this looks like a new variation.

On JDK 9+ on Windows 10, the short month names now include a full stop, even for English locales. This is incompatible with CPython and with earlier versions of Jython, and it breaks regression tests, mainly those inherited from CPython itself.

Regrtest sample output
> dist\bin\jython.exe -m test.regrtest -v test_strptime

ERROR: test_feb29_on_leap_year_without_year (test.test_strptime.StrptimeTests)
Traceback (most recent call last):
  File "C:\Users\Adam\jython\jython3\dist\Lib\test\", line 382, in test_feb29_on_leap_year_without_year
    time.strptime("Feb 29", "%b %d")
  File "C:\Users\Adam\jython\jython3\dist\Lib\", line 467, in _strptime_time
    return _strptime(data_string, format)[0]
  File "C:\Users\Adam\jython\jython3\dist\Lib\", line 324, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data u'Feb 29' does not match format u'%b %d'

ERROR: test_mar1_comes_after_feb29_even_when_omitting_the_year (test.test_strptime.StrptimeTests)
FAIL: test_pattern (test.test_strptime.TimeRETests)

Demonstration script:
import os
import time
from datetime import date

print (os.uname())
print (os.environ['JAVA_HOME'])

print ( date(2002, 2, 4).strftime('%b %d')  )

print ( time.strptime("Jan. 29", "%b %d") ) # No ValueError
print ( time.strptime("Jan 29", "%b %d") ) # ValueError

('Windows', '...', '10', '...', 'AMD64')
C:\Program Files\Java\jdk-11.0.1
Feb. 04
time.struct_time(tm_year=1900, tm_mon=1, tm_mday=29, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=0, tm_yday=29, tm_isdst=-1)
Traceback (most recent call last):
  File "", line 13, in <module>
    print ( time.strptime("Jan 29", "%b %d") ) # ValueError
  File "C:\Users\Adam\jython\jython3\dist\Lib\", line 467, in _strptime_time
    return _strptime(data_string, format)[0]
  File "C:\Users\Adam\jython\jython3\dist\Lib\", line 324, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data u'Jan 29' does not match format u'%b %d'
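
Until the runtime behaviour is settled, calling code that only needs English "Mon DD" input can avoid %b altogether. The following is a minimal sketch of that idea, not a proposed fix for Jython: parse_month_day and its fixed month table are hypothetical names introduced here, and the parser deliberately accepts both "Jan 29" and "Jan. 29" without consulting any locale data.

```python
import re

# Fixed English abbreviations, independent of the runtime's locale data.
# (Hypothetical table for this sketch; not taken from Jython or CPython.)
_MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Three letters, an optional full stop, whitespace, then a 1-2 digit day.
_MONTH_DAY = re.compile(r"^\s*([A-Za-z]{3})\.?\s+(\d{1,2})\s*$")


def parse_month_day(text):
    """Return (month, day) for strings such as 'Jan 29' or 'Jan. 29'."""
    match = _MONTH_DAY.match(text)
    if match is None:
        raise ValueError("time data %r does not match 'Mon DD'" % (text,))
    name = match.group(1).title()
    if name not in _MONTHS:
        raise ValueError("unknown month abbreviation %r" % (name,))
    # The day is not range-checked against the month in this sketch.
    return _MONTHS.index(name) + 1, int(match.group(2))


print(parse_month_day("Jan 29"))
print(parse_month_day("Jan. 29"))
```

This only covers three-letter English abbreviations and does no calendar validation, so it is a stopgap for scripts rather than a substitute for strptime().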
msg12292 (view) Author: Adam Burke (adamburke) Date: 2019-01-08.07:48:54
Tested on JDK 9 and 11. Causes some related failures across a few regression tests.
msg12294 (view) Author: Adam Burke (adamburke) Date: 2019-01-08.07:51:10
msg12358 (view) Author: Adam Burke (adamburke) Date: 2019-03-15.10:54:53
For CPython, the locale month formatting behaviour is defined at the lowest possible level, in timemodule.c, which calls into either wcsftime() or the relevant OS-specific library.

In Jython there is a reimplementation in org/python/modules/time/, which calls into the Java standard libraries, understandably enough.

The definition of %b is "Month as locale’s abbreviated name".
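
To see what a given runtime actually treats as the abbreviated names, a small diagnostic can list what strftime('%b') returns for each month. This is a sketch only: month_abbreviations and abbreviations_with_trailing_dot are hypothetical helper names. As far as I can tell, the pure-Python _strptime module builds its month table from calendar.month_abbr, which is itself derived from strftime('%b'), so whatever this prints is also what strptime('%b') will expect as input.

```python
from datetime import date


def month_abbreviations():
    """Return the twelve '%b' names as the current runtime formats them."""
    return [date(2001, month, 1).strftime("%b") for month in range(1, 13)]


def abbreviations_with_trailing_dot(names):
    """Return the names that end in a full stop."""
    return [name for name in names if name.endswith(".")]


names = month_abbreviations()
print(names)
print(abbreviations_with_trailing_dot(names))
```

A non-empty second list is the situation reported here, where the runtime formats February as "Feb." and strptime() then rejects "Feb 29".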

One practical, if slightly work-avoiding, position would be to say that Jython locales *are* Java locales. This would be more defensible if Jython's locale support weren't weak (as generally admitted), e.g. if setlocale() weren't broken (also noted by Wang Yaqiang):

Traceback (most recent call last):
  File "", line 10, in <module>
    locale.setlocale(locale.LC_ALL, 'de_DE')
  File "C:\Users\Adam\jython\jython3\dist\Lib\", line 552, in setlocale
    return _setlocale(category, locale)
  File "C:\Users\Adam\jython\jython3\dist\Lib\", line 88, in setlocale
    raise Error, '_locale emulation only supports "C" locale'
ValueError: _locale emulation only supports "C" locale

OTOH, you have to fix one thing at a time, and maintaining a Java-C locale mapping sounds like a thankless and moving-target task.

As a side note for anyone fishing in this area, Jython has a fork of the datetime module, which doesn't do much except add a few __tojava__() methods. Reorganizing that into an include / override structure would be good (but would not actually address this issue at all).
Date                 User       Action  Args
2019-03-15 10:54:54  adamburke  set     messages: + msg12358
2019-01-08 07:51:10  adamburke  set     messages: + msg12294
2019-01-08 07:48:54  adamburke  set     messages: + msg12292
2019-01-08 07:46:47  adamburke  create