Message6617

Author amak
Recipients amak, amyhlin, amylin
Date 2011-08-26.22:21:16
Message-id <1314397276.98.0.347367830491.issue1792@psf.upfronthosting.co.za>
Content
> Thanks for your quick answer.  We are about to upgrade jython version from 
> 2.1 to 2.5.2 in WebSphere version 8.5 and I am investigating any breaking 
> change (behavior change) when upgrade to v2.5.2.  I think that this is 
> just one of behaviors change.   We want to prevent this since customers 
> may complain the output type change and it may also break customer scripts 
> if they parse the output string .   It can be resolved in wsadmin code, 
> but do you know any other behavior/breaking change in jython 2.5.2 such as 
> built-in function or name space change? 

OK.

I think the string type change is the only one you need to worry about. But to be certain, you should also post a question to the jython-dev list about other potential issues. I'll post that question for you if you wish.

Here is some information about the string changes.

Since python/jython is a "duck typing" language, users who only carry out string operations on string values will not notice a difference, because all operations on PyString should work exactly the same on PyUnicode.

e.g.

>>> s = u"hello world"
>>> t = "hello world"
>>> s.encode("iso-8859-1")
'hello world'
>>> t.encode("iso-8859-1")
'hello world'

However, users who are carrying out type checking will have code breakage, e.g.

>>> s = u"hello world"
>>> isinstance(s, str)
False
>>> isinstance(s, unicode)
True

One way for them to code around this is to check against both types:

>>> isinstance(s, (str, unicode))
True

Also, they will have breakage if they do this

>>> import types
>>> type (s) is types.StringType
False
>>> type (s) is types.UnicodeType
True

And they should change their code to this

>>> type (s) in types.StringTypes
True

>>> isinstance(s, types.StringTypes)
True
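
If the same check appears in many places, it may be easier to keep it in one small helper. The following is only a sketch: `text_types` and `is_text` are names made up for this example, and the fallback branch is only there so that the snippet also runs on interpreters that have no `basestring` (python 3).

```python
# basestring is the common base class of 'str' and 'unicode' on
# python 2 and jython 2.5. It does not exist on python 3, where
# 'str' is the only text type, hence the fallback.
try:
    text_types = basestring
except NameError:
    text_types = str


def is_text(value):
    """Return True for any string value, whether 'str' or 'unicode'."""
    return isinstance(value, text_types)
```

A script that previously tested `isinstance(x, str)` could call `is_text(x)` instead and would then accept both string types.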

The fundamental problem is cpython compatibility. Jython has *always* done unicode strings, because java strings are unicode. But from a typing POV, we had to be compatible with cpython. That's the only reason why we have separate 'str' and 'unicode' types in jython.

Since I see you work for IBM, you have two options:

A: Prevent code breakage by converting everything to 'str' before you return it, e.g.

>>> t = str(s)
>>> t
'hello world'
>>> isinstance(t, str)
True
>>> isinstance(t, unicode)
False

The 'str' type has all the same capabilities as the 'unicode' type

>>> t.encode('iso-8859-1')
'hello world'
>>> dir (s)
['__add__', '__class__', '__cmp__', '__contains__', '__delattr__', '__doc__', '__eq__',
 '__getattribute__', '__getitem__', '__getnewargs__', '__getslice__', '__hash__', '__init__',
 '__len__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
 '__rmul__', '__setattr__', '__str__', 'capitalize', 'center', 'count', 'decode', 'encode',
 'endswith', 'expandtabs', 'find', 'index', 'isalnum', 'isalpha', 'isdecimal', 'isdigit',
 'islower', 'isnumeric', 'isspace', 'istitle', 'isunicode', 'isupper', 'join', 'ljust', 'lower',
 'lstrip', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip',
 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
>>> dir (t)
['__add__', '__class__', '__cmp__', '__contains__', '__delattr__', '__doc__', '__eq__', '__ge__',
 '__getattribute__', '__getitem__', '__getnewargs__', '__getslice__', '__gt__', '__hash__',
 '__init__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__',
 '__reduce__', '__reduce_ex__', '__repr__', '__rmul__', '__setattr__', '__str__', 'capitalize',
 'center', 'count', 'decode', 'encode', 'endswith', 'expandtabs', 'find', 'index', 'isalnum',
 'isalpha', 'isdecimal', 'isdigit', 'islower', 'isnumeric', 'isspace', 'istitle', 'isunicode',
 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'partition', 'replace', 'rfind', 'rindex',
 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip',
 'swapcase', 'title', 'translate', 'upper', 'zfill']

B: Force your users to update their code to reflect the new 'unicode' type name. This is a simple code change, and will be easy for them to carry out, provided they have adequate unit-testing ;-) This is the option you should select if you want their code to also run correctly under modern cpython, ironpython or pypy. (websphere.Net anybody?)
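
For option A, the conversion could live in one small helper at the point where values are handed back to user scripts. This is a sketch under assumptions, not wsadmin code: `to_plain_str` and `unicode_type` are hypothetical names, and "utf-8" is only a placeholder for whatever encoding your output really needs. On cpython 2, plain str() on a unicode value uses the ASCII codec and fails with UnicodeEncodeError on non-ASCII characters, which is why the sketch encodes explicitly.

```python
# 'unicode' exists on python 2 and jython 2.5 only. On python 3
# there is nothing to convert, so the helper becomes a no-op.
try:
    unicode_type = unicode
except NameError:
    unicode_type = None


def to_plain_str(value, encoding="utf-8"):
    """Return 'unicode' input as an encoded 'str'; pass anything else through."""
    if unicode_type is not None and isinstance(value, unicode_type):
        # Explicit encode, so non-ASCII characters do not hit the
        # default ASCII codec that a bare str(value) would use.
        return value.encode(encoding)
    return value
```

Scripts that check `isinstance(result, str)` on the helper's result would then keep working as before.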

> I also like to confirm about the jython cache.  I have chatted with Frank 
> Wierzbicki a while ago and he told me that jython 2.2 or higher version 
> does not require to build the cachedir.   We have been complained the 
> wsadmin startup performance and jython *sys-package-mgr* messages shown in 
> console when first use of jython in wsadmin because it takes time for 
> jython to create all packages/jars to cachedir.   Can I simply set 
> "python.cachedir.skip" property in wsadmin code to get rid of building the 
> cache?   Will it cause any problem without building cache during 
> initialization?  For example, if I like to import some java or Websphere 
> package/class in jython. 

The package cache is literally that: a cache. 

There is a necessary process of building meta-data structures for all java packages that will be used with jython. This information *must* be available for jython to be able to use the packages.

Because this can be a time-consuming process, taking up to 10 or 20 seconds depending on the number of packages in the CLASSPATH, the information is cached to speed up future jython invocations.

If the caching is disabled, it will just mean slower invocations *every* time, because all of the packages will have to be scanned on *every* startup.

But it will still operate correctly: the only thing that will suffer is startup time.

If the caching is enabled, the scanning takes place once: every future invocation will be quicker because of the caching.
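
The trade-off can be pictured with a toy version of "scan once, reuse afterwards". This is plain python written purely as an illustration; it is not how jython implements its package manager, and `scan_packages`, `load_package_index`, the `skip_cache` flag and the JSON cache file are all invented for the example.

```python
import json
import os
import zipfile


def scan_packages(jar_paths):
    """Slow path: open every jar and collect the package names it contains."""
    index = {}
    for jar_path in jar_paths:
        packages = set()
        with zipfile.ZipFile(jar_path) as jar:
            for name in jar.namelist():
                # com/example/Foo.class belongs to package com.example
                if name.endswith(".class") and "/" in name:
                    packages.add(name.rsplit("/", 1)[0].replace("/", "."))
        index[jar_path] = sorted(packages)
    return index


def load_package_index(jar_paths, cache_file, skip_cache=False):
    """Return the package index, scanning only when no cached copy is used."""
    if not skip_cache and os.path.exists(cache_file):
        # Fast path. A real cache would also check that the jars are unchanged.
        with open(cache_file) as f:
            return json.load(f)
    index = scan_packages(jar_paths)  # paid on every start when skipping
    if not skip_cache:
        with open(cache_file, "w") as f:
            json.dump(index, f)
    return index
```

With `skip_cache=True` the result is the same, but the full scan runs on every call and nothing is written to disk.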

We're straying off your original bug report now: if you have any further questions, please post them to jython-users or jython-dev.

Alan.