Message3869

Author noelob
Recipients noelob
Date 2008-12-03.15:54:38
Message-id <1228319679.37.0.0593723220729.issue1189@psf.upfronthosting.co.za>
In-reply-to
Content
When using Python's md5 module to create a hex digest of a string
containing non-Latin characters, an incorrect hash is returned. The
following code shows the difference:

# encoding: utf-8

from java.math import BigInteger
from java.security import MessageDigest
from java.security import NoSuchAlgorithmException
from java.lang import String
import md5

a = u"Gráin amháiñ ©ðƒ©óíßðƒóíıßðƒ‚íó©ı"
#a = "A lovely string to encrypt!"
b = String(a)

digest = MessageDigest.getInstance("MD5")
bytes = b.getBytes()
digest.reset()
java_hash = BigInteger(1, digest.digest(bytes)).toString(16)
print "Hash using Java:\t", java_hash

print "Hash in Python:\t\t", md5.new(a).hexdigest()
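
One possible explanation, which this report does not confirm, is that the two sides hash different byte sequences. MD5 operates on bytes, not characters: String.getBytes() with no argument uses the JVM's default charset, while the Python call is handed a unicode object with no explicit encoding. The sketch below is written for a current Python with hashlib rather than the Jython setup above. The helper name md5_hex and the shortened sample string are illustrative assumptions, not part of the original script.

```python
# -*- coding: utf-8 -*-
import hashlib


def md5_hex(text, encoding):
    """Return the MD5 hex digest of `text` after encoding it to bytes.

    Hypothetical helper for illustration; it is not part of the report.
    """
    return hashlib.md5(text.encode(encoding)).hexdigest()


# Shortened stand-in for the report's string ("Grain amhain" with accents),
# written with escapes so the source file's own encoding does not matter.
sample = u"Gr\u00e1in amh\u00e1i\u00f1"

utf8_hash = md5_hex(sample, "utf-8")
latin1_hash = md5_hex(sample, "latin-1")

print("UTF-8 bytes:   " + utf8_hash)
print("Latin-1 bytes: " + latin1_hash)
print("Digests match: %s" % (utf8_hash == latin1_hash))

# An ASCII-only string encodes to the same bytes under both encodings,
# so its digest is the same either way.
ascii_sample = u"A lovely string to encrypt!"
print("ASCII match:   %s" % (md5_hex(ascii_sample, "utf-8") ==
                             md5_hex(ascii_sample, "latin-1")))
```

If this is the cause, choosing one encoding explicitly on both sides (for example b.getBytes("UTF-8") in Java and a.encode("utf-8") in Python) should make the two digests comparable. Separately, BigInteger.toString(16) does not keep leading zeros, so the Java value can be shorter than 32 hex digits even when the underlying digest is identical.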
History
Date User Action Args
2008-12-03 15:54:39 noelob set recipients: + noelob
2008-12-03 15:54:39 noelob set messageid: <1228319679.37.0.0593723220729.issue1189@psf.upfronthosting.co.za>
2008-12-03 15:54:39 noelob link issue1189 messages
2008-12-03 15:54:38 noelob create