Title: urllib2.urlopen() does not close connection on 404 response.
Type: behaviour Severity: major
Components: Library Versions: 2.5.1
Status: closed Resolution: invalid
Dependencies: Superseder:
Assigned To: amak Nosy List: amak, geraldth
Priority: Keywords:

Created on 2010-01-19.13:49:52 by geraldth, last changed 2010-04-16.16:38:13 by amak.

msg5443 Author: Gerald Thaler (geraldth) Date: 2010-01-19.13:49:50
urllib2.urlopen(url) raises an exception for URLs that return an HTTP 4xx (or 5xx) status but fails to close the underlying connection. Because the call does not return normally, the caller has no way to close the connection.

This is a major problem in web spider applications, because the system will eventually run out of file descriptors.
msg5718 Author: Alan Kennedy (amak) Date: 2010-04-16.16:38:13
The HTTPError exception raised by urllib2 in this circumstance is also a full response object. You should be able to use this object to close the connection. For example:

except HTTPError, he:
    print "Error retrieving that URL: %s" % str(he)
    he.close()  # the exception is also the response object, so close it
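
The fragment above omits the surrounding try block, so here is a fuller sketch of the same idea. It assumes, as the reply suggests, that calling close() on the HTTPError releases the underlying connection. The names fetch_status and opener are hypothetical and exist only for this illustration. The "except ... as" spelling needs Python 2.6 or later; on 2.5 the equivalent is "except HTTPError, he".

```python
try:  # Python 2.6/2.7 module layout
    from urllib2 import HTTPError, urlopen
except ImportError:  # Python 3 module layout
    from urllib.error import HTTPError
    from urllib.request import urlopen


def fetch_status(url, opener=urlopen):
    """Return the HTTP status code for url, closing the response either way.

    fetch_status and opener are hypothetical names for this sketch;
    opener defaults to the standard urlopen.
    """
    try:
        response = opener(url)
    except HTTPError as he:
        # HTTPError is also a response object, so it can be closed
        # like a normal response before the status is reported.
        he.close()
        return he.code
    try:
        return response.getcode()
    finally:
        # Close the successful response as well.
        response.close()
```

A spider could call fetch_status(url) for each link and treat any 4xx or 5xx result as a failed fetch, without leaving the error response open.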
Date User Action Args
2010-04-16 16:38:13 amak set status: open -> closed; assignee: amak; resolution: invalid; messages: + msg5718; nosy: + amak
2010-01-19 13:49:52 geraldth create