Message6591

Author amak
Recipients amak, pjac
Date 2011-07-30.00:32:50
Message-id <1311985971.17.0.441843902441.issue1774@psf.upfronthosting.co.za>
In-reply-to
Content
This is fundamentally an interpretation issue. 

Does one interpret an empty document as a failure to provide parsable tokens from the input stream (the Java interpretation, i.e. the tokenizer raises the error), or does one interpret an empty document as a stream of tokens that happens to be empty (the Python interpretation, i.e. the parser raises the error)?

Is there an XML declaration present in the file? That is, does the stream contain something like the following: <?xml version="x.y" encoding="blah_encoding"?>

Or is the input stream completely empty, i.e. does it contain no characters other than whitespace?
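
The two questions above can be answered mechanically before anything is handed to a parser. The following is only a rough sketch under my own assumptions: classify_input is a hypothetical helper name, not part of any library, and it assumes the input has already been decoded to a string.

```python
import re

# Matches an XML declaration at the very start of the text, e.g.
# <?xml version="1.0" encoding="UTF-8"?>. Requiring whitespace after
# "<?xml" keeps processing instructions such as <?xml-stylesheet ...?>
# from being mistaken for a declaration.
_XML_DECL = re.compile(r"<\?xml\s")


def classify_input(text):
    """Return a rough label for the input (hypothetical helper).

    'empty'       - no characters at all
    'whitespace'  - only whitespace characters
    'declaration' - starts with an XML declaration
    'other'       - anything else
    """
    if text == "":
        return "empty"
    if text.strip() == "":
        return "whitespace"
    if _XML_DECL.match(text):
        return "declaration"
    return "other"
```

Calling classify_input on the contents of the problem file should tell us which of the cases discussed here actually applies.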

If the latter, i.e. the document is pure whitespace, then I recommend a pragmatic solution:

document = document.strip()
if document:
    xml_parse(document)
else:
    raise MyException("A whitespace-only document is meaningless, no matter what its file extension is")
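
For anyone who wants to try this directly, here is a self-contained variant of the snippet above. It is a sketch, not a definitive fix: EmptyDocumentError and parse_nonblank are names I made up for illustration, and I have assumed xml.dom.minidom as the parser and a document that is already a string. Substitute whichever parser you are actually calling.

```python
from xml.dom import minidom


class EmptyDocumentError(ValueError):
    """Raised when the input holds nothing but whitespace (hypothetical name)."""


def parse_nonblank(document):
    """Parse `document` (a str), rejecting blank input before the parser sees it."""
    stripped = document.strip()
    if not stripped:
        # Fail early with an application-level error instead of relying on
        # whichever layer (tokenizer or parser) would otherwise complain.
        raise EmptyDocumentError("document is empty or contains only whitespace")
    return minidom.parseString(stripped)
```

With this guard in place, the application sees the same exception for blank input regardless of which interpretation the underlying parser takes.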

In the meantime, I will investigate whether an empty file or a file full of whitespace can meaningfully be described as an XML file.
History
Date User Action Args
2011-07-30 00:32:51  amak  set  messageid: <1311985971.17.0.441843902441.issue1774@psf.upfronthosting.co.za>
2011-07-30 00:32:51  amak  set  recipients: + amak, pjac
2011-07-30 00:32:51  amak  link  issue1774 messages
2011-07-30 00:32:50  amak  create