Non Sucking YAML parser
Robin Berjon
robin.berjon at expway.fr
Thu Sep 14 14:50:52 BST 2006
On Sep 14, 2006, at 15:18, David Cantrell wrote:
> On Thu, Sep 14, 2006 at 12:21:49PM +0200, Robin Berjon wrote:
>> [XML namespaces]
>
> Perhaps I'm being dense, but what is the point of namespaces in XML?
Similar to namespaces in, say, Perl. Identification and separation.
You can get away with not using namespaces for local cases concerning
closed systems in the same way that you can get away with putting
everything in main for a script the code of which is not going to be
reused.
If however you're mixing and matching, or talking across boundaries,
you really want to keep your kittens easily split. And the sorting
kitten is happy to know the difference between Buffy and a pony
called Buffy.
> If your code comes across a document it can't deal with cos it's - say -
> describing a hospital visit instead of a book - then it's going to barf.
> Code dealing with XML must know a lot about the documents anyway, so all
> that extra verbiage is pointless. Say 'title' instead of
> <hospitalvisit:patientdetails:title>Mr</crap>* and <book:title>Winnie
> The Pooh</waffle>.
Yes, that verbiage is indeed pointless. Daft people do pointless
things, what can I say? Should the namespaces spec have a big red
blinking "DON'T BE FUCKING STUPID" sign at the top? Would it help?
Presumably a title element doesn't happen all on its own. So in the
above two cases you'd see:
<hospital-visit xmlns='http://hospital...'>
  <patient-details>
    <title>Dahut</title>
    ....
  </patient-details>
  ....
</hospital-visit>
and
<book xmlns='http://book...'>
  <title>Wild Left Dahut Pie</title>
  ....
</book>
Do you want to keep track of the books that your patients have read
during their stay? There are two ways of doing that while still
preserving enough information to know which title is which.
Option one, give context (and let's assume there aren't any
namespaces, just for fun):
<hospital-visit>
  <patient-details>
    <title>Dahut</title>
    ....
    <book>
      <title>Wild Left Dahut Pie</title>
    </book>
    <book>
      <title>Ponies From Hell</title>
    </book>
  </patient-details>
  ....
</hospital-visit>
With that, if you want all book titles read by all patients, you can
search for //book/title. Likewise, if you want to know how many Lords
have been your patients, you can go //patient-details[title = "Lord"]
and you won't pick up books called "Lord".
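For the curious, here is a minimal sketch of those two queries, assuming Python's standard xml.etree.ElementTree (which only handles a subset of XPath, hence the leading "." on each path). The sample data is made up for illustration: the patient's title is changed to "Lord" and one book is called "Lord" so the difference shows, and the "...." elisions are simply left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data modelled on the option-one document;
# one patient titled "Lord" and one book called "Lord".
DOC = """
<hospital-visit>
  <patient-details>
    <title>Lord</title>
    <book>
      <title>Wild Left Dahut Pie</title>
    </book>
    <book>
      <title>Lord</title>
    </book>
  </patient-details>
</hospital-visit>
"""

root = ET.fromstring(DOC)

# Equivalent of //book/title: only titles whose parent is a book.
book_titles = [t.text for t in root.findall(".//book/title")]

# Equivalent of //patient-details[title = "Lord"]: the predicate only
# inspects direct children of patient-details, so the book called
# "Lord" does not count as a patient.
lords = root.findall(".//patient-details[title='Lord']")

print(book_titles)
print(len(lords))
```

Context alone does the disambiguation here: the same element name means different things depending on its parent.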
Option two, use namespaces:
<hospital-visit xmlns='http://hospital...' xmlns:b='http://book...'>
  <patient-details>
    <title>Dahut</title>
    ....
    <b:title>Wild Left Dahut Pie</b:title>
    <b:title>Ponies From Hell</b:title>
  </patient-details>
  ....
</hospital-visit>
So, which is most verbose? Also, note that now getting all the book
titles is just //b:title.
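A rough sketch of the namespaced lookup, again assuming Python's xml.etree.ElementTree. The two namespace URIs are placeholders invented for this example (the ones in the document above are elided), and the prefixes in the lookup table are local to the query: only the URIs have to match.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, made up for this example.
HOSPITAL_NS = "http://example.org/hospital"
BOOK_NS = "http://example.org/book"

DOC = f"""
<hospital-visit xmlns='{HOSPITAL_NS}' xmlns:b='{BOOK_NS}'>
  <patient-details>
    <title>Dahut</title>
    <b:title>Wild Left Dahut Pie</b:title>
    <b:title>Ponies From Hell</b:title>
  </patient-details>
</hospital-visit>
"""

root = ET.fromstring(DOC)

# Prefix-to-URI table used only by the queries below; "h" does not
# appear in the document, which uses a default namespace instead.
ns = {"h": HOSPITAL_NS, "b": BOOK_NS}

# Equivalent of //b:title: every book title, wherever it sits.
book_titles = [t.text for t in root.findall(".//b:title", ns)]

# The patient's title lives in the hospital namespace and is not
# confused with the book titles.
patient_titles = [t.text for t in root.findall(".//h:title", ns)]

print(book_titles)
print(patient_titles)
```

Note that the document itself needs only one prefix; the default namespace covers everything else.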
There isn't a single day that I don't edit XML, of many different
kinds. I do it for work, I do it for play. I always use namespaces,
and only rarely have to resort to using prefixes. The fact that
people prefix everything just shows that they're clueless, 'cause I
sure ain't specially smart.
> If instead your code is meant to handle generic documents and not really
> understand them - perhaps it is code to traverse an arbitrary document
> element tree, or to store a document in a database - then it doesn't
> need to know about the namespaces anyway, nor can it be expected to
> usefully compare documents, so again they're pointless.
The ability to apply processing to just parts of a document, most of
which you might not understand but parts of which you do, is extremely
useful. And it's impossible to do reliably without namespaces, the
alternative being to use absurdly verbose element names in the hope
that they won't clash. Dispatching to different processors based on
the root element's namespace is also very useful.
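Dispatching on the root element's namespace might look something like the sketch below, assuming Python's xml.etree.ElementTree, which reports tags as "{uri}localname". The URIs, the handler functions and the dispatch helper are all hypothetical names made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, made up for this example.
HOSPITAL_NS = "http://example.org/hospital"
BOOK_NS = "http://example.org/book"


def namespace_of(elem):
    """Return the namespace URI of an element, or '' if it has none."""
    tag = elem.tag
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return ""


def handle_hospital(root):
    # Placeholder processor for hospital documents.
    return "hospital"


def handle_book(root):
    # Placeholder processor for book documents.
    return "book"


# Namespace URI -> processor; local element names play no part.
HANDLERS = {HOSPITAL_NS: handle_hospital, BOOK_NS: handle_book}


def dispatch(xml_text):
    """Pick a processor from the root element's namespace."""
    root = ET.fromstring(xml_text)
    handler = HANDLERS.get(namespace_of(root))
    if handler is None:
        return "unknown"
    return handler(root)


print(dispatch(f"<book xmlns='{BOOK_NS}'><title>Ponies From Hell</title></book>"))
print(dispatch("<book><title>No namespace here</title></book>"))
```

The second call falls through to "unknown" because an un-namespaced book is not the same thing as one in the book namespace, which is exactly the identification point.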
> Frankly, I want to take everyone who has been involved in speccing any
> version of XML since the first one and SPANK THEM HARD. All the extra
> unnecessary crap makes XML harder to process both by machines and by
> humans.
Huh? You say versions of XML but you don't sound like you're talking
about XML 1.1, or the various editions of 1.0 and 1.1 that fixed
bugs. There have been horrible, horrible specs that use or apply to
XML, like XML Schema and SOAP, but they have nothing to do with XML.
Me, I sympathise with the poor sods who have to handle them, but I
just ignore them, as many happily do. There are many fine and simple
specs out there: Namespaces, XPath, XSLT, RelaxNG, SVG...
Why bother with the extra unnecessary crap? I don't. If you have to,
blame yourself, or your job.
--
Robin Berjon
Senior Research Scientist
Expway, http://expway.com/