Non Sucking YAML parser

Robin Berjon robin.berjon at
Thu Sep 14 11:21:49 BST 2006

On Sep 14, 2006, at 12:07, Dirk Koopman wrote:
> On Thu, 2006-09-14 at 10:21 +0100, Jonathan Stowe wrote:
> There are two issues here:
> 1. Public interfaces. Although I hate XML with a passion (especially
> when someone is being verbose and insists that you are as well), I can
> see its utility - at least for "occasional" low bandwidth usage (say
> below 5 requests/sec).

That's the whole point: XML is made for interchange. Balkanising into
a collection of other formats doesn't help, as each of them will
exhibit its own set of problems.

The efficient XML effort has the same aim: coming up with a single
alternative format (instead of the dozens we have today). That format
will hopefully have a number of properties different from those of
XML, while offering full round-trippability from one to the other.
That would allow one to optimise verbosity (by a huge factor in many
cases) while only swapping the parsers/serialisers around (and keeping
all the code on top of them unchanged). Keep your fingers crossed :)
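
To make the "swap the parsers/serialisers, keep the code on top" idea
concrete, here is a minimal sketch in Python. Everything in it is an
assumption made for illustration: the class names, the `orders`
document, and above all the "compact" codec, which is just
deflate-compressed XML standing in for a real alternative format.

```python
import zlib
import xml.etree.ElementTree as ET


class TextXmlCodec:
    """Plain textual XML."""

    def dumps(self, root: ET.Element) -> bytes:
        return ET.tostring(root, encoding="utf-8")

    def loads(self, data: bytes) -> ET.Element:
        return ET.fromstring(data)


class CompactCodec:
    """Stand-in compact encoding: deflate-compressed XML (an assumption,
    not the format the efficient XML effort is working on)."""

    def dumps(self, root: ET.Element) -> bytes:
        return zlib.compress(ET.tostring(root, encoding="utf-8"))

    def loads(self, data: bytes) -> ET.Element:
        return ET.fromstring(zlib.decompress(data))


def total_quantity(codec, payload: bytes) -> int:
    """Application code: sees only the tree, never the wire format."""
    root = codec.loads(payload)
    return sum(int(order.get("quantity")) for order in root.iter("order"))


# Hypothetical document: 50 orders with made-up quantities.
orders = ET.Element("orders")
for i in range(50):
    ET.SubElement(orders, "order", quantity=str(i))

text_payload = TextXmlCodec().dumps(orders)
compact_payload = CompactCodec().dumps(orders)
```

`total_quantity` gives the same answer whichever codec produced the
payload, and the compact payload is smaller on the wire. Only the
codec object changes between the two calls.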

> 2. Protocols. Here XML is a pain. It doesn't stream without kicking or
> otherwise fooling the parser into thinking that each "paragraph" is,
> in fact, a "document". This is compounded by the fact that most people
> who use XML for protocols insist on sending all the descriptive stuff
> (DOCTYPE, xmlns etc etc) on each paragraph. They also tend to use huge
> tagnames and don't factorise. Hence my jibes about signal/noise
> ratios.

Sending a DOCTYPE on every fragment is daft. In fact, using a DOCTYPE  
ever, except for XHTML which alas requires it, is silly. If however  
you want to make a parser believe that a subtree is a document, you  
really do need to send the namespaces. They're not descriptive,  
they're part of the names of the elements.
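
A small sketch of what "part of the names" means in practice, using
Python's standard `xml.etree.ElementTree`. The namespace URI and the
element names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and element names, for illustration only.
NS = "urn:example:orders"

with_prefix_a = ET.fromstring(
    f'<a:order xmlns:a="{NS}"><a:qty>5</a:qty></a:order>'
)
with_prefix_b = ET.fromstring(
    f'<b:order xmlns:b="{NS}"><b:qty>5</b:qty></b:order>'
)
no_namespace = ET.fromstring("<order><qty>5</qty></order>")

# The prefix is irrelevant; the namespace URI is part of the name.
same_name = with_prefix_a.tag == with_prefix_b.tag
different_name = with_prefix_a.tag != no_namespace.tag


def parses_alone(fragment: str) -> bool:
    """Return True if the fragment parses as a standalone document."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# A subtree cut out of its parent without the declaration it relied on
# has an unbound prefix, so a namespace-aware parser rejects it.
orphan_ok = parses_alone("<a:qty>5</a:qty>")
declared_ok = parses_alone(f'<a:qty xmlns:a="{NS}">5</a:qty>')
```

Both prefixed documents yield the tag `{urn:example:orders}order`,
while the undeclared one yields plain `order`, which is a different
element as far as any namespace-aware code is concerned.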

If you want an example of people trying not to be verbose with XML,
you might want to check out FIXML. I don't think any of the local
names is an actual English word; most are heavily abbreviated. They
also get as much mileage as they can from attributes. I'm not saying
it's necessarily the ideal approach, but if you're interested in that
topic it's worth checking out.
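
To give a rough feel for how much that style saves, here is a sketch
comparing a verbose element-based message with an abbreviated
attribute-based one. Both vocabularies are invented for the example;
neither is taken from the actual schema.

```python
import xml.etree.ElementTree as ET

# Both vocabularies are hypothetical, made up for this comparison.
VERBOSE = (
    "<PurchaseOrder>"
    "<Symbol>ABC</Symbol>"
    "<Quantity>100</Quantity>"
    "<Price>12.50</Price>"
    "</PurchaseOrder>"
)
TERSE = '<PO Sym="ABC" Qty="100" Px="12.50"/>'


def read_verbose(xml_text: str) -> dict:
    """Read the fields from child elements."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


def read_terse(xml_text: str) -> dict:
    """Read the fields from attributes, mapping short names to long ones."""
    long_names = {"Sym": "Symbol", "Qty": "Quantity", "Px": "Price"}
    root = ET.fromstring(xml_text)
    return {long_names[name]: value for name, value in root.attrib.items()}


same_data = read_verbose(VERBOSE) == read_terse(TERSE)
saved_bytes = len(VERBOSE) - len(TERSE)
```

Both readers recover the same three fields, and the terse form is well
under half the size. The price is that the short names mean nothing
without the mapping table.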

Robin Berjon
    Senior Research Scientist
