Non Sucking YAML parser

Mark Overmeer mark at overmeer.net
Wed Sep 13 17:13:37 BST 2006


* Robin Berjon (robin.berjon at expway.fr) [060913 15:40]:
> On Sep 13, 2006, at 17:20, Mark Overmeer wrote:
> >If you pick XML, you probably want to write a specification for
> >your data structures.  Then, you may consider using XML schemas.
> 
> I would recommend the exact opposite: if you have any chance of  
> avoiding having to use XML Schema anywhere, then run away like hell.  
> It is by far the worst specification ever produced by the W3C. It is  
> a world of pain.

The specs are horrible, and even worse for non-native speakers.  I have
no problems with RFCs, but these W3C papers are unreadable.

> The "wide acceptance" in the business world is largely limited to  
> either data binding for very limited enterprise scenarios (where the  
> schemata are machine generated based on Java or .NET stubs) or to  
> wanking about schemata that are never used in practice because  
> they're so painful. A Belgian study released last year showed that  
> *70%* of those schemata are not even valid as per the spec — you can  
> imagine how much that means they're actually getting used.

Yes, for instance in my case: my Perl programs talk to a BEA application
which generates WSDL :-(  It's a pity that those applications exist.  Even
the schema for schemas itself cannot be validated... quite bad.  The
horrors of a professional Perl programmer: no choice.

> >take a look at my new module XML::Compile.
> 
> That's not a bad idea at all, but you're up for difficult times I'm  
> afraid. How do you plan on handling the majority of schemata that are  
> broken?

Try to be strict, but be kind where possible (for instance with
configuration options).  The module is already close to completion,
supporting the features of the upcoming schema standard (v1.1).  Let's
hope that people give sufficient feedback to get flexible work-arounds
into the module to fix broken things.

> Also, it is extremely common (if not simply the rule) that  
> instances won't always validate, sometimes simply because over a  
> large production array you will have six or seven different versions  
> of the vocabulary in use, and therefore at least as many variants on  
> the schema. How do you handle deviance?

If you handle namespaces correctly, it works.  Or at least... it
has a better chance of working when both parties respect
namespaces.

The advantage of my module is that the application writer needs to
understand neither schemas nor namespaces.  The user only needs to
know the structure of the data as a tree of nested hashes. [On my
wishlist is some procedure to generate example trees, to make it
even simpler]
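
To give a rough idea of what "a tree of nested hashes" means here, a
minimal sketch follows.  It is not XML::Compile and not even Perl: it
is plain Python, where nested dicts stand in for nested hashes.  The
element names, the namespace URI and the helper functions local_name
and to_tree are all invented for the illustration, and no schema is
consulted at all.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def to_tree(element):
    """Convert an element into nested dicts; leaf elements become strings."""
    children = list(element)
    if not children:
        return (element.text or "").strip()
    tree = {}
    for child in children:
        key = local_name(child.tag)
        value = to_tree(child)
        if key in tree:
            # Repeated element: collect the values in a list.
            if not isinstance(tree[key], list):
                tree[key] = [tree[key]]
            tree[key].append(value)
        else:
            tree[key] = value
    return tree


# Hypothetical document: the namespace and element names are made up.
SAMPLE = """
<order xmlns="http://example.com/ns/order">
  <customer><name>Alice</name><city>London</city></customer>
  <item>book</item>
  <item>pen</item>
</order>
"""

order = to_tree(ET.fromstring(SAMPLE))
print(order)
```

The caller sees only keys such as "customer" and "item"; the namespace
never shows up in the tree.  A schema-driven tool can do better than
this sketch, because the schema tells it which elements may repeat and
what type each leaf has.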
-- 
               MarkOv

