Converting an AST back into code

Mon Feb 23 18:19:09 GMT 2009

Robin Berjon wrote:
> I can certainly see how I can brute-force my way out of this, and I'm 
> pretty sure it'll involve some of that. But what I was wondering about 
> is whether there are tricks and good ideas to doing this right/more 
> easily, and evil traps to be wary of.

The two main approaches that I'm familiar with are tree walking (visitor
pattern) and self-evaluation (interpreter pattern).

Tree walking is usually done via double dispatch.  You start off with
a call on the root node, passing a visitor object as an argument.

    $root->accept($visitor);

Each node implements an accept($visitor) methods which does something
like this:

    $visitor->visit_my_kind_of_node($self, @_);

Nodes with children also do something like this:

    $child->accept($visitor)
       for $child in @{ $self->{ children } };

The main reason for doing this is to avoid the classic type switch
code smell:

    if ($node->type eq 'one_kind') {			# not so good
       # ...
    }
    elsif ($node->type eq 'another_kind') {
       # ...
    }

It also gives you a handy object (the visitor) in which you can maintain
any context-dependent state data.

You have to decide how to collect the output generated by the visitor.
You can have the visitor's visit methods return the results (which
can get a bit tricky when it comes to collation), or you can have
the visitor collect the information itself.

    # either
    my $result = $root->accept($visitor);

    # or
    $root->accept($visitor);
    my $result = $visitor->result;

The visitor pattern usally works best when you've got things like document
object models that you'll want to walk, manipulate and jiggle about with in
many different ways.  It allows you to collect all the methods that relate
to a particular behaviour in one place (i.e. the visitor) instead of dotting
them over umpty different node classes).  This is good when it comes to adding
new views because you don't need to make any changes to the nodes.

The other approach is to give each node type a method (or methods) which
implements the behaviour that you want.  This is typically used for program
optrees where the common behaviour is "evaluate thyself".  You may need a
separate context object to maintain the program state.

    my $result = $root->evaluate($context);

For example, a node representing an addition operator would do something
like this:

    sub evaluate {
        my ($self, $context) = @_;
        my $lhs = $self->lhs->evaluate($context);
        my $rhs = $self->rhs->evaluate($context);
        return $lhs + $rhs;
    }

And a node for fetching a variable value might do this:

    sub evaluate {
        my ($self, $context) = @_;
        return $context->variables->{ $self->name };
    }

This approach can be slightly faster than the visitor pattern as you're
avoiding the extra method call from double dispatch in principle, although
in practice, it rarely makes much difference in the long run, particularly
if your nodes are delegating to a separate context object (so you might as
well make the context and visitor one and the same).  The downside is a lack
of flexibility as all methods must be "built in" to the  nodes themselves.

However, don't forget that Perl's packages are open (i.e. extensible by other
modules) so it's not hard to install new methods into  existing classes if you
want to.

     package My::Tree::View::HTML;

     sub My::Node::Type1::to_html {
         ...
     }

     sub My::Node::Type2::to_html {
         ...
     }

     sub My::Node::Type3::to_html {
         ...
     }

     ...etc...

     use My::Tree::View::HTML;
     my $html = $root->to_html;

This approach might be considered a bit more of a hack because it doesn't have
a design pattern named after it  But it does give you the flexibility of the
visitor approach with the simpler and faster direct evaluation approach.

HTH
A