DBIx::Class - Related Tables

Tue Oct 7 11:37:19 BST 2008

Dave Hodgkinson wrote:
> Then what's the benefit of an ORM? (general question, not just to you :)

The answer is abstraction.  What was the question?  :-)

Actually, there's no standard set of benefits because there no such thing
as a standard ORM.  It has come to be used as a catch-all term for any code
that talks to your database so you don't have to.

Typical features of ORM-alike modules are:

    * Schema definition - so that the ORM code can make assumptions about
      what fields to put/get to and from the database tables, what keys are
      used to identify records and so on.  Some ORMs can create tables from
      this schema, others can intuit the schema from existing tables.

      The key thing here is abstraction - your schema is written in a generic
      language that you can inspect programmatically and translate to specific
      formats (e.g. to accommodate differences between MySQL, Postgres, etc)

    * SQL generation - with the above information, an ORM can generate SQL
      statements to perform simple insert/update/delete/select queries.

        $zoo->insert(
            animal => 'Badger',
            plays  => 'Tennis',
	   eats   => 'Nuts and Berries',
        );

      The key thing here is abstraction.  You're saying that you want to
      store the information in the zoo, without having to worry about the
      specific details of how that is done.  Apart from the obvious benefit
      of not having to litter your code with SQL, it means you can easily
      change the details of the implementation at a later date if you need
      to.

    * Row data <--> object mapping.  The "true" purpose of an ORM is to
      handle the mapping between relational data and programming objects.
      The generally accepted wisdom here is that there will always be an
      impedance mismatch between the OO and Relational paradigms so don't
      expect a faithful translation.

        my $animal = $zoo->fetch( animal => 'Badger' );
        print 'Badger plays ', $animal->plays;

      The key thing here is, you guessed it, abstraction.  You let the ORM
      handle the messy business of doing the mapping so that you don't have
      to litter your code with it.  You get to work with familiar objects in
      "programming space" and effectively forget about how the information
      that they're representing and manipulating is store persistently.

    * Relation mapping.  An ORM can represent the relations between tables
      and hide all the complexity of gnarly JOIN statements behind simple
      object methods.

          foreach $track ($album->tracks) {
               ...
          }

      Yes, our friend abstraction is here again.  You don't need to know
      if the relation is implemented as a back-link from track to album
      (e.g. album.id <- track.album_id) or via an album_tracks link table
      (album.id <- album_track.album_id + album_track.track_id -> track.id),
      and (in theory) you don't need to worry about any changing the structure
      of the database at some point in the future, because you'll only need
      to update the relation specification used to generate the tracks()
      method.

Hiding implementation details is (usually) a Good Thing.  The problem is
that in practice, the more you hide the database, the harder it is to use
its full power.

> I'm a big fan of an application-specific wrapper over the d/b that
> does what you need, so in Dave's case a get_highest_version() sub
> with the SQL inside. Is that now passé?

Me too.  When I'm writing my application code I want to talk in business
logic.  I specifically *don't* want to know anything about how my data is
stored persistently, be it in a database, YAML files, or encoded in the
arrangement of chocolate sprinkles on a slice of cake.

The approach that I have found works best (for me at least) is to have
one module for each table in the database (subclassed from a generic
DB::Table module), and another for each record (subclassed from a generic
DB::Record module).  using one class for both table and record is broken,
IMHO.

The table module defines a schema, implements basic insert, update, select
and delete methods (using auto-generated SQL).  The record module
(or ActiveRecord if design pattern parlance is your thing) is used to
represent each instance and has table-specific methods auto-generated.
The nice thing is that you can easily add your own custom methods to either
table or record class, so it allows you to align them closer to your
business logic.

The final part of the equation is an ActiveRelation class (hierarchy actually
- to represent various different relation types). That's where things start
to get, um, interesting.

In practice, this approach takes a little longer because you're building a
bespoke data abstraction layer each time instead of using a more generic
solution (although the dividing line is thin and often disappears when you
look at it directly). However, this is paid back in triplicate when it come to
writing the application level code because everything is generally, much
simpler.  It's worth the extra time and effort up front for major projects,
but can be overkill for smaller projects, or quick scripts to jumble some
data around.

On the other hand, using a more generic ORM (like DBIC and friends), means
that you can get up and running quicker and usually get by with writing less
custom code.  You just have to be a bit more tolerant about bits of database
interface jagging into your application code.

As usual, TMTOWTDI.

A

P.S. Badger::Database coming to a forest near you soon!