Sharding, and all that

Paul Makepeace paulm at paulm.com
Fri Dec 19 14:12:19 GMT 2008


On Fri, Dec 19, 2008 at 1:43 PM, Richard Huxton <dev at archonet.com> wrote:
>
> Andy Wardley wrote:
> > Richard Huxton wrote:
> >> Yep - that's what "sharding" is all about - separate disconnected silos
> >> of data.
> >
> > I thought sharding specifically related to horizontal partitioning.  i.e.
> > splitting one table across several databases,  e.g. records with even row
> > ids in one DB, odd in another.  Apologies if my terminology is wrong.
> >
> > I was thinking more specifically about vertical partitioning along the
> > functional boundaries which wouldn't be sharding by my (possibly incorrect)
> > definition.  Apologies for being off-topic, too  :-)
>
> If "sharding" means anything at all, then it has to be something other
> than partitioning or partial replication, otherwise we could say
> "partitioning" or "partial replication". Of course it's entirely
> possible it *doesn't* mean anything at all, and is just partitioning2.0

Sharding is horizontal partitioning: splitting the rows of one data set
across machines by a key, typically the primary key. For example, a few
hundred million email boxes spread over thousands of machines, keyed by
email address.
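
As a rough sketch of the idea in Python, assuming a fixed number of
shards and a stable hash of the address (NUM_SHARDS and shard_for are
made-up names for illustration, not taken from any real system, and
treating addresses as case-insensitive is also an assumption):

```python
import hashlib

NUM_SHARDS = 4096  # hypothetical shard count


def shard_for(email: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an email address to a shard number using a stable hash."""
    # Normalise so the same mailbox always lands on the same shard.
    key = email.strip().lower().encode("utf-8")
    # Use a cryptographic digest rather than Python's hash(), which is
    # randomised per process and so not stable across machines.
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Plain modulo is the simplest mapping; changing the shard count moves
most keys, which is why real deployments often use a lookup table or
consistent hashing instead.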

Keeping your user account data in a separate database from, say, those
users' preferences is a form of vertical partitioning. In a big system,
both databases are very likely to be sharded as well, by some kind of
internal key.
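
To make the combination concrete, here is a small Python sketch,
assuming a numeric internal user id and a separate shard set per
functional area (SHARD_COUNTS, locate and the "area-N" naming are all
hypothetical):

```python
# Hypothetical layout: each functional area (a vertical partition)
# has its own set of shards (horizontal partitions).
SHARD_COUNTS = {"accounts": 8, "preferences": 4}


def locate(area: str, user_id: int) -> str:
    """Return the name of the database holding this user's rows for an area."""
    num_shards = SHARD_COUNTS[area]  # vertical split: pick the area
    shard = user_id % num_shards     # horizontal split: pick the shard
    return f"{area}-{shard}"
```

With these assumed counts, locate("accounts", 13) gives "accounts-5"
and locate("preferences", 13) gives "preferences-1": the same user
lives in two different databases, one per functional area.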

P


More information about the london.pm mailing list