Multi platform, high volume data recording

Lyle - CosmicPerl.com perl at cosmicperl.com
Sun Nov 11 15:29:09 GMT 2007


Simon Wilcox wrote:
> Hi Lyle,
>
> Lyle - CosmicPerl.com wrote:
>> I'd appreciate some heavy criticism (without swearing or personal 
>> insults) and suggestions. Although some things will not be changed, 
>> for instance...
>
> Flash is an interesting choice of presentation medium. I particularly 
> loathe that I can't link directly to any of your slides. A great 
> example of why Flash sucks from a usability point of view.

It was rather a pain. What would you use instead?

> Anyway, regarding the content I'd make four comments:
>
> 1. Multiple A records. We've been doing some testing on a setup just 
> like this over the last couple of weeks and so far we've found that 
> "automatic" is something of a misnomer. The current browser crop do 
> NOT appear to automatically retry the 2nd IP address if the first goes 
> down. Some data is always lost but when the user clicks refresh the 
> browser does pick up the second address. So it appears to be good for 
> automatic failover but it's not seamless and not error free. It's 
> certainly better than not having a second datacentre though.

Not perfect, but from all that I've read it's the only way to do it 
without a SPOF.
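
For a client you control (a logging agent rather than a browser), the retry 
that browsers don't do for you can be done by hand. A minimal Python sketch, 
assuming nothing about the actual product: resolve_all and 
connect_with_failover are made-up names, and the connect argument exists only 
so the logic can be exercised without a network.

```python
import socket

def resolve_all(host, port):
    """Return every IPv4 address published for host (one per A record)."""
    infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr).
    # Keep the resolver's order and drop duplicates.
    seen, addrs = set(), []
    for *_, sockaddr in infos:
        if sockaddr not in seen:
            seen.add(sockaddr)
            addrs.append(sockaddr)
    return addrs

def connect_with_failover(addrs, connect=socket.create_connection, timeout=3.0):
    """Try each address in turn; return the first connection that succeeds."""
    last_error = None
    for addr in addrs:
        try:
            return connect(addr, timeout)
        except OSError as exc:
            last_error = exc  # this address is down, move on to the next one
    raise ConnectionError(f"all {len(addrs)} addresses failed") from last_error
```

This only helps where you own the client. A stock browser still behaves as 
Simon describes.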

> 2. Fault tolerance. By having just one logging server receiving and 
> processing data at one time you're introducing a single point of 
> failure. If it goes down all the data collected goes with it.

The later diagram shows 2 recording at a time, although this was 
intended for splitting the data across multiple recording servers. The 
idea is that they'd be running RAID 1 or 10, so in the event of failure 
the data would just be delayed while the server was brought back up. 
Given that it would be just X minutes of data, I didn't see it as a huge 
issue.
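
To make "delayed rather than lost" concrete, here is a rough Python sketch of 
a collector that holds records while the recording server is down and replays 
them in order once it's back. SpoolingSender and the send callable are 
hypothetical, and the queue is in memory only to keep the example short. A 
real collector would spool to disk so a restart doesn't drop the backlog.

```python
from collections import deque

class SpoolingSender:
    """Hold records locally while the recording server is down, replay later."""

    def __init__(self, send):
        self.send = send          # hypothetical callable that delivers one record
        self.pending = deque()    # records not yet accepted, oldest first

    def record(self, item):
        """Queue a new record, then try to deliver everything outstanding."""
        self.pending.append(item)
        self.flush()

    def flush(self):
        """Deliver pending records in order; stop at the first failure."""
        while self.pending:
            try:
                self.send(self.pending[0])
            except OSError:
                return False      # server still down, keep everything queued
            self.pending.popleft()
        return True
```

A record is only removed from the queue after send returns, so a failure 
mid-flush leaves it in place for the next attempt.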

> 3. DB scalability. MySQL only supports 1 master per database and given 
> that this is primarily a data writing application you will need to 
> consider partitioning your data into multiple databases so that you 
> can write to multiple servers. You will also need to consider how you 
> migrate the write master to another server if the primary server goes 
> down. Load balancers are quite good for this too.

From reading how MySpace do it, that was/is my plan. I didn't want to go 
into too much detail as the slides were getting long.
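
The simplest form of the partitioning Simon describes is to hash some key and 
use it to pick which write master a row goes to. A minimal Python sketch, 
where shard_for, the site id key and the server names are all assumptions for 
illustration, not the real design:

```python
import zlib

def shard_for(site_id, shards):
    """Pick the write master for a site by hashing its id."""
    # crc32 is stable across runs and machines, unlike Python's built-in hash().
    index = zlib.crc32(site_id.encode("utf-8")) % len(shards)
    return shards[index]

# Hypothetical write masters, one database per shard.
masters = ["db-write-1", "db-write-2", "db-write-3"]
target = shard_for("example.com", masters)
```

The catch with plain modulo hashing is that adding a server changes where most 
keys land, so existing data has to be moved or looked up through a mapping 
table instead.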

> 4. Commercial reality. Your diagram shows 12 servers in 2 datacentres. 
> That's what, maybe a £30-40k spend on hardware, more if you virtualise 
> with shared storage and £2k a month in rack rental and transit. Plus 
> the admin cost of running & supporting that which is another £6-7k a 
> month. I don't know what you're planning to charge for this software 
> but if I was on the buying side, and I have been, there's no way we 
> would spend that kind of money on a one-man band almost no matter how 
> good the software was. People spending that kind of money tend to be 
> cautious.

The client already has hardware, and it's likely they'll want to use 
that. If they wanted me to host it then there would be an additional 
charge to reflect it. The diagram shows how the software must be able to 
scale. It'll also be able to run on a single server; other servers can 
be brought in as needed. So it won't be a case of "bang, you've gotta 
spend £££££ to run this". I've recently been negotiating with a sysadmin 
who's going to come on board to manage the server side of things. LoL, 
maybe I should say FO to those people that thought I wasn't taking 
things on board?

> Or is this your plan for the hosted solution, in which case you *will* 
> need to hire a sysadmin to help run that lot. You alone will not be 
> able to write the software, build those servers, run the system and, 
> most importantly, sell the solution no matter how good you are. Been 
> there, done that. You *will* fail if you try it.

My role is going to be Programmer/SysAdmin/Business man in that order. 
The guy I'm bringing on board is going to be SysAdmin/Business 
man/Programmer in that order. There will also be a graphic/web designer. 
So the initial team will be 3. Obviously if the client wants us to set up 
and manage the servers the cost will be adjusted to reflect that. I'll 
be writing most of the software myself, possibly bringing in affordable 
freelancers to meet the deadline.

> BTW - out of curiosity, what's your USP compared to Google Analytics 
> and WebTrends ?

Just in case any of my competitors start reading this, I'm not going to 
go into too many of the specifics until it's launched.

> Good luck with the project, it's certainly "ambitious" as Sir Humphrey 
> would say :-)

Thanks, I'm certainly going to need it.


Lyle


More information about the london.pm mailing list