Brown trousers time :~
paulm at paulm.com
Mon Oct 8 14:48:15 BST 2007
On 10/8/07, Dirk Koopman <djk at tobit.co.uk> wrote:
> Lyle - CosmicPerl.com wrote:
> > Hi All,
> > Soon I'll be embarking on the largest project I've ever undertaken. It's
> > going to be 10s of thousands of lines of code. It needs to be perfect
> > (or damn close :))
> Best of luck. You'll need it. If you can't write / test / debug high
> kwalitee perl code quickly yourself then hire someone who can (or can
> teach you to do it). You will save 1000s and a lot of heartache.
> > Parts of the software need to be able to withstand large volumes of
> > traffic... I'm talking about 100s, 1000s or even 10,000s of clicks
> > per second!
> You need to understand, intimately, how to speed up webserving perl (or
> other scripting languages du jour). Personally: I would avoid using
> mod_perl (or even apache) like the plague (except, possibly, as front
> end cache machines). I much prefer any of the small threaded/select
> based webservers (eg lighttpd, litespeed, thttpd etc) with a FastCGI
> back end.
> The mod_perl site does have a number of extremely important articles
> about how to go about planning something like this. Don't do *anything*
> until you have read and *understood* what is there.
> You will find that one of the tradeoffs you will need to make is RAM v
> webserver processes/threads. The articles above explain that. Once you
> have understood that thoroughly, then go back and look at something like
> lighttpd/litespeed/thttpd + FastCGI and you will (at least) understand
> where I am coming from (even if you don't end up agreeing).
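To make that concrete, a lighttpd 1.4 fastcgi.server stanza along those lines looks roughly like this (the URL prefix, paths and process count are placeholders, not recommendations):

```
server.modules += ( "mod_fastcgi" )

fastcgi.server = (
  "/app" => ((
    "bin-path"  => "/var/www/app.fcgi",   # your persistent perl worker
    "socket"    => "/tmp/app.sock",
    "max-procs" => 4                      # the RAM v processes knob, in config form
  ))
)
```

max-procs is exactly the RAM-versus-workers tradeoff the mod_perl articles describe, just expressed in lighttpd's config.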
I would say avoid mod_perl and go directly to FastCGI + lightweight
front-end, or at least a stripped-down Apache. Unless you're doing
interesting stuff with other parts of the request cycle, mod_perl just
gets you headaches.
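A minimal FastCGI worker really is small. This is only a sketch: it assumes the FCGI module from CPAN, and handle_request, the RUN_FCGI_LOOP guard and the URL space are invented for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Plain Perl request handler -- all the application logic lives here.
# (Hypothetical example; a real handler would talk to MySQL etc.)
sub handle_request {
    my (%env) = @_;
    my $path = $env{PATH_INFO} || '/';
    return "Content-Type: text/plain\r\n\r\nHello from $path\n";
}

# The accept loop is the whole point: the interpreter, compiled code and
# anything cached in globals (DB handles, templates) stay alive between
# requests, unlike plain CGI. Guarded by an env var so the file can also
# be loaded standalone for testing.
if ($ENV{RUN_FCGI_LOOP}) {
    require FCGI;    # CPAN module -- an assumption, not core Perl
    my $request = FCGI::Request();
    while ($request->Accept() >= 0) {
        print handle_request(%ENV);
    }
}
```

The win over plain CGI is entirely in that loop: compilation happens once, not per hit.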
> > This all has me thoroughly bricking it :~
> Welcome to the club :-)
> > From what I've learned it'll have to be mod_perl handling the
> > heavy-traffic parts of the software. Basically CGI scripts that open a
> > database connection, read data, then write data and redirect the browser.
> One of the things that you really, really should try to achieve is to
> make as much static as possible. Even if that means using (and reusing)
> acres of disc space - just for html cache. Most so-called "dynamic"
> sites aren't at all. Take a shopping site: the only things that change
> on a product page are the price (and possibly things like stock
> levels), and these don't change that often. If you generate the page on
> demand, then cache it, and have a system that invalidates the page when
> something on it changes, it is rather web 1.0 but it works and is as
> quick as you can serve that html page.
> Remember that the overriding usage on even the most "interactive"
> website is GET (or its equivalents), not PUT.
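The generate-on-demand-then-cache idea fits in a screenful of perl. A rough sketch using only core modules; render_page, the product hash and the cache layout are all made up for illustration:

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

my $cache_dir = tempdir(CLEANUP => 1);   # in real life: acres of disc, not tmp

my %products = (42 => { name => 'Widget', price => '9.99' });

sub render_page {                        # the "expensive" dynamic bit
    my ($id) = @_;
    my $p = $products{$id};
    return "<h1>$p->{name}</h1><p>Price: $p->{price}</p>";
}

sub cache_path { File::Spec->catfile($cache_dir, "product-$_[0].html") }

sub get_page {                           # serve cached html if we have it
    my ($id) = @_;
    my $file = cache_path($id);
    if (-e $file) {
        open my $fh, '<', $file or die $!;
        local $/;
        return <$fh>;
    }
    my $html = render_page($id);         # cache miss: render once, save
    open my $fh, '>', $file or die $!;
    print {$fh} $html;
    close $fh;
    return $html;
}

sub invalidate { unlink cache_path($_[0]) }   # call when price/stock changes
```

On a price change you call invalidate() and the next GET pays the render cost once; every hit in between is a plain file read (or, better still, served straight from the cache directory by the webserver without touching perl at all).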
> > From all my searching I have a few questions yet unanswered, I'm hoping
> > you guys can help...
> > I'm concerned that I'll have to quickly write some C libraries for the
> heavy traffic parts, the book I've found referenced most is "Extending
> and Embedding Perl", is this the best book to get? Or do you guys
> > recommend others? Or do you recommend other books to get along with this
> > one?
> Don't do this. Here be many dragons. I doubt that there is anything
> perlish that will require this. However you may find yourself (as I did)
> writing plugins for your webserver du jour to manage the cache
> invalidation stuff. But you do that once you understand what it is that
> you are trying to do and have a working webserving cloud to do it with.
> > What's the mod_perl equivalent in Win32? I'm guessing PerlScript in ASP,
> > but is that faster? I can't find any benchmarks.
> Windows? Do you really want to do this in perl? (sorry to be
> controversial). You would be much better off taking the M$ 10 cents and
> doing it in whatever is the "M$ way" this week.
I'd agree. There's no shame in using C# on ASP.NET: it's a good
language, runs fast, has excellent development tools (FAR better than
perl's), has had orders of magnitude more massive deployments, and has
a ton of support, both free & commercial. I'd question perl as a
choice here, frankly.
> > Would it be best to have separate databases (all in MySQL) for different
> > parts of the program? So that the database tables that are heavily
> > accessed are totally separate from those that aren't.
> Design it first so it does not matter. Benchmark it. Then decide.
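Seconding the "benchmark it" point: Benchmark.pm ships with perl, so there's no excuse not to measure. The two subs below are contentless stand-ins for your two candidate schemas; in real life each would run your actual queries via DBI:

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Stand-ins for "everything in one database" vs "hot tables split out".
# Both just sum 1..1000 here, purely to give timethese something to time.
sub one_db   { my $x = 0; $x += $_ for 1 .. 1000; $x }
sub split_db { my $x = 0; $x += $_ for 1 .. 1000; $x }

# Run each candidate 10,000 times and print a comparative report.
timethese(10_000, {
    'single database' => \&one_db,
    'split databases' => \&split_db,
});
```

Measure with realistic data volumes and concurrency before deciding; the winner is rarely the one you guessed.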
> > Anybody got some spare underpants? (preferably not white ones)
> I recommend Marks & Spencer myself.
> > I want everything to be as realtime as possible. But this would mean
> > updating several tables for each of those hits, I get the nasty feeling
> > that will be too slow. So would it probably be better to have a cron job
> > updating some tables, every 10 minutes or so, and keep the heavily
> > updating to a single table?
> Realtime + many requests/sec = Big Bucks for Big Iron. Avoid it; you
> almost certainly don't need Real realtime.
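On the cron idea: you can get most of the benefit by batching the hot-path writes in the process itself. A sketch (the stats table, ad ids and flush interval are hypothetical; the DBI call is left as a comment):

```perl
use strict;
use warnings;

my %clicks;                  # ad_id => count since last flush
my $last_flush  = time;
my $FLUSH_EVERY = 600;       # ten minutes, per the cron-ish suggestion

sub record_click {           # the only work done on the hot path
    my ($ad_id) = @_;
    $clicks{$ad_id}++;
    flush_clicks() if time - $last_flush >= $FLUSH_EVERY;
}

sub flush_clicks {           # one UPDATE per ad instead of several per hit
    for my $ad_id (keys %clicks) {
        # e.g. via DBI:
        # $dbh->do('UPDATE stats SET clicks = clicks + ? WHERE ad_id = ?',
        #          undef, $clicks{$ad_id}, $ad_id);
    }
    %clicks     = ();
    $last_flush = time;
}
```

Each click touches one hash entry; the multi-table work happens once per flush interval instead of once per hit. You lose at most $FLUSH_EVERY seconds of counts on a crash, which for click stats is usually an acceptable trade.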
> Prepare yourself for a sore head, once with the learning and again with
> the banging on the office wall.
More information about the london.pm mailing list