database web programming, scrap the appserver, use aolserver, stuff in ram?
| gavino 2006-10-26, 7:14 pm |
| I read this on Philip Greenspun's site. Is it true that the DB will fly
when you have enough RAM to hold the DB in memory?
http://philip.greenspun.com/wtr/app...on-servers.html
There is no scalability problem
My friend Jin and I spent some spare evenings building
http://www.scorecard.org for the Environmental Defense Fund. When a
user types in his zip code, the server shows him a map of the factories
near his house. Clicking on a factory will list the chemicals released.
Clicking on a chemical will list its health effects. The site was
featured on ABC World News, in Newsweek, in the New York Times, on CNN,
and was a Yahoo Pick of the Week. Every single page on the site is
generated on-the-fly by querying a relational database management
system (RDBMS). Some pages require five SQL queries. Each page requires
at least one. The site gets about 30 requests/second at peaks (on days
when traffic is over 500,000 hits). There are only a handful of sites
on the Internet that serve a larger number of db-backed pages.
Our hardware for this monstrously popular site? A Sun Microsystems
SPARC Ultra 2 pizza box Unix machine, built in 1996. Its dual 167-MHz
CPUs would be laughed at by the average Quake-playing 10-year-old. The
CPUs sit idle 80% of the time. The disks sit idle most of the time,
partly because I spent $4,000 on enough RAM to hold the entire 750 MB
data set. Oh yes, the machine also serves a few hundred thousand
hits/day for other customers of arsdigita.com and runs the street
cleaning and birthday reminder services that we built.
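To illustrate why holding the whole data set in RAM leaves the disks idle, here is a toy simulation. It is only a sketch: the LRU cache, the page counts, and the random access pattern are assumptions made for illustration, and it does not model how Oracle or PostgreSQL actually manage their buffers.

```python
from collections import OrderedDict
import random


def count_disk_reads(accesses, cache_pages):
    """Count cache misses (simulated disk reads) for an LRU cache
    that can hold at most cache_pages pages."""
    cache = OrderedDict()
    disk_reads = 0
    for page in accesses:
        if page in cache:
            cache.move_to_end(page)        # mark as most recently used
        else:
            disk_reads += 1                # page must come from "disk"
            cache[page] = True
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict least recently used page
    return disk_reads


if __name__ == "__main__":
    rng = random.Random(0)                 # fixed seed, hypothetical workload
    total_pages = 1_000                    # hypothetical size of the data set
    accesses = [rng.randrange(total_pages) for _ in range(50_000)]
    for cache_pages in (100, 500, 1_000):
        reads = count_disk_reads(accesses, cache_pages)
        print(f"cache of {cache_pages} pages: {reads} disk reads")
```

Once the cache is as large as the data set, each page is read from disk at most once, so the number of disk reads can never exceed the number of pages no matter how many requests arrive.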
If we tarred up the site and moved it to a mid-range Unix server such
as the HP K460 that sits behind http://www.photo.net, we could probably
serve at least 5 million hits/day. If we moved it to the highest-end HP
server, I'd bet that we could get close to the 100-million hit/day mark
that sites like Yahoo serve.
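The hit counts quoted above are easier to compare as per-second rates. The following sketch does only that arithmetic; it assumes traffic spread evenly over a 24-hour day, which real sites do not have, so the averages are a lower bound on what the server must handle at peak.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate(hits_per_day):
    """Average requests per second implied by a daily hit count."""
    return hits_per_day / SECONDS_PER_DAY


def peak_to_average(peak_rate, hits_per_day):
    """How many times higher the peak rate is than the daily average."""
    return peak_rate / average_rate(hits_per_day)


if __name__ == "__main__":
    # Daily figures taken from the quoted text.
    scenarios = [
        ("scorecard.org on a busy day", 500_000),
        ("mid-range server estimate", 5_000_000),
        ("highest-end server estimate", 100_000_000),
    ]
    for label, hits in scenarios:
        print(f"{label}: {average_rate(hits):.1f} requests/second on average")
    ratio = peak_to_average(30, 500_000)
    print(f"30 requests/second is {ratio:.1f}x the busy-day average")
```

By this arithmetic, 500,000 hits/day averages just under 6 requests/second, so the quoted 30 requests/second peak is roughly five times the daily average.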
Why has "scalable" become the buzzword du jour? People get burned
because they do stupid things. They connect their Web server to their
RDBMS via CGI, thus forcing the machine to work 10-20 times as hard for
no good reason. They run Windows NT. They run some unproven
junkware/middleware that came in an attractive box. Services get wedged
and they run out and buy another dozen (or thousand, as with
www.microsoft.com) physical computer systems. Now that they have a
whole machine room full of hardware, they know that they can't keep it
all running simultaneously so they look for software to yoke it all
together somehow such that the death of one machine won't be noticed.
How do my friends and I avoid scalability problems? We know that we're
stupid. We run the Oracle 8 RDBMS like the rest of the world and don't
try to figure out if some new competitor's hype has any relationship to
reality. We talk to the RDBMS via AOLserver, which has been doing
connection pooling from a Tcl API since 1995. So we get the safety and
software development ease of Perl/CGI but the computer never has to fork a
CGI process and the database connections are shared among the scripts.
We've served roughly 1 billion hits with AOLserver so we're pretty sure
that it works. Linux and NT get magazine writers excited, but we run
the same commercial versions of Unix on which the Fortune 500 relies
for its enterprise computing.
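The connection pooling described above can be sketched as follows. This is not AOLserver's Tcl API: it is a minimal Python illustration that uses the standard-library sqlite3 module as a stand-in for the RDBMS, and the names ConnectionPool and handle_request are hypothetical. The point it shows is that connections are opened once at startup and then shared by every request handler, instead of being opened by a freshly forked process on each hit.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Open a fixed number of database connections once and reuse them."""

    def __init__(self, database, size):
        self._idle = queue.Queue(maxsize=size)
        self.opened = 0
        for _ in range(size):
            # check_same_thread=False lets a connection be handed to
            # whichever thread borrows it next.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))
            self.opened += 1

    @contextmanager
    def connection(self):
        conn = self._idle.get()      # blocks while every connection is busy
        try:
            yield conn
        finally:
            self._idle.put(conn)     # return it to the pool, do not close it


def handle_request(pool, zip_code):
    """Stand-in for a page script: borrow a connection, run one query."""
    with pool.connection() as conn:
        row = conn.execute("SELECT ? AS zip", (zip_code,)).fetchone()
        return row[0]


if __name__ == "__main__":
    pool = ConnectionPool(":memory:", size=2)
    results = [handle_request(pool, z) for z in ("02139", "94305", "10001")]
    print(results, "served using", pool.opened, "connections")
```

However many requests are handled, the number of connections opened stays at the pool size, which is the saving the text attributes to avoiding a CGI fork and a fresh database login per request.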
There is no reliability/availability problem
The Internet is not perfectly reliable. Users can't expect to get IP
connectivity from their own internet service provider (ISP). If their
ISP is down then they can't get to your site or anyone else's so they
generally won't actually bet their life on your server being reachable.
The main reason to have high availability is ego. You don't want people
to think that you're incompetent, e.g., a friend of mine said he
wouldn't buy a ticket from the United Airlines Web site because it was
so unreliable at the front end that he figured they'd have screwed up
the back end to the point that his reservation would never actually get
made.
| |
| codeWarrior 2006-11-07, 7:13 pm |
| I fundamentally agree with your perspective on scalability / performance,
but I'd like to point out that 30 queries per second isn't really anything
astonishing or amazing...
How would you like a PostgreSQL-backed J2EE system that almost hits 1% CPU
load at 391 queries per second? That might be something to brag about,
wouldn't it?
"gavino" <bootiack@yahoo.com> wrote in message
news:1161907981.173276.293840@i42g2000cwa.googlegroups.com...
> I read this on Philip Greenspun's site. Is it true that the DB will fly
> when you have enough RAM to hold the DB in memory?