The Decathlon of Computer Science
or: “how to make zope/python at least 10 times faster”. Kudos to jukart, dobee, benji and j1m!
During the last few weeks the lovely team has been working hard on adding speed to some of our zope3-based applications. We fought problems ranging from wrong bits in the EEPROM of network interfaces to high-level conceptual/architectural problems. That’s why I had the idea to write the story of our Decathlon :)
Our sites have pretty different “load profiles”. I’ll pick two of them:
Lovely Books
It’s a lovely books community site :) Here are some stats: about 1.000 users, 82.000 books, 45.000 authors, 6.500 tags, 17.000 reviews,… in total a few million different objects in the ZODB (and groooooowing).
- Users are mainly accessing “the long tail”.
  The database is being accessed randomly.
- Every user gets a personalized view.
  As soon as the user is logged in, 99% of all pages can’t be cached.
- We have a lot of pages online.
  Google has roughly 170.000 different pages in its cache, so caching all the pages is pretty senseless.
- Asynchronous tasks might block application server threads.
  Calls to Amazon’s web services,… block server threads while they are running.
- Adding books, rating, commenting, tagging and making friends changes the relations, results and friends all the time.
  An intelligent solution for cache invalidation is needed.
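To make the cache invalidation requirement from the list above a bit more concrete, here is a minimal sketch in plain Python 3. It is not the code we run and it does not use any Zope API: the `DependencyCache` class, the cache keys and the object ids are all hypothetical names made up for illustration. The idea is simply that every cached page remembers which objects it depends on, so a change to one book only throws away the pages that show that book.

```python
from collections import defaultdict


class DependencyCache:
    """Hypothetical cache that remembers which objects each entry depends on."""

    def __init__(self):
        self._values = {}                      # cache key -> cached value
        self._keys_by_dep = defaultdict(set)   # object id -> dependent cache keys

    def set(self, key, value, depends_on):
        """Store a value and register it under every object it was built from."""
        self._values[key] = value
        for dep in depends_on:
            self._keys_by_dep[dep].add(key)

    def get(self, key, default=None):
        return self._values.get(key, default)

    def invalidate(self, dep):
        """Drop every cached entry that depends on the changed object."""
        for key in self._keys_by_dep.pop(dep, set()):
            # pop() with a default tolerates keys already dropped elsewhere.
            self._values.pop(key, None)


cache = DependencyCache()
cache.set("book-42-page", "<html>...</html>", depends_on=["book-42", "user-7"])
cache.set("user-7-friends", "<ul>...</ul>", depends_on=["user-7"])

# A new review of book 42 only invalidates entries that depend on that book.
cache.invalidate("book-42")
```

After the `invalidate` call the book page is gone, while the friends list of user 7 is still served from the cache. A real setup would also need expiry and a cache shared between application servers, which this sketch leaves out.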
Videoportal
It’s a local video portal with quite some traffic. A few months after the official launch we’re serving up to 60 MBit/s of video and have roughly half a terabyte of data online. The top 10 videos have each been viewed more than 100.000 times.
- Users are hammering top videos.
  The main traffic is static (video data) and can be cached.
- Logged-in users get personalized pages.
  Sometimes it’s just the name of the logged-in user on top of the page.
- Live stats are important.
  We need to keep track of the number of videos viewed, and we need to keep track of it on a pro rata temporis basis.
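One way to read the live stats requirement above is "views per time period". Under that assumption, a rough Python 3 sketch could count views in fixed time buckets. The `ViewStats` class, the one-hour bucket size and the video ids are hypothetical and only there for illustration, this is not how the portal actually stores its numbers.

```python
import time
from collections import Counter


class ViewStats:
    """Hypothetical in-memory view counter with fixed-size time buckets."""

    def __init__(self, bucket_seconds=3600):
        self.bucket_seconds = bucket_seconds
        self._counts = Counter()   # (video id, bucket index) -> number of views

    def _bucket(self, timestamp):
        # Map a Unix timestamp to the index of the bucket it falls into.
        return int(timestamp // self.bucket_seconds)

    def record_view(self, video_id, timestamp=None):
        if timestamp is None:
            timestamp = time.time()
        self._counts[(video_id, self._bucket(timestamp))] += 1

    def views_between(self, video_id, start, end):
        """Sum the views of every bucket that overlaps [start, end]."""
        first, last = self._bucket(start), self._bucket(end)
        return sum(self._counts[(video_id, b)] for b in range(first, last + 1))


stats = ViewStats(bucket_seconds=3600)
stats.record_view("video-1", timestamp=0)
stats.record_view("video-1", timestamp=1800)
stats.record_view("video-1", timestamp=7200)

first_hour = stats.views_between("video-1", 0, 3599)
first_three_hours = stats.views_between("video-1", 0, 7200)
```

Counting per bucket keeps the write path to a single increment, and the bucket size decides how fine-grained the "per period" numbers can be. Persistence and concurrent writers are left out on purpose.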
I think it is about time for a statement about the