Archive for March, 2007

The Decathlon of Computer Science

or: “how to make zope/python at least 10 times faster”. Kudos to jukart, dobee, benji and j1m!

During the last weeks the lovely team worked hard on adding speed to some of our Zope 3 based applications. We fought problems ranging from wrong bits in the EEPROM of network interfaces up to high-level conceptual and architectural problems. That’s why I had the idea to write the story of our Decathlon :)
Our sites have pretty different “load profiles”. I’ll pick two of them:

Lovely Books
It’s a lovely books community site :) Here are some stats: about 1,000 users, 82,000 books, 45,000 authors, 6,500 tags, 17,000 reviews, … in total a few million different objects in the ZODB (and groooooowing).

  • Users are mainly accessing “the long tail”.
    The database is being accessed randomly.
  • Every user gets a personalized view.
    As soon as the user is logged in, 99% of all pages can’t be cached.
  • We have a lot of pages online.
    Google has roughly 170,000 different pages in its cache, so caching all the pages is pretty senseless.
  • Asynchronous Tasks might block application server threads.
    Calls to Amazon’s web services, … block server threads while they are running.
  • Adding books, rating, commenting, tagging and making friends changes the relations, results and friends all the time.
    An intelligent solution for cache invalidation is needed.
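
One possible shape for such an invalidation scheme is a minimal sketch in which every cached result records the objects it was built from, so that a change only drops the entries that actually used the changed object. The class name, the keys and the dependency ids below are made up for illustration; this is not the solution the team ended up with.

```python
from collections import defaultdict


class DependencyCache:
    """Cache results and drop them when an object they were built from changes."""

    def __init__(self):
        self._values = {}                     # cache key -> cached value
        self._keys_by_dep = defaultdict(set)  # dependency id -> cache keys

    def put(self, key, value, depends_on):
        self._values[key] = value
        for dep in depends_on:
            self._keys_by_dep[dep].add(key)

    def get(self, key):
        return self._values.get(key)

    def invalidate(self, dep):
        """Drop every cached entry that was registered as depending on `dep`."""
        for key in self._keys_by_dep.pop(dep, set()):
            # The key may already be gone because another dependency changed.
            self._values.pop(key, None)


cache = DependencyCache()
cache.put("book-page:42", "<html>...</html>", depends_on=["book:42", "user:7"])
cache.put("tag-cloud", "<ul>...</ul>", depends_on=["book:42", "book:43"])
cache.put("author-page:3", "<html>...</html>", depends_on=["author:3"])

# A new rating on book 42 removes only the entries that used book 42.
cache.invalidate("book:42")
```

After the call, the book page and the tag cloud are gone while the author page is still cached. The sketch errs on the safe side: an entry that is stored again with different dependencies may still be dropped by one of its old ones.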

Videoportal
It is a local video portal with quite some traffic. A few months after the official launch we’re serving up to 60 Mbit/s of video and have roughly half a terabyte of data online. The top 10 videos have been viewed more than 100,000 times.

  • Users are hammering top videos.
    The main traffic is static (video data) - and can be cached.
  • Logged in users get personalized pages.
    Sometimes it’s just the name of the logged in user on top of the page.
  • Live stats are important.
    We need to keep track of the number of videos viewed, and we need to keep track of it on a pro rata temporis basis.
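
A minimal sketch of what time-based view statistics could look like, assuming that “pro rata temporis” here means views are attributed to the period in which they happened. The hourly bucket size, the class and the method names are assumptions for illustration and are not taken from the portal’s code.

```python
from collections import Counter
from datetime import datetime


class ViewStats:
    """Count video views per hour so totals and per-period numbers are both available."""

    def __init__(self):
        self._views = Counter()  # (video id, start of hour) -> number of views

    def record(self, video_id, when):
        bucket = when.replace(minute=0, second=0, microsecond=0)
        self._views[(video_id, bucket)] += 1

    def total(self, video_id):
        return sum(n for (vid, _), n in self._views.items() if vid == video_id)

    def between(self, video_id, start, end):
        """Views in hourly buckets whose start lies in [start, end)."""
        return sum(n for (vid, bucket), n in self._views.items()
                   if vid == video_id and start <= bucket < end)


stats = ViewStats()
stats.record("v1", datetime(2007, 3, 1, 10, 5))
stats.record("v1", datetime(2007, 3, 1, 10, 50))
stats.record("v1", datetime(2007, 3, 1, 11, 1))
stats.record("v2", datetime(2007, 3, 1, 10, 30))

views_total = stats.total("v1")
views_10_to_11 = stats.between("v1", datetime(2007, 3, 1, 10), datetime(2007, 3, 1, 11))
```

Keeping one counter per bucket instead of one per view keeps the amount of stored data small even when the top videos are hammered.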

continue reading the full article

simple ZODB performance settings

The use of the database can be optimized by using a bigger cache in the client.

For a single zope instance without a zeo server this can be done by increasing the number of cached objects.

The cache-size parameter in etc/local.conf must be set (the default is 5000):

<zodb>
  cache-size 50000
  <filestorage>
    path $DATADIR/Data.fs
  </filestorage>
</zodb>

The default size of 5000 is no longer useful for a database with 200,000 objects.
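
To get a feeling for why 5000 is too small, here is a rough simulation. It assumes a plain LRU cache and uniformly random access to 200,000 objects; the real ZODB object cache is more subtle, so read the numbers as an illustration and not as a measurement.

```python
import random
from collections import OrderedDict


def lru_hit_ratio(num_objects, cache_size, warmup, accesses, seed=0):
    """Hit ratio of an LRU cache under uniformly random object access."""
    rng = random.Random(seed)
    cache = OrderedDict()  # object id -> None, least recently used first
    hits = 0
    for i in range(warmup + accesses):
        oid = rng.randrange(num_objects)
        if oid in cache:
            cache.move_to_end(oid)  # mark as most recently used
            if i >= warmup:         # only count hits once the cache is warm
                hits += 1
        else:
            cache[oid] = None
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / accesses


small = lru_hit_ratio(200_000, 5_000, warmup=100_000, accesses=100_000)
large = lru_hit_ratio(200_000, 50_000, warmup=100_000, accesses=100_000)
print(f"cache-size  5000: {small:.1%} hits")
print(f"cache-size 50000: {large:.1%} hits")
```

Under these assumptions the hit ratio ends up close to cache size divided by number of objects, so the default serves only a few percent of the requests from memory while the bigger cache serves about a quarter.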

For a ZEO client it is possible to configure the size of the client cache:

<zodb>
  <zeoclient>
    server localhost:8101
    storage 1
    # ZEO client cache, in bytes
    cache-size 300MB
    # Uncomment to have a persistent disk cache
    #client zeo1
  </zeoclient>
</zodb>

The size should be set to a high value, because otherwise a lot of network traffic is created between the ZEO server and the client. This is also true if the client and the ZEO server are on the same machine and use localhost for the connection.
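
The comment in the configuration says “in bytes” while the value is written as 300MB. The configuration parser accepts size suffixes and converts them to bytes. The helper below is a small stand-alone sketch of that conversion, assuming 1024-based KB, MB and GB suffixes; parse_byte_size is a made-up name and not part of ZConfig or ZEO.

```python
# Multipliers for the size suffixes, assumed to be 1024-based.
_SUFFIXES = {"kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}


def parse_byte_size(text):
    """Convert a value such as '300MB' to a number of bytes."""
    value = text.strip().lower()
    for suffix, factor in _SUFFIXES.items():
        if value.endswith(suffix):
            return int(value[:-len(suffix)]) * factor
    return int(value)  # no suffix: the value is already in bytes


cache_bytes = parse_byte_size("300MB")
print(cache_bytes)
```

A value without a suffix, for example 1024, is taken as plain bytes.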

Lovely Books

I think it is about time for a statement about the Lovelybooks project. Lovely Books is a social networking site based on books which people read and other people write. The site is currently in very early beta and released in German only. The idea behind it is to find new books and people on the basis of books you have already read. We are working on this project with a team of German publishers. They are really interested in moving things forward and in not making the mistake the music industry has made. When co-founding Last.fm I got some insight into the music sector when times were not that easy, and I want to point out the differences and things in common between books and music on the web.

continue reading the full article