If nothing else, Twitter sure generates a lot of user data. As for the accuracy of that data, there's no way to tell what's what. When the data comes from Twitter directly, however, do you trust it more or less?
As reported in the Guardian, Evan Weaver, the lead engineer on Twitter's services team, gave the audience at QCon 2009 a little insight into the inner workings of the service, including tidbits like:
- The average Twitter user has 126 followers (no mention as to how many are spam)
- Only 20% of Twitter traffic comes through the website. Third-party software on smartphones and computers contributes the rest.
- While the Iranian election has seen some traffic spikes, it pales in comparison to the Obama inauguration, which hit about 300 tweets per second (and that was at a time when there were considerably fewer users and the real hype around the service had yet to reach its peak)
While these aren't earth-shattering findings, it is interesting that Weaver went on to talk about the architecture of Twitter and the myriad tools used to make everything happen in the Twitterverse.
Most of the tools used by Twitter are open source. The stack is made up of Rails for the front end, C, Scala, and Java for the middle business layer, and MySQL for storing data. Everything is kept in RAM, and the database is just a backup. The Rails front end handles rendering, cache composition, DB querying, and synchronous inserts. This front end mostly glues together several client services, many written in C: a MySQL client, a Memcached client, a JSON one, and others.
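To make that "everything is kept in RAM, and the database is just a backup" idea concrete, here is a minimal cache-aside sketch in Ruby. This is not Twitter's actual code: the `UserStore` class is hypothetical, and a plain Hash stands in for the Memcached and MySQL clients Weaver mentions.

```ruby
# Hypothetical illustration of the cache-aside pattern:
# serve reads from RAM, and only fall back to the database on a miss.
class UserStore
  attr_reader :db_reads

  def initialize
    @cache    = {}                            # stands in for Memcached
    @db       = { 1 => "evan", 2 => "jack" }  # stands in for MySQL
    @db_reads = 0                             # counts how often we touch the "DB"
  end

  # Read-through: a cache hit never touches the database;
  # a miss reads the DB and populates the cache for next time.
  def find(id)
    @cache[id] ||= begin
      @db_reads += 1
      @db[id]
    end
  end

  # Synchronous insert: write the backing store first, then the cache,
  # so the RAM copy never holds data the database doesn't.
  def create(id, name)
    @db[id] = name
    @cache[id] = name
  end
end

store = UserStore.new
store.find(1)  # miss: reads the DB and caches the result
store.find(1)  # hit: served entirely from RAM
```

The payoff of this arrangement is that repeated reads of hot data (a popular user's timeline, say) cost nothing at the database layer, which is how a Rails front end can survive traffic spikes like an inauguration.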
With little going on at the start of the Fourth of July week you may want to check just what’s under the hood of Twitter. Just reading that last paragraph alone, however, gave me a headache. I think I’ll go tweet about it.