Numbers Everyone Should Know

I was looking over a presentation titled "Designs, Lessons and Advice from Building Large Distributed Systems" by Jeff Dean of Google when I came across a really useful slide. It's the 24th slide in the presentation and is titled "Numbers Everyone Should Know". It lists the latency of some common processor and network operations:

L1 cache reference..............................0.5ns
Branch mispredict.................................5ns
L2 cache reference................................7ns
Mutex lock/unlock................................25ns
Memory reference................................100ns
Compress 1K bytes with Zippy..................3,000ns
Send 2K bytes over 1Gbps network.............20,000ns
Read 1MB sequentially from memory...........250,000ns
Round trip within datacenter................500,000ns
Disk seek................................10,000,000ns
Read 1MB sequentially from disk..........20,000,000ns
Send packet CA->Netherlands->CA.........150,000,000ns

Numbers like these are really useful for ballpark estimates. I think the biggest surprise to me was the huge disparity between a datacenter round trip and a disk seek. We all know how slow the disk is, but the fact that you can make 20 round trips within a datacenter in the time it takes to do a single disk seek (not even reading any data!) was pretty interesting. It reminds me a lot of Jim Gray's famous storage latency picture, which shows that if reading a register is like fetching data from your own brain, then going to disk is the equivalent of fetching data from Pluto.
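
To make the ballpark arithmetic concrete, here is a minimal Python sketch that puts the numbers from the slide into a dictionary and does two quick calculations. The names `LATENCY_NS`, `ratio` and `estimate_ns` are my own, not from the presentation, and the "30 files" workload at the end is a made-up example that assumes every operation happens one after another with no caching or parallelism.

```python
# Latencies in nanoseconds, copied from the slide above.
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "Branch mispredict": 5,
    "L2 cache reference": 7,
    "Mutex lock/unlock": 25,
    "Memory reference": 100,
    "Compress 1K bytes with Zippy": 3_000,
    "Send 2K bytes over 1Gbps network": 20_000,
    "Read 1MB sequentially from memory": 250_000,
    "Round trip within datacenter": 500_000,
    "Disk seek": 10_000_000,
    "Read 1MB sequentially from disk": 20_000_000,
    "Send packet CA->Netherlands->CA": 150_000_000,
}


def ratio(slow, fast):
    """How many times the 'fast' operation fits into one 'slow' operation."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]


def estimate_ns(ops):
    """Total time in ns for a dict of {operation name: count}, run serially."""
    return sum(LATENCY_NS[name] * count for name, count in ops.items())


# The comparison from the paragraph above: datacenter round trips per disk seek.
print(ratio("Disk seek", "Round trip within datacenter"))  # 20.0

# Hypothetical workload: read 30 separate 1MB files from disk, one seek each.
total_ns = estimate_ns({
    "Disk seek": 30,
    "Read 1MB sequentially from disk": 30,
})
print(total_ns / 1e6, "ms")  # 900.0 ms
```

The result is only as good as the serial, no-cache assumption behind it, but that is the point of numbers like these: a rough answer in a few seconds before you build anything.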
