A minor correction; I just want to share the information I have. As of July 2001, Google ran 10,000 PCs spread over 4 co-location sites, two on the East Coast and two on the West Coast. The PCs are an el-cheapo design (in the Google sysadmin's words) and the failure rate is 52 PCs a day. They don't fix them, they just replace and rebuild.
This comes from my notes from the BayLISA meeting of July 19, 2001, where Frank Cusack, a system administrator at Google, gave a presentation about Network and Machine Architecture at Google.
What kind of failures do they have? 52 PCs every day looks like A LOT! That is a 0.52% chance for a PC to fail on any given day! If the chance stays the same over time, in 100 days they would replace a number of machines equal to 52% of all PCs (or, put per machine, each PC has roughly a 41% chance of breaking at least once, since 1 - 0.9948^100 ≈ 0.41). OK, I am assuming that the chance is always the same, which is obviously not true, but it still looks like they have a lot of failures. Does anyone have further information on this? Praise
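
The back-of-the-envelope arithmetic above can be checked with a short Python sketch. It takes the two figures quoted in the thread (10,000 PCs, 52 failures a day) at face value and assumes, as the poster does, a constant and independent daily failure chance per machine, with replacements failing at the same rate. The function names are made up for illustration; none of this comes from the presentation itself.

```python
# Figures quoted in the thread (July 2001), taken at face value.
FLEET_SIZE = 10_000
FAILURES_PER_DAY = 52


def daily_failure_rate(failures_per_day: float, fleet_size: int) -> float:
    """Chance that a given PC fails on a given day."""
    return failures_per_day / fleet_size


def expected_replacements(rate: float, fleet_size: int, days: int) -> float:
    """Expected number of replacements over `days`, assuming a constant rate
    and that replaced machines fail at the same rate as the originals."""
    return rate * fleet_size * days


def prob_fails_at_least_once(rate: float, days: int) -> float:
    """Chance that one particular PC fails at least once within `days`,
    assuming each day is an independent trial with the same rate."""
    return 1.0 - (1.0 - rate) ** days


if __name__ == "__main__":
    rate = daily_failure_rate(FAILURES_PER_DAY, FLEET_SIZE)
    print(f"daily failure rate per PC: {rate:.2%}")
    print(f"expected replacements in 100 days: "
          f"{expected_replacements(rate, FLEET_SIZE, 100):.0f}")
    print(f"chance a given PC fails within 100 days: "
          f"{prob_fails_at_least_once(rate, 100):.1%}")
```

Under these assumptions, the number of replacements in 100 days equals 52% of the fleet, but the chance that any particular PC breaks in that window is lower (around 41%), because some machines fail more than once while others never fail.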