The secret at the heart of Google

"You make the software tolerate failures. If you can expect failures, then this is what makes cheap commodity PCs viable for Internet services," Hoelzle said.

Google's PC servers, which number in the thousands, run a stripped-down version of Linux. It is based on the Red Hat distribution but is really just the operating system kernel, modified for Google, he added.

The company has also devised a system for handling massive amounts of data and returning rapid responses to queries. Google splits the Web into millions of pieces, or "shards" in Google tech speak, which are replicated in case of failure.
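The article does not say how Google assigns pages to shards or picks a replica, so the following is only a minimal sketch of the general idea, assuming a simple hash-based placement. The names `shard_for`, `read_with_failover`, `locate`, and the server names are hypothetical and chosen for illustration.

```python
import hashlib

NUM_SHARDS = 4          # assumed value; the article says "millions of pieces"
REPLICAS_PER_SHARD = 3  # assumed value; the article gives no replica count

# Each shard is held by several machines (made-up server names).
replica_map = {
    shard: [f"server-{shard}-{r}" for r in range(REPLICAS_PER_SHARD)]
    for shard in range(NUM_SHARDS)
}

def shard_for(doc_id, num_shards=NUM_SHARDS):
    """Map a document ID to a shard using a stable hash (an assumed scheme)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

def read_with_failover(replicas, down):
    """Return the first replica that is not marked as failed."""
    for name in replicas:
        if name not in down:
            return name
    raise RuntimeError("all replicas of this shard are down")

def locate(doc_id, down=frozenset()):
    """Find the shard for a document and a live server holding that shard."""
    shard = shard_for(doc_id)
    return shard, read_with_failover(replica_map[shard], down)
```

With every server healthy, `locate` returns the first replica of the shard. If that server is listed in `down`, the same call falls through to the next copy, which is the sense in which replication lets the software tolerate a failed machine.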

Not surprisingly, the company creates an index of words that appear on the Web, which it stores as an array of large files. But it also has document servers, which hold copies of Web pages that Google crawls and downloads.
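An index of words of this kind is commonly called an inverted index. As a rough sketch only, and not a description of Google's file format, the code below builds one from a small in-memory dictionary that stands in for the document servers. The function names and the sample pages are invented for illustration.

```python
from collections import defaultdict

def build_index(pages):
    """Map each word to the sorted list of page IDs that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return {word: sorted(ids) for word, ids in index.items()}

def search(index, query):
    """Return the IDs of pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    result = set(index.get(words[0], []))
    for word in words[1:]:
        result &= set(index.get(word, []))
    return sorted(result)

# Stand-in for the document servers: page ID -> stored copy of the page.
pages = {
    1: "cheap commodity servers",
    2: "commodity hardware fails",
    3: "servers tolerate failures",
}
index = build_index(pages)
```

A query consults only the index to find matching page IDs. The stored copies in `pages` would be needed afterwards, for example to show the matching text.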

Another important engineering feat, according to Hoelzle, is that Google has made it very straightforward to write programs that run across thousands of servers. Normally, building applications to run in a "parallel" configuration of servers requires specialised tools and skills.

Google's programming tool, MapReduce, automates the task of recovering a program after a failure and is critical to keeping the company's costs down.
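To give a feel for why this reduces programming effort, here is a single-process sketch of the map and reduce pattern with automatic re-execution of a failed task. It is an illustration under stated assumptions, not Google's MapReduce interface: the names `map_reduce`, `retrying`, `count_words`, and `total` are hypothetical, and a raised `RuntimeError` stands in for a machine failure.

```python
from collections import defaultdict

def retrying(max_attempts=3):
    """Build a task runner that re-executes a task when it raises."""
    def run_task(func, *args):
        for attempt in range(max_attempts):
            try:
                return func(*args)
            except RuntimeError:
                # Simulated machine failure: try again unless out of attempts.
                if attempt == max_attempts - 1:
                    raise
    return run_task

def map_reduce(inputs, mapper, reducer, run_task):
    """Run the map phase, group values by key, then run the reduce phase."""
    groups = defaultdict(list)
    for item in inputs:
        for key, value in run_task(mapper, item):
            groups[key].append(value)
    return {key: run_task(reducer, key, values) for key, values in groups.items()}

def count_words(text):
    """Mapper: emit a (word, 1) pair for every word in the text."""
    return [(word, 1) for word in text.split()]

def total(word, counts):
    """Reducer: add up the counts collected for one word."""
    return sum(counts)

word_counts = map_reduce(["a b a", "b c"], count_words, total, retrying())
```

The programmer writes only `count_words` and `total`. Retrying after a failure lives in the runner, so the application code contains no recovery logic of its own.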

"Cost is really the sum of what the equipment you need to do the work costs and how much programming time you need to put into getting something useful," Hoelzle said, adding that Google has started using MapReduce more widely over the past year.

Finally, Google has created "batch" job scheduling software that acts as a sort of taskmaster for millions of operations. Called the Global Work Queue, it breaks up computing jobs into many smaller tasks and distributes them across machines.
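Hoelzle gave no details of how the Global Work Queue divides or places work. The sketch below only illustrates the general idea of splitting a job into smaller tasks and handing them out to machines, assuming fixed-size tasks and round-robin placement. The function names and machine names are hypothetical.

```python
from collections import deque

def split_job(items, task_size):
    """Break one large job into tasks of at most task_size items each."""
    return [items[i:i + task_size] for i in range(0, len(items), task_size)]

def schedule(tasks, machines):
    """Hand tasks out to machines in round-robin order (an assumed policy)."""
    queue = deque(tasks)
    assignment = {machine: [] for machine in machines}
    turn = 0
    while queue:
        machine = machines[turn % len(machines)]
        assignment[machine].append(queue.popleft())
        turn += 1
    return assignment

tasks = split_job(list(range(10)), 3)
plan = schedule(tasks, ["m1", "m2"])
```

A production scheduler would also have to account for machine load and failures. Round-robin is used here only because it is the simplest policy to show.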

For all its built-in redundancy in case of failure, the system doesn't address all problems, Hoelzle revealed. During the presentation, he showed a photo of six fire trucks responding to an emergency at a Google data centre in an undisclosed location. He would not reveal any specific details on the mishap except to say that "it wasn't about one machine going down."

In a follow-up interview, Hoelzle said the cost of power is another important factor in Google's data centre designs.

"The physical cost of operations, excluding people, is directly proportional to power costs," he said. "(Power) becomes a factor in running cheaper operations in a data centre. It's not just buying cheaper components but you also have to have an operating expense that makes sense."
