Page III: Google's vice-president of engineering was in London this week to talk to potential recruits about just what lies behind that search page.
Google has two crucial factors in its favour. First, the whole problem is what Hölzle refers to as embarrassingly parallel, which means that if you double the amount of hardware, you can double performance (or capacity if you prefer -- the important point is that there are no diminishing returns as there would be with less parallel problems).
The second factor in Google's favour is the falling cost of hardware. If the index size doubles, then the embarrassingly parallel nature of the problem means that Google could double the number of machines and get the same response time so it can grow linearly with traffic. "In reality (from a business point of view) we would like to grow less than linear to keep costs down," said Hölzle, "but luckily the hardware keeps getting cheaper."
So every year as the Web gets bigger and requires more hardware to index, search and return Web pages, hardware gets cheaper so it "more or less evens out" to use Hölzle's words.
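To make the "embarrassingly parallel" point concrete, here is a minimal sketch, not Google's actual design: the index is split into independent shards, each shard is scanned without reference to any other, and the partial results are merged. The function names and the round-robin split are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def build_shards(documents, num_shards):
    """Split documents round-robin into independent shards (hypothetical layout)."""
    shards = [[] for _ in range(num_shards)]
    for i, doc in enumerate(documents):
        shards[i % num_shards].append(doc)
    return shards


def search_shard(shard, term):
    """Scan one shard; it needs no data from any other shard."""
    return [doc for doc in shard if term in doc.split()]


def search(shards, term):
    """Query every shard in parallel and merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(search_shard, shards, [term] * len(shards))
        return sorted(hit for part in partials for hit in part)


docs = ["google file system", "cheap ide drives", "file checksums", "power density"]
shards = build_shards(docs, 2)
hits = search(shards, "file")

# Doubling the data and the shard count leaves the work per shard unchanged.
bigger = build_shards(docs * 2, 4)
```

Because no shard depends on another, the largest shard in `bigger` is the same size as the largest shard in `shards`, which is the property that lets response time stay flat as hardware is added.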
As the scale of the operation increases, it introduces some particular problems that would not be an issue on smaller systems. For instance, Google uses IDE drives for all its storage. They are fast and cheap, but not highly reliable. To help deal with this, Google developed its own file system -- called the Google File System, or GFS -- which assumes an individual unit of storage can go away at any time either because of a crash, a lost disk or just because someone stepped on a cable.
The power of three
There are no disk arrays within individual PCs; instead Google stores every bit of data in triplicate on three machines on three racks on three data switches to make sure there is no single point of failure between you and the data. "We use this for hundreds of terabytes of data," said Hölzle.
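The placement rule described above can be sketched as follows. This is an illustrative greedy selection, not GFS's real placement algorithm; the `Machine` record and the `place_replicas` helper are hypothetical names, and a greedy pass can miss a valid placement that a more careful search would find.

```python
from collections import namedtuple

# Hypothetical description of a storage node: its name, rack and data switch.
Machine = namedtuple("Machine", "name rack switch")


def place_replicas(machines, copies=3):
    """Pick machines so that no two chosen copies share a rack or a switch."""
    chosen, racks, switches = [], set(), set()
    for m in machines:
        if m.rack not in racks and m.switch not in switches:
            chosen.append(m)
            racks.add(m.rack)
            switches.add(m.switch)
            if len(chosen) == copies:
                return chosen
    raise ValueError("not enough independent racks/switches for the copies")


cluster = [
    Machine("pc1", "r1", "s1"),
    Machine("pc2", "r1", "s1"),  # same rack and switch as pc1, so skipped
    Machine("pc3", "r2", "s2"),
    Machine("pc4", "r3", "s2"),  # shares a switch with pc3, so skipped
    Machine("pc5", "r3", "s3"),
]
replicas = place_replicas(cluster)
```

With this input the three copies land on `pc1`, `pc3` and `pc5`, so losing any single machine, rack or switch still leaves two copies reachable.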
Don't expect to see GFS on a desktop near you any time soon -- it is not a general-purpose file system. For instance, a GFS block size is 64MB, compared with the more usual 2KB on a desktop file system. Hölzle said Google has more than 30 clusters running GFS, some as large as 2,000 machines with petabytes of storage. These large clusters can sustain read/write speeds of 2Gbps -- a feat made possible because each PC manages 2Mbps.
Once, said Hölzle, "someone disconnected an 80-machine rack from a GFS cluster, and the computation slowed down as the system began to re-replicate and we lost some bandwidth, but it continued to work. This is really important if you have 2,000 machines in a cluster." If you have 2,000 machines, you can expect to see two failures a day.
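The failure figures above imply a simple piece of arithmetic, sketched here on the assumption that "two failures a day" is a steady average spread evenly across the cluster. The helper name is hypothetical.

```python
def days_between_failures(machines, failures_per_day):
    """Average days between failures for one machine, assuming an even spread."""
    return machines / failures_per_day


per_machine_days = days_between_failures(2000, 2)  # 1000.0 days
per_machine_years = per_machine_days / 365         # roughly 2.7 years
rack_share = 80 / 2000                             # an 80-machine rack is 4% of the cluster
```

In other words, each PC can be fairly reliable on its own and the cluster as a whole will still see failures every day, which is why the software has to treat them as routine.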
Running thousands of cheap servers with relatively high failure rates is not an easy job. Standard tools don't work at this scale, so Google has had to develop them in-house. Some of the other challenges the company continues to face include:
Debugging: "You see things on the real site you never saw in testing because of some special set of circumstances that create a bug," said Hölzle. "This can create non-trivial but fun problems to work on."
Data errors: A regular IDE hard disk will have an error rate in the order of 10^-15 -- that is, one millionth of one billionth of the data written to it may get corrupted without the hard disk's own error checking picking it up. "But when you have a petabyte of data you need to start worrying about these failures," said Hölzle. "You must expect that you will have undetected bit errors on your disk several times a month, even with hardware checking built-in, so GFS does have an extra level of checksumming. Again this is something we didn’t expect, but things happen."
Spelling: Google wrote its own spell checker, and maintains that nobody knows as many spelling errors as it does. The amount of computing power available at the company means it can afford to begin teaching the system which words are related -- for instance "Imperial", "College" and "London". It's a job that takes many CPU years, and which would not have been possible without these thousands of machines. "When you have tons of data and tons of computation you can make things work that don’t work on smaller systems," said Hölzle. One goal of the company now is to develop a better conceptual understanding of text, to get from the text string to a concept.
Power density: "There is an interesting problem when you use PCs," said Hölzle. "If you go to a commercial data centre and look at what they can support, you'll see a typical design allowing for 50W to 100W per square foot. At 200W per square foot you notice the sales person still wants to sell it but their international tech guy starts sweating. At 300W per square foot they cry out in pain."
Eighty mid-range PCs in a rack, of which you will find many dozens in a Google data centre, produce over 500W per square foot. "So we're not going to blade technology," said Hölzle. "We're already too dense. Finally Intel has realised this is a problem and is now focusing more on power efficiency, but it took some time to get the message across."
Quality of search results: One big area of complaints for Google is connected to the growing prominence of commercial search results -- in particular price comparison engines and e-commerce sites. Hölzle is quick to defend Google's performance "on every metric", but admits there is a problem with the Web getting, as he puts it, "more commercial". Even three years ago, he said, the Web had much more of a grass roots feeling to it. "We have thought of having a button saying 'give me less commercial results'," he said, but the company has so far shied away from implementing this.
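Returning to the data-errors item in the list above, the following sketch shows why a petabyte makes a tiny error rate matter, and what an extra level of checksumming looks like in principle. It assumes the quoted rate applies per bit written, and it uses a CRC-32 from Python's standard `zlib` module purely as a stand-in; it does not reproduce the checksum scheme GFS actually uses.

```python
import zlib


def expected_bit_errors(bytes_written, bit_error_rate=1e-15):
    """Expected undetected bit errors, assuming the rate is per bit written."""
    return bytes_written * 8 * bit_error_rate


def store(block: bytes):
    """Return the block together with a checksum computed at write time."""
    return block, zlib.crc32(block)


def verify(block: bytes, checksum: int) -> bool:
    """Recompute the checksum at read time and compare with the stored one."""
    return zlib.crc32(block) == checksum


petabyte = 10**15
errors_per_petabyte = expected_bit_errors(petabyte)  # about 8

data, crc = store(b"hello gfs")
corrupted = bytes([data[0] ^ 0x01]) + data[1:]  # flip a single bit
ok_clean = verify(data, crc)
ok_corrupted = verify(corrupted, crc)
```

Under these assumptions a petabyte written yields about eight silently corrupted bits, and the software checksum catches the flipped bit that the disk's own checking missed: `ok_clean` is true and `ok_corrupted` is false.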
ZDNet UK's Matt Loney reported from London.








Can't resist being a little PC and finding the parallel between Klingon and Tagalog a bit weird. The latter is a real language spoken by tens of millions of people. Its name sure sounds funny, but is that enough? Better to mention Google's other funny options like "Bork, bork, bork" and "Elmer Fudd".