Posted June 4, 2007 10:35 am with 7 comments

Ok, so the New York Times doesn’t exactly get Google’s top algorithm execs to tell us how the search engine calculates search results, but they do get fresh insight into how Google decides to update its technology.

The article includes interviews with Amit Singhal, Matt Cutts and Udi Manber.

Insights include details of Google’s internal system for evaluating search queries, called Debug.

At other times, complaints highlight more complex problems. In 2005, Bill Brougher, a Google product manager, complained that typing the phrase “teak patio Palo Alto” didn’t return a local store called the Teak Patio.

So Mr. Singhal fired up one of Google’s prized and closely guarded internal programs, called Debug, which shows how its computers evaluate each query and each Web page. He discovered that [the store’s site] did not show up because Google’s formulas were not giving enough importance to links from other sites about Palo Alto.

Boy, wouldn’t you like to get your hands on that little beauty?

Also, for the longest time, we’ve known that Google uses more than 100 variables in its algorithm. Since search has become more complex, that number has now been updated.

Mr. Singhal has developed a far more elaborate system for ranking pages, which involves more than 200 types of information, or what Google calls “signals.” PageRank is but one signal. Some signals are on Web pages — like words, links, images and so on. Some are drawn from the history of how pages have changed over time. Some signals are data patterns uncovered in the trillions of searches that Google has handled over the years.
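
To make the idea of combining many signals a little more concrete, here is a toy sketch in Python. It is not Google’s formula, which the article does not disclose. The signal names, the weights, the weighted-sum scoring and the default cutoff are all assumptions made up purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical signal weights; the real signals and weights are not public.
WEIGHTS: Dict[str, float] = {
    "pagerank": 0.5,
    "term_match": 0.3,
    "freshness": 0.2,
}


def score(signals: Dict[str, float]) -> float:
    """Weighted sum of signal values; a missing signal counts as 0."""
    return sum(w * signals.get(name, 0.0) for name, w in WEIGHTS.items())


def rank(pages: Dict[str, Dict[str, float]], top_n: int = 10) -> List[Tuple[str, float]]:
    """Return the top_n (url, score) pairs, highest score first."""
    scored = [(url, score(sig)) for url, sig in pages.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Made-up pages with made-up signal values between 0 and 1.
pages = {
    "a.example": {"pagerank": 0.9, "term_match": 0.2, "freshness": 0.1},
    "b.example": {"pagerank": 0.4, "term_match": 0.9, "freshness": 0.8},
    "c.example": {"pagerank": 0.1, "term_match": 0.5},
}
results = rank(pages)
```

With these made-up numbers, b.example comes out ahead of a.example despite its lower PageRank value, which fits the point that PageRank is but one signal.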

What caught your attention?

  • This is actually my first time at your site. Great work and excellent content. And yes, I would love to get my hands on Debug. Who wouldn’t?

  • Heh I was about to note my favorite paragraph and then realized you quoted the exact same one.

    Most of the article IMO wasn’t too enlightening except the very last paragraph Andy quoted.

    I wanted to hear which factors have lately been given the most importance with regard to ranking, but of course they wouldn’t tell us that 🙂

  • I thought the concept of QDF (query deserves freshness) was interesting and it’s good to see Google understand that content doesn’t have to be old to be relevant.

    I think much in the article we either knew or at least thought we knew, but it’s always nice to have confirmation. The walk through what happens between a query and the results being presented was worth the read.

    “The sites with the 10 highest scores win the coveted spots on the first search page, unless a final check shows that there is not enough “diversity” in the results.”

    That quote made me wonder what’s meant by ‘diversity.’ Are we talking about duplicate content? More likely it’s what the article says in the following:

    “If someone types a product, for example, maybe you want a blog review of it, a manufacturer’s page, a place to buy it or a comparison shopping site.”

    But if Google can really understand the intent of a query, would they need to diversify the results?

  • 2 or 3 cats out of the bag. 197 to go! 😛

  • Is this something new? Tell us the exact algorithm :))

  • “That quote made me wonder what’s meant by ‘diversity.’ Are we talking about duplicate content? More likely it’s what the article says in the following”

    I am guessing diversity is meant as a check against scrapers and possibly cloned websites from the same owner, i.e. a mini-network?