Posted June 11, 2007 1:09 pm by with 6 comments


Having worked closely with latent semantic indexing during my time at FI, I’ve become a big advocate of making sure you have structured themes in your content and that you include a supporting cast of semantically connected keywords.

In this clip from SMX Advanced, Matt Cutts shares how Google is continually testing the use of LSI and keyword themes.

  • It is not so important that you structure your documents very rigidly, as a machine would. It is more important that the search engine has a linguistic database that it can refer to. A linguistic database also does a much better job with cross-language searches.

    Cross-language Search: What’s it all about?

    The term “cross-language search” is used in many different senses:

    1. Some search engine providers claim to support multilingual or cross-language search if they can handle and index documents written in different languages. They search for exact occurrences of the entered search terms; e.g., “war” finds English documents referring to military actions, and it also finds German documents containing “war” in the sense of “was” (i.e., a meaningless glue word).

    2. Other search engines provide a tool for translating a query into a selectable other language; the query is then submitted with the translated text. This is certainly progress and can be useful in some specific situations, e.g. if one is looking for a hairdresser in Paris. It has clear limits, however:

    – If one is looking for “member of the board” and “SAir Group” (Swissair) and searches for German documents, the translated query “Mitglied des Brettes” and “SAir Gruppe” won’t return any results. If “member of the board” is replaced by “Aufsichtsrat”, some documents are found, but they do not correspond to the terms “Verwaltungsrat” or “Verwaltungsräte” commonly used in connection with the SAir case.
    – For information research and intelligence services, the above-mentioned method does not help because it cannot compare and rank documents written in different languages.

    3. A true cross-language search is possible only if the search engine is able to recognize the thematic content, i.e., if the system realizes that the English translation of a French (or German, etc.) document is equivalent to the original document. This advanced technique has been implemented; it simultaneously finds documents in all supported languages, without the need for a cumbersome (and arbitrary) translation into each other language. Because of the cross-language content recognition and a well-founded similarity measure, the documents can be ordered by their relevance with respect to the query.
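
    The difference between sense 1 (exact term matching) and sense 3 (thematic matching) can be sketched in a few lines of Python. This is only a toy illustration: the three documents, the `CONCEPTS` table, and the function names are all made up for this example, and the hand-written lookup table merely stands in for the kind of linguistic database mentioned above. It does not describe how any real search engine works.

```python
import math
from collections import Counter

# Hypothetical toy corpus: doc id -> (language, text).
DOCS = {
    "en_military": ("en", "the war ended after the army signed a peace treaty"),
    "de_past": ("de", "das wetter war gestern sehr schön"),
    "de_military": ("de", "der krieg endete nachdem die armee einen friedensvertrag unterschrieb"),
}

# Hypothetical language-aware concept table. German "war" is deliberately
# absent: in German it is a glue word ("was"), not the concept WAR.
CONCEPTS = {
    ("en", "war"): "WAR",
    ("de", "krieg"): "WAR",
    ("en", "army"): "ARMY",
    ("de", "armee"): "ARMY",
    ("en", "peace"): "PEACE",
    ("en", "treaty"): "TREATY",
    ("de", "friedensvertrag"): "TREATY",
}


def exact_match(term):
    """Sense 1: return ids of documents containing the literal term."""
    return sorted(
        doc_id for doc_id, (_, text) in DOCS.items() if term in text.split()
    )


def concept_vector(lang, text):
    """Map each known word to its language-independent concept and count."""
    return Counter(
        CONCEPTS[(lang, word)] for word in text.split() if (lang, word) in CONCEPTS
    )


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def concept_search(lang, query):
    """Sense 3: rank documents in any language by concept similarity."""
    query_vec = concept_vector(lang, query)
    scored = [
        (cosine(query_vec, concept_vector(doc_lang, text)), doc_id)
        for doc_id, (doc_lang, text) in DOCS.items()
    ]
    # Keep only documents that share at least one concept with the query.
    return [doc_id for score, doc_id in sorted(scored, reverse=True) if score > 0]
```

    With this toy data, `exact_match("war")` returns both the English military document and the unrelated German weather sentence, while `concept_search("en", "war")` returns only the two military documents, one English and one German, ordered by similarity score.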

  • In other words, search engines are getting smarter.
