Posted January 8, 2007 11:59 am by with 12 comments


It seems like every other week, someone comes out with a story that catches fire in the SEO community. After all, we SEOs need something to gossip about, don't we? The story of the week has been Google indexing CSS & JS files. From there, Threadwatch & Wolf Howl picked it up.

I think most of these people have it wrong, though. Do we really think Google is worried about hidden text in this day and age? There are so many legitimate ways and reasons to place text on a page that users do not see when it first loads. Just look at Yahoo's homepage and its use of CSS for on-page tabbed browsing. I believe the only on-page spam techniques they really care about are auto-generated content, cloaking, and redirects. Hidden text is simply no longer a factor in Google's SERPs.

So why do they want CSS and JavaScript files? They most likely want JavaScript files to detect JS redirects and other malicious code. With the CSS files, we have to think about what Google's top priorities are these days. If I were Google, at the top of my list would be defending the integrity of link popularity, since that's what their algorithm is built upon. By combining the HTML, CSS, and JavaScript, Google can determine which sections of code are content and which are headers, footers, and sidebars. My theory was that Google might actually be rendering the pages on the fly to make this determination, but John thinks they can do it without rendering as long as they have the files (read the comments). Either way, I find it much more likely that Google is using these files to determine how to weight links rather than trying to find hidden text.
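To make the idea concrete, here is a minimal sketch of how a crawler could combine HTML and CSS to down-weight links in boilerplate sections and ignore links hidden by a stylesheet. Everything here (the weights, the id names, the single-selector CSS handling) is illustrative guesswork on my part, not Google's actual method:

```python
import re
from html.parser import HTMLParser

# Hypothetical page fragments for illustration.
CSS = """
#footer a { color: #999; }
.hidden { display: none; }
"""

HTML = """
<div id="content"><a href="/article">Read the article</a></div>
<div id="footer"><a href="/about">About</a></div>
<div class="hidden"><a href="/spam">Cheap links</a></div>
"""

# Collect simple class/id selectors whose rules hide content entirely.
# (Real CSS parsing is far more involved; this handles only the toy case.)
HIDDEN_SELECTORS = {
    sel.strip().lstrip("#.")
    for sel, body in re.findall(r"([^{}]+)\{([^}]*)\}", CSS)
    if re.search(r"display\s*:\s*none", body)
}

# Section ids a crawler might treat as boilerplate.
BOILERPLATE_IDS = {"header", "footer", "sidebar"}

class LinkWeigher(HTMLParser):
    """Assign each link a rough weight from its surrounding markup."""

    def __init__(self):
        super().__init__()
        self.stack = []   # id/class names of currently open elements
        self.links = []   # (href, weight) pairs in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        self.stack.append(attrs.get("id") or attrs.get("class") or "")
        if tag == "a":
            context = set(self.stack)
            if context & HIDDEN_SELECTORS:
                weight = 0.0   # hidden by CSS: no credit at all
            elif context & BOILERPLATE_IDS:
                weight = 0.2   # footer/sidebar link: heavily discounted
            else:
                weight = 1.0   # main-content link: full credit
            self.links.append((attrs.get("href", ""), weight))

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

weigher = LinkWeigher()
weigher.feed(HTML)
for href, weight in weigher.links:
    print(href, weight)
```

Note that nothing here requires rendering the page: a few passes over the parsed HTML and the stylesheet text are enough, which is consistent with John's point above.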

  • There are other legitimate reasons Google could have for looking at CSS. Like so many other people in the SEO world, you’re blinding yourself by thinking only in terms of links.

    Google has always looked at on-page presentation as the first step in determining relevance. With more people depending upon CSS, it behooves Google to learn as much about what people are doing as possible. Otherwise, their ranking algorithm will become lopsided, favoring unmodified on-page factors too much.

    That would, in effect, open the door to a whole slew of new “pseudo-hidden text” abuses. I, for one, hope they are indeed looking closely at how pages should render through CSS, although that avenue has limits. For example, most CSS-dominated content is badly designed and the pages don’t render well when you look at the unmodified presentation.

    What should Google do? Prefer the pretty-looking CSS side or the unmodified side?

  • Well, I'll offer this up as evidence that they might be looking at hidden text as well.

  • Google could be attempting to isolate links, but I doubt it. Most blogroll (and "Popular Posts" or "Recent Posts") links lack surrounding inline content; they're simply links. Simple links don't do much for link popularity, do they?

    I find it difficult to believe that Google will be able to identify headers, footers and sidebars unless the CSS semantically identifies them, e.g., div id=”footer”.

    However, I could see Google attempting to identify and “read” server-side includes like JavaScript’s document.write or PHP’s modules.

  • Great catch, Michael. The more posts I read from Matt Cutts on these types of incidents, the more I think it takes a manual review to do a full ban.

    I’d love to know how many people they have working in the anti-spam division doing manual reviews.

  • Google have been sending out letters to AdSense (ab)users regarding the positioning of pictures in relation to adverts. The positioning of those elements could be done purely in CSS, so it could only be detected across all sites if they could read the CSS.

    I read recently they were discounting links that were using

    It seems logical, as per your post, that they will also start looking at linkage that you are in some way hiding, for instance linking through to naughty sites using dynamic JavaScript links or other forms of dynamic linking.

  • Targeting AdSense is understandable. I’ll agree that GoogleBot may “read” CSS, but I don’t believe it can assign actual relevance to it. Rather, it parses styles, e.g., position:absolute or display:none, which triggers a manual review.

    I’m more interested in when Google will assign CSS parsing to its algorithm(s) and how that will affect CSS use.
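That style-parsing step is simple enough to sketch in a few lines. The pattern list below is purely illustrative, an assumption about what a crawler might flag, not Google's actual criteria:

```python
import re

# Hypothetical patterns associated with hidden text. Real criteria
# are unknown; these names and thresholds are made up for illustration.
SUSPICIOUS = {
    "display-none": re.compile(r"display\s*:\s*none"),
    "offscreen": re.compile(r"(left|top|text-indent)\s*:\s*-\d{3,}px"),
    "zero-size": re.compile(r"(font-size|height|width)\s*:\s*0(px)?\s*[;}]"),
}

def flag_stylesheet(css):
    """Return the names of suspicious patterns found in a stylesheet.
    A non-empty result could queue the page for manual review."""
    return sorted(name for name, pat in SUSPICIOUS.items() if pat.search(css))

css = "#promo { position: absolute; left: -9999px } .note { display:none; }"
print(flag_stylesheet(css))  # ['display-none', 'offscreen']
```

A pass like this is cheap enough to run over every crawled stylesheet, which is why flagging for human review is more plausible than an automatic penalty.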

  • Sean,

    Why wouldn’t Google be able to assign relevance? It’s really not that hard. A couple of years have gone by and I can’t find the link, but a university released research showing how a search engine can tell what is the main content and what is the header, sidebar, and so on. Trust me, if they can do it, so can Google.

    I have no doubt that this sort of behavior is something they would be pushing towards in the effort to battle paid links.

  • Pingback: Blog Ryan » Google Searching CSS Files

  • Jeremy,

    Do you remember if the research included HTML and CSS?

    I understand how search engines could ascertain main content versus secondary content through a site’s table, tr, and td structures, and they probably do the same with CSS unordered lists. As for search engines “understanding” CSS relevance, I’m not sure I agree. One could insert paid links into the main content and, through CSS, “remove” those links from the HTML flow and place them somewhere else on the page. However, search engines will parse the HTML and read those links as being part of the main content.

  • CSS is just another form of markup. To an algorithm, CSS is really no different than HTML.

  • Pingback: Google tutkii myös CSS- ja Javascript-tiedostoja | Nettibisnes.Info

  • Interesting topic. Keep it up.
