Monday, January 8th, 2007 by Jeremy Luebke


Google and CSS Files, It’s Not What You’re Thinking

It seems like every other week, someone comes out with a story that just catches fire in the SEO community. After all, we SEOs need something to gossip about, don’t we? The story of the week has been Google indexing CSS & JS files. From there, Threadwatch & Wolf Howl picked it up.

I think most of these people have it wrong, though. Do we really think that Google is worried about hidden text in this day and age? There are so many legitimate ways and reasons to place text on a page that users do not see when it first loads. Just look at Yahoo’s homepage and its use of CSS for on-page tabbed browsing. I believe the only on-page spam techniques they really care about are auto-generated content, cloaking, and redirects. Hidden text is just no longer a factor in Google’s SERPs.

So why do they want CSS and JavaScript files? They most likely want JavaScript files to detect JS redirects and other malicious code. With the CSS files, we have to think about what Google’s top priorities are these days. If I were Google, at the top of my list would be defending the integrity of link popularity, since that’s what their algorithm is built upon. By combining the HTML, CSS, and JavaScript, Google can determine which sections of code are content and which are headers, footers, and sidebars. It was my theory that Google might actually be rendering the pages on the fly to make this determination, but John thinks they can do it without rendering as long as they have the files (see the comments). Either way, I find it much more likely that Google is using these files to determine how to weight links rather than trying to find hidden text.
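To make the link-weighting idea a bit more concrete, here is a rough Python sketch of what it could look like. This is purely an illustration and not anything Google has described: the region names, the weights, and the helper names (`LinkRegionParser`, `weigh_links`) are all made up, and it cheats by reading `id` and `class` names instead of working out the layout from the CSS itself.

```python
from html.parser import HTMLParser

# Hypothetical weights for illustration only; nothing here is published by Google.
REGION_WEIGHTS = {"header": 0.2, "footer": 0.1, "sidebar": 0.3, "content": 1.0}

# Elements that never have a closing tag, so they must not go on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class LinkRegionParser(HTMLParser):
    """Collects links along with the page region that encloses them."""

    def __init__(self):
        super().__init__()
        self.stack = []  # (tag, region or None) for each open element
        self.links = []  # (href, region, weight)

    def current_region(self):
        # The innermost labelled ancestor wins; unlabelled markup counts as content.
        for _tag, region in reversed(self.stack):
            if region is not None:
                return region
        return "content"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            region = self.current_region()
            self.links.append((attrs["href"], region, REGION_WEIGHTS[region]))
        if tag in VOID_TAGS:
            return
        # Assumption: the site labels its regions via id or class names.
        labels = ((attrs.get("id") or "") + " " + (attrs.get("class") or "")).lower().split()
        region = next((label for label in labels if label in REGION_WEIGHTS), None)
        self.stack.append((tag, region))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break


def weigh_links(html):
    """Return (href, region, weight) for every link found in the HTML string."""
    parser = LinkRegionParser()
    parser.feed(html)
    parser.close()
    return parser.links
```

Feeding it a page with a `div id="footer"` full of links would give those links a fraction of the weight of a link sitting in the body copy. A real system would presumably need the stylesheet to know where each block actually ends up on screen, which is exactly why the CSS files would be worth fetching.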



12 comments on “Google and CSS Files, It’s Not What You’re Thinking”

  1. Michael Martinez Says:

    January 8th, 2007 at 12:20 pm

    There are other legitimate reasons Google could have for looking at CSS. Like so many other people in the SEO world, you’re blinding yourself by thinking only in terms of links.

    Google has always looked at on-page presentation as the first step in determining relevance. With more people depending upon CSS, it behooves Google to learn as much about what people are doing as possible. Otherwise, their ranking algorithm will become lopsided, favoring unmodified on-page factors too much.

    That would, in effect, open the door to a whole slew of new “pseudo-hidden text” abuses. I, for one, hope they are indeed looking closely at how pages should render through CSS, although that avenue has limits. For example, most CSS-dominated content is badly designed and the pages don’t render well when you look at the unmodified presentation.

    What should Google do? Prefer the pretty-looking CSS side or the unmodified side?

  2. graywolf Says:

    January 8th, 2007 at 2:12 pm

    Well, I’ll offer this up as evidence they might be looking at hidden text as well:

    http://www.thedisneyblog.com/tdb/2007/01/the_disney_blog.html

  3. Sean Fraser Says:

    January 8th, 2007 at 2:35 pm

    Google could be attempting to isolate links, but I doubt it. Most blogroll (and “Popular Posts” or “Recent Posts”) links lack before and after inline content; they’re simply links. Simple links don’t do much for link popularity, do they?

    I find it difficult to believe that Google will be able to identify headers, footers and sidebars unless the CSS semantically identifies them, e.g., div id=”footer”.

    However, I could see Google attempting to identify and “read” includes like JavaScript’s document.write or PHP’s modules.

  4. Jeremy Luebke Says:

    January 8th, 2007 at 2:51 pm

    Great catch, Michael. The more posts I read from Matt Cutts on these types of incidents, the more I think it takes a manual review to do a full ban.

    I’d love to know how many people they have working in the anti-spam division doing manual reviews.

  5. Andy Beard Says:

    January 8th, 2007 at 4:21 pm

    Google have been sending out letters to Adsense (ab)users regarding the positioning of pictures in relation to adverts. Positioning of those elements could be purely by CSS, so it could only be detected on all sites if they could read the CSS.

    I read recently they were discounting links that were using onclick.

    It seems logical, as per your post, that they will also start looking at linkage that you are in some way hiding, for instance linking through to naughty sites using dynamic JavaScript links or other forms of dynamic linking.

  6. Sean Fraser Says:

    January 8th, 2007 at 5:05 pm

    Targeting Adsense is understandable. I’ll agree that GoogleBot may “read” CSS but I don’t believe it can assign actual relevance to it. Rather, it parses styles, e.g., position:absolute or display:none, which triggers manual review.

    I’m more interested in when Google will assign CSS parsing to its algorithm(s) and how that will affect CSS use.

  7. Jeremy Luebke Says:

    January 8th, 2007 at 5:12 pm

    Sean,

    Why wouldn’t Google be able to assign relevance? It’s really not that hard. It was a couple of years ago and I can’t find the link, but a university released research showing how a search engine can tell what is the main content and what is the header, sidebar, and so on. Trust me, if they can do it, so can Google.

    I have no doubt that this sort of behavior is something they would be pushing towards in the effort to battle paid links.

  8. Blog Ryan » Google Searching CSS Files Says:

    January 8th, 2007 at 5:52 pm

    [...] Seems the topic of the day is Google searching CSS files. The theories range from whether Google is searching for hidden text, or whether they are using the files to filter content from navigation. The fact of the matter is, Google is getting more and more sophisticated in determining exactly what a site is about and making things more difficult for the black hats. That doesn’t mean the black hats won’t find a way around the things that Google is doing, it just means only the best black hats will survive. I just hope that Google is taking a cautious approach on how they use the CSS files. Many designers use CSS to position the text of their images either off the page or make it invisible so that the text can appear exactly as they had intended. I’m hoping that if the purpose is to detect hidden links or text that Google doesn’t have the dial turned up too high. There are many legitimate reasons to render text invisible or off-page that are not meant to deceive the user OR the crawlers. [...]

  9. Sean Fraser Says:

    January 8th, 2007 at 8:23 pm

    Jeremy,

    Do you remember if the research included HTML and CSS?

    I understand how search engines could ascertain main content versus secondary content through a site’s table, tr and td structures, and they probably do the same with CSS unordered lists. As for search engines “understanding” CSS relevance, I’m not sure I agree. One could insert paid links into the main content and, through CSS, “remove” those links from the HTML flow and set them some other place on the page. However, search engines will parse the HTML and read those links as being part of the main content.

  10. Jeremy Luebke Says:

    January 8th, 2007 at 10:02 pm

    CSS is just another form of markup. To an algorithm, CSS is really no different than HTML.

  11. Google Also Examines CSS and JavaScript Files | Nettibisnes.Info Says:

    January 18th, 2007 at 12:04 am

    [...] Google and CSS Files, It’s Not What You’re Thinking – Jeremy Luebke suspects that Google’s algorithms don’t care much about hidden text, but mainly about JavaScript redirects and the position of links on the page. Links in the header, footer, and sidebar would get less weight in the ranking of their target pages than links located in the middle of the text. [...]

  12. NaldzGraphics Says:

    August 6th, 2008 at 8:38 am

    Interesting topic. Keep it up.
