Remove your listing from Google Supplemental Index


I was prompted to dig further into this topic when I read John Chow’s post (again :) ) about how he lost his 1st ranking in Google for the keyword ‘make money online’. For those who have never heard of John Chow before (I’m not promoting him, OK :) ), you can type ‘make money online’ into the Google search box and press Enter. There you go: John Chow is on the front page. Such an evil genius.

A few weeks back, he lost his 1st rank for this keyword. I’m not going to give the full sequence of what happened to his blog (you can read it there), but I’m more interested in the finding by SEORefuge, who correctly predicted what happened on John Chow’s blog.

SEORefuge predicted earlier on their blog that he might have made some changes to his blog’s robots.txt file, and that this change resulted in his favorite keyword no longer ranking 1st in Google. SEORefuge’s prediction turned out to be true when they compared the cached version of John Chow’s robots.txt file from when the rank was dropping with what the robots.txt file looks like now. It’s amazing how powerful the robots.txt file is in determining your rank for a certain keyword in search engines!

Robots.txt file
Frankly speaking, I never used a robots.txt file before, except for my XML sitemap (for auto-discovery purposes) and the Google AdSense bot from Google Webmaster Tools. To learn more about the robots.txt file, I suggest browsing over to Wikipedia, since the content of the official robots.txt website is not very up to date.

Google Supplemental Index
Besides the robots.txt file, another factor that might affect your ranking in search engines is the Google Supplemental Index. This is not something new, but rather something that I ignored before. I never really cared about it (poor me). Basically, the Supplemental Index is where the unworthy pages end up. Some SEO experts (and bloggers) point out that the more of your indexed pages fall into the Supplemental Index, the fewer visitors search engines will bring to your website, because the Supplemental Index is not updated as frequently as the main index.

There is an interesting article by Nathan from Not So Boring Life about how to get out of the Google Supplemental Index. In his article, he teaches how to identify how many of your pages are in the Supplemental Index and how to get them out. I summarize here what is in his article.

To identify which pages are in the Supplemental Index

  • Run this query on Google: site:www.yoursite.com *** -view
  • Use Aaron Wall’s SEO toolbar

Get rid of Supplemental Index
To get out of it, you have to exclude the unnecessary content of your website from being indexed by search engines. The unnecessary content or files would be your image, profile, and plugin folders, and so on.
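
As a rough illustration of that idea, here is a minimal Python sketch that checks a robots.txt draft before you upload it. The folder names (/images/, /profile/, /plugins/) and the helper name is_crawlable are only assumptions for the example; replace them with whatever your own site uses. It relies on the standard library's urllib.robotparser, which handles simple folder rules like these.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt draft: the folder names are assumptions,
# substitute the folders that exist on your own site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
Disallow: /profile/
Disallow: /plugins/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.yoursite.com/articles/seo.html",
        "http://www.yoursite.com/images/logo.png",
        "http://www.yoursite.com/plugins/gallery/index.php",
    ):
        print(url, "crawlable" if is_crawlable(ROBOTS_TXT, url) else "blocked")
```

Running a check like this on a handful of your real URLs is a cheap way to catch a rule that blocks more (or less) than you intended.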

If your website is a blog, you can read his article further, since it was written specifically for that. My website is not a blog, but basically I got the idea from his robots.txt file.

How to exclude dynamic pages from getting indexed
One thing that I noticed is that roughly half of my indexed pages (use the query site:www.yoursite.com *** -view) are dynamic pages such as print views, comments, and RSS. I dug around again and found a few articles about how to exclude them. Hmm, even the official robots.txt website and Wikipedia don’t mention it.
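
If you have copied the indexed URLs into a list, the share of dynamic pages can be estimated with a few lines of Python. This is only a sketch: it assumes that "dynamic" simply means "the URL has a query string", and the sample URLs and function names are made up for illustration.

```python
from urllib.parse import urlsplit


def is_dynamic(url: str) -> bool:
    """Treat any URL that carries a non-empty query string as a dynamic page."""
    return bool(urlsplit(url).query)


def dynamic_share(urls) -> float:
    """Fraction of the given URLs that are dynamic (0.0 for an empty list)."""
    urls = list(urls)
    if not urls:
        return 0.0
    return sum(is_dynamic(u) for u in urls) / len(urls)


if __name__ == "__main__":
    # Hypothetical sample of indexed URLs, not taken from a real site.
    indexed = [
        "http://www.yoursite.com/article/1",
        "http://www.yoursite.com/article/1?print=1",
        "http://www.yoursite.com/about",
        "http://www.yoursite.com/feed?format=rss",
    ]
    print(f"{dynamic_share(indexed):.0%} of the indexed pages are dynamic")
```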

This is how I exclude the dynamic pages. I list both ways because I’m not sure which one is correct. According to the two sources below, both work as expected.

User-agent: *
Disallow: /*?
Disallow: /?

UPDATE (16-6-2007): I have confirmed that the correct way is the first one (Disallow: /*?).
http://www.google.com/support/webmasters/bin/answer.py?answer=35303&hl=en

http://www.webmasterworld.com/forum93/534.htm
http://forums.digitalpoint.com/showthread.php?t=106
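
To see why the two Disallow lines behave differently, here is a small Python sketch of the wildcard matching as Google describes it: * stands for any run of characters, and a trailing $ anchors the end of the URL. This is my own simplified matcher, not Google's actual code, and the function names are made up for the example. I wrote it by hand because, as far as I know, Python's built-in urllib.robotparser only does plain prefix matching and does not interpret *.

```python
import re


def rule_to_regex(rule: str):
    """Turn a robots.txt path rule with Google-style wildcards into a regex.

    '*' matches any run of characters and a trailing '$' anchors the end;
    every other character is matched literally from the start of the path.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(rule: str, path: str) -> bool:
    """Check whether `path` (path plus query string) is covered by `rule`."""
    return rule_to_regex(rule).match(path) is not None


if __name__ == "__main__":
    # Hypothetical paths for illustration.
    for path in ("/article.php?id=5", "/?print=1", "/about.html"):
        print(path,
              "| /*? blocks:", is_disallowed("/*?", path),
              "| /? blocks:", is_disallowed("/?", path))
```

Under this reading, /*? covers any URL that contains a question mark, while /? only covers URLs whose query string starts right at the site root, which fits with the first form being the one that excludes dynamic pages everywhere on the site.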

More reading on Google Supplemental Index
There is a lot of discussion about the Supplemental Index. I grabbed a few quotes and posts made by folks from Google. The way they try to explain it to webmasters is that the Google Supplemental Index is not bad. Your thoughts?

Post from Adam Lasnik of Google

Pages are in the supplemental results because we still wanted to be able to show them to users, but the pages didn’t have enough PageRank to make it into our main index (which is more extensive and updated with greater frequency).

Quote from Matt Cutts of Google (as quoted by another forum member)

having supplemental results these days is not such a bad thing. In your case, I think it just reflects a lack of PageRank/links. We’ve got your home page in the main index, but if you look at your site … you’ll see not a ton of links … So I think your site is fine … it’s just a matter of we have to select a smaller number of documents for the web index. If more people were linking to your site, for example, I’d expect more of your pages to be in the main web index.

(Post by Matt Cutts of Google on his blog)

That statement still holds. It’s perfectly normal for a website to have pages in our main web index and our supplemental index. If a page doesn’t have enough PageRank to be included in our main web index, the supplemental results represent an additional chance for users to find that page, as opposed to Google not indexing the page
Getting more *quality* backlinks is generally a good way to get more of your pages in the main index.

Note: after implementing robots.txt to exclude unnecessary folders from getting indexed, please wait 2-3 weeks before you see the result. By the way, I’m not implementing what I wrote above on this blog, but on my other website.


2 Responses to “Remove your listing from Google Supplemental Index”

  1. Dubai news says:

    Getting rid of the duplicate pages with no original content helps beat the duplicate content filters that Google is using. People also need to look at what they are doing with their PR. There are easy things, like making pages such as About Us and Contact nofollow. There is also no reason to push your PR to sites like Technorati, who nofollow all the links back to you. That kind of selective nofollow change addresses the PR questions that are not helped by robots.txt changes.

  2. zaki says:

    That is the hardest part for an article directory, since we receive hundreds of articles every day. We can filter out duplicates in our own system, but not against articles published elsewhere.

    Technorati might be applying the nofollow tag, but don’t forget that Technorati exposes our articles to millions of readers out there. With the correct tags and title, our posts will reach the users. But I personally prefer Digg and StumbleUpon. I do get more traffic from there.
