

Google, Yahoo and Microsoft have agreed on an extension to the Sitemaps protocol called cross-domain sitemaps, which lets you host the sitemaps for multiple domains on a single domain. I love seeing them collaborate on something that benefits webmasters. This is helpful to everyone, especially those who have many websites to manage.

Robots.txt and a sitemap are essential parts of any blog or website nowadays. Together they give your site maximum exposure and make it friendlier to search engine crawlers by telling them which pages they should index and which they shouldn't. I'm not going into detail here, but you can Google them or look them up on Wikipedia to find out more about these 2 files.

OK, back to the main topic: how does the new protocol help webmasters? The reason is quite obvious. All in 1!! You control everything from 1 place and save your precious time and energy.

How does it work?
The sample below from sitemaps.org should give you a better idea.

Let's say you have 3 websites, and each of them has its own sitemap hosted on its own domain:

www.host1.com with Sitemap file sitemap-host1.xml
www.host2.com with Sitemap file sitemap-host2.xml
www.host3.com with Sitemap file sitemap-host3.xml

Now, with the new protocol, you can host all 3 sitemaps on a single host, e.g. sitemaphost.com, for better control. This is how it looks in sitemaphost.com's robots.txt file:

Sitemap: http://www.sitemaphost.com/sitemap-host1.xml
Sitemap: http://www.sitemaphost.com/sitemap-host2.xml
Sitemap: http://www.sitemaphost.com/sitemap-host3.xml
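
If you are wondering what goes inside one of those files, here is a rough Python sketch that builds the content of sitemap-host1.xml. The function name build_sitemap and the two page URLs are made up for illustration; the only thing taken from the protocol is the sitemaps.org namespace. The point to notice is that the file would live on sitemaphost.com while listing URLs for www.host1.com.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(page_urls):
    """Return a minimal sitemap XML string listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in page_urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages on www.host1.com; replace with your own.
host1_pages = [
    "http://www.host1.com/",
    "http://www.host1.com/about.html",
]

# This string is what you would save as sitemap-host1.xml on sitemaphost.com.
sitemap_host1 = build_sitemap(host1_pages)
print(sitemap_host1)
```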

Finally, update host1.com's robots.txt at http://www.host1.com/robots.txt to include the line below:

Sitemap: http://www.sitemaphost.com/sitemap-host1.xml

Why must you update this part? By modifying the robots.txt file on www.host1.com and having it point to the Sitemap on www.sitemaphost.com, you have implicitly proven that you own www.host1.com. In other words, whoever controls the robots.txt file on www.host1.com trusts the Sitemap at http://www.sitemaphost.com/sitemap-host1.xml to contain URLs for www.host1.com. The same process can be repeated for the other two hosts.
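
To make the trust check concrete, here is a small Python sketch of how you could confirm that a host's robots.txt really points at the shared sitemap. It works on the text of robots.txt, so you would fetch http://www.host1.com/robots.txt yourself first. The function names sitemap_urls and host_trusts_sitemap are my own, and this is only my reading of the rule described above, not how any search engine actually implements it.

```python
def sitemap_urls(robots_txt):
    """Return the URLs named on 'Sitemap:' lines of a robots.txt body."""
    urls = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")
        # The field name is matched case-insensitively.
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


def host_trusts_sitemap(robots_txt, sitemap_url):
    """True if this robots.txt lists the given sitemap URL."""
    return sitemap_url in sitemap_urls(robots_txt)


# Example robots.txt content for www.host1.com (the first two lines
# are illustrative, the Sitemap line is the one from this post).
host1_robots = """User-agent: *
Disallow:
Sitemap: http://www.sitemaphost.com/sitemap-host1.xml
"""

print(host_trusts_sitemap(
    host1_robots, "http://www.sitemaphost.com/sitemap-host1.xml"))
```

You can run the same check against the robots.txt of host2.com and host3.com with their own sitemap URLs.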

Don't forget to validate your robots.txt and sitemap.

How do you ping the search engines once you have updated your website's sitemap?
Each search engine has a different method. Check out the information below.

Microsoft - http://webmaster.live.com/ping.aspx?siteMap=[Your sitemap URL]

Google - Follow this guide on how to resubmit your sitemap to Google:
www.google.com/webmasters/tools/ping?sitemap=[Your sitemap URL]

Yahoo - http://search.yahooapis.com/SiteExplorerService/V1/ping?sitemap=[Your sitemap URL]. This service applies only if you have submitted your website to Yahoo Site Explorer.
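
If you have several sitemaps to ping, a short Python sketch like the one below can build the ping URLs for you. The endpoints are simply the ones listed in this post and may change or stop working over time. I am assuming that all three take the sitemap URL as the parameter value, that the Google one is reached over http://, and that the sitemap URL should be percent-encoded before it is appended. The names PING_ENDPOINTS and ping_urls are my own. The sketch only builds the URLs; it does not send any request.

```python
from urllib.parse import quote

# Ping endpoints as listed in this post (they may no longer be live).
PING_ENDPOINTS = {
    "microsoft": "http://webmaster.live.com/ping.aspx?siteMap=",
    "google": "http://www.google.com/webmasters/tools/ping?sitemap=",
    "yahoo": "http://search.yahooapis.com/SiteExplorerService/V1/ping?sitemap=",
}


def ping_urls(sitemap_url):
    """Return one ping URL per search engine for the given sitemap."""
    # Percent-encode the whole sitemap URL so it is safe as a query value.
    encoded = quote(sitemap_url, safe="")
    return {name: base + encoded for name, base in PING_ENDPOINTS.items()}


for engine, url in ping_urls(
        "http://www.sitemaphost.com/sitemap-host1.xml").items():
    print(engine, url)
```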

Update 5-May-2008: corrected the missing 'Sitemap' in the robots.txt file on sitemaphost.


16 Comments »

Comment by KNizam
2008-03-05 15:00:00

This sounds like jargon to me. Hehe, I don't understand it.

 
Comment by azwanhadzree Subscribed to comments via email
2008-03-05 15:26:13

At the moment, I prefer to maintain everything separately. Hmm, talking about sitemaps and robots.txt, I haven't got a robots.txt on any of my sites yet.

 

 
Comment by seo lad
2008-03-06 00:46:44

They should just agree on a common platform. That would make things easier.

 
Comment by MK
2008-03-06 00:56:56

Uhu… SEO… those three letters bother me, having to re-check your every move to make sure it follows 'the right' SEO.
Anyway, since this is more on the technical side, I guess I'll have to do it.
thanks for the info.

 
Comment by nUUr
2008-03-06 05:41:22

Huhu.. I don't understand it either.. erm, why is it that sometimes I can't comment here?

Comment by zaki
2008-03-06 22:08:11

Let me know if you’re unable to do so. I check my spam folder everyday

 
 

 
Comment by Josef
2008-03-06 20:57:48

You're right, I've tested several sites and the robots.txt and XML sitemap can make a big difference in how deep and how often bots crawl.
It’s so easy to put the robots.txt files in the folder where they go that I don’t see much advantage in putting them all on one domain.

Comment by zaki
2008-03-06 22:10:18

The advantage only shows if you have many websites, where you would otherwise have to FTP or log in to cPanel manually, one by one.

 
 
Comment by alzack
2008-03-07 06:06:27

good info

 
Comment by Raymond Chua Subscribed to comments via email
2008-03-07 09:07:37

Too deep for a newbie like me. :)

 
Comment by kreauter Subscribed to comments via email
2008-05-04 14:55:07

great information thank you! so now i am able to add to live google and yahoo. but for ask.com you need the reference in robots.txt. when i had in my file the following: http://www.rokdd.de with Sitemap file alle-seiten.xml a server error occurs.. any hints? thank you :)

Comment by zaki
2008-05-04 15:32:13

What is the error message?

Btw, the declaration of your XML sitemap in your robots.txt is incorrect. It should be this way:

sitemap: http://rokdd.de/alle-seiten.xml <- correct
rokdd.de with Sitemap file http://rokdd.de/alle-seiten.xml <- wrong

And you seem to be disallowing your whole site from being indexed as well:

User-agent: *
Disallow: /

 
 
Comment by kreauter Subscribed to comments via email
2008-05-04 16:07:51

okay wow fast response :)

The first syntax I already know, but I want to use the robots.txt for more than one domain. I guessed that the following syntax should be correct:

rokdd.de with Sitemap file alle-seiten.xml

However, Google said that it did not understand that syntax :(

thanks for your help

 
Comment by Technology Transfer
2008-05-10 15:39:21

Took me a while to successfully create a cross-domain sitemap and I had to ask for help for it, but it’s very interesting to see this working so fine. Anybody else managed to do this by himself?:)

 