Google Webmaster Tools Sitemap Optimization – Part 5

Before proceeding with the fifth article in our Google Webmaster Tools series, I want to share some results from the actions taken since the first article (Keyword significance). This blog’s keyword significance now looks as shown in the picture below, which is a very good match for this blog’s content. Things are working!

keyword-significance

Now let us move on to today’s topic, which is all about optimizing sitemaps.

Diagnose Sitemap Issues

As usual, log in to your Webmaster Tools account and check the Sitemap status on the dashboard itself. If it shows a red X instead of a green tick, you have reason to worry.

The most common reasons for the red X status to appear on a sitemap are:

  • The sitemap has become unreadable by the bots. In that case, make sure the sitemap file has the right permissions (chmod) so that the web server can read it
  • Your sitemap is not valid XML, or the sitemap building process was aborted partway. Check your sitemap generation tool or plugin and rebuild the sitemap
  • You have scraped or copied content and the Google bot refuses to index it. In that case, stay away from copy-paste content and autoblogging
  • The sitemap download takes too long and the bot abandons it halfway. Use a gzip-compressed (.gz) version of the sitemap, or optimize it as described in the Sitemap Optimization section below
  • You have opted for the non-www version of the preferred domain in the Webmaster Tools settings but the sitemap links are still built with www. Make sure that your canonical URL, WordPress blog URL and Webmaster Tools preferred domain settings are consistent.

If you take care of the above issues, the Sitemap status should now show green.
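
Two of the checks above (valid XML and consistent www usage) are easy to automate. The following is only a minimal sketch, assuming you already have the sitemap contents as a string; the function name `check_sitemap` and the example.com URLs are made up for illustration and are not part of any plugin.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace used by the standard sitemap protocol
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def check_sitemap(xml_text, preferred_host):
    """Return a list of problems found in the sitemap text (hypothetical helper)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # A truncated or aborted sitemap build usually ends up here
        return [f"not well-formed XML: {err}"]

    problems = []
    for loc in root.iter(SITEMAP_NS + "loc"):
        url = (loc.text or "").strip()
        # Compare each URL's host with the preferred domain (www or non-www)
        if urlparse(url).netloc != preferred_host:
            problems.append(f"host mismatch: {url}")
    return problems


# Illustrative sitemap: the second URL uses www although the
# preferred domain in this example is the non-www one.
sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://example.com/</loc></url>"
    "<url><loc>http://www.example.com/a-post-slug/</loc></url>"
    "</urlset>"
)
print(check_sitemap(sample, "example.com"))
print(check_sitemap(sample[:60], "example.com"))  # simulates a truncated file
```

An empty list means neither of these two problems was found. It does not cover file permissions or download time, which you still have to check on the server.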

To move to the next part of the diagnosis, you have to click the Site Configuration -> Sitemaps link to get a screen like the one given below.

webmaster-tools-sitemap

Here you have to check whether there is a significant difference between the URLs submitted and the URLs in the web index. In my case, it is a perfect scenario: I have 392 URLs submitted (362 posts + 8 pages + 1 homepage + 2 DS tools URLs + 19 categories), out of which 373 are indexed because I have the ‘noindex’ ROBOTS meta tag specified on categories. You have to work out the same math by getting the numbers from your WordPress admin panel and cross-check that they match.
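
The same math can be written down as a small script. This is just a sketch of the arithmetic using my numbers from above; the function name `expected_counts` is made up, and you would replace the counts with the ones from your own WordPress admin panel.

```python
def expected_counts(submitted, noindex_groups):
    """Return (total submitted URLs, URLs you expect to see indexed)."""
    total = sum(submitted.values())
    # Groups carrying a noindex ROBOTS meta tag should not show up in the index
    indexed = sum(count for group, count in submitted.items()
                  if group not in noindex_groups)
    return total, indexed


# Counts taken from this blog, as listed in the paragraph above
submitted = {
    "posts": 362,
    "pages": 8,
    "homepage": 1,
    "DS tools URLs": 2,
    "categories": 19,
}

total, indexed = expected_counts(submitted, noindex_groups={"categories"})
print(total, indexed)  # 392 373
```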

If there are differences, check whether noindex is specified on your posts, pages or additional links. If your new URLs (new posts, pages) don’t appear in the web index a few minutes or an hour after they are published, check your WordPress ping list and the settings therein. You can check whether a URL of your site is in the Google web index by typing the following into Google Search:

site:http://www.YourDomain.com/

or

site:http://www.YourDomain.com/a-post-slug/

Note: Leave out www if your preferred domain does not use the www prefix.
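
Coming back to the noindex check mentioned above: if you have many posts, looking at each page source by hand gets tedious. Here is a minimal sketch, assuming you already have the HTML of a page as a string (how you fetch it is up to you). The class and function names are my own for illustration. It only looks at the ROBOTS meta tag, not at HTTP headers or robots.txt.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Records whether a <meta name="robots"> tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser already lowercases tag and attribute names
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            directives = [d.strip().lower()
                          for d in attr.get("content", "").split(",")]
            if "noindex" in directives:
                self.noindex = True


def has_noindex(html_text):
    """Return True if the page carries a noindex ROBOTS meta tag."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return parser.noindex


page = '<html><head><meta name="robots" content="noindex,follow"></head></html>'
print(has_noindex(page))
```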

If the situation doesn’t improve with the ping list changes, you may want to increase your crawl rate by changing Crawl rate values in the Webmaster Tools Site Configuration -> Settings -> Set custom crawl rate.

Please note that the above settings change can increase the load on your hosting server.

Sitemap Optimization

1. Google XML Sitemaps is an amazing plugin for generating your sitemap. However, as your website grows in number of pages or articles, you may want to keep your sitemap as small as possible. If you check Google’s own sitemap, you will notice that it does not bother with a visual style sheet or some of the other parameters that the Google XML Sitemaps plugin adds, such as the change frequency (changefreq) and the last modification time (lastmod).

You may want to remove this excess information via editing the plugin files if you have the expertise to do so.

This exercise can cut your sitemap file size roughly in half. A smaller sitemap file means faster crawling and indexing.

2. In addition, on the sitemap plugin settings page, make sure that you generate a .gz version of the sitemap (sitemap.xml.gz) and reference it in robots.txt.

3. The third step is to add your other pages (those not generated by WordPress) to the sitemap. Keep the priority of all your pages at 0.5 and the homepage at 1.0 as defaults.
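
For step 1, if you are not comfortable editing the plugin files, one possible alternative is to post-process the generated sitemap. The sketch below is not how the plugin works internally. It is a hypothetical helper (`slim_sitemap`) that drops the style sheet reference, changefreq and lastmod from an existing sitemap while keeping loc and priority. The sample URL and file names are made up.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
DROP_TAGS = ("changefreq", "lastmod")  # the extra fields discussed in step 1


def slim_sitemap(xml_text):
    """Return the sitemap without style sheet, changefreq and lastmod."""
    # Keep the sitemap namespace as the default one in the output
    ET.register_namespace("", SITEMAP_NS)
    # ElementTree discards processing instructions such as
    # <?xml-stylesheet ...?> while parsing, so the style sheet goes away here
    root = ET.fromstring(xml_text)
    for url in root.findall(f"{{{SITEMAP_NS}}}url"):
        for tag in DROP_TAGS:
            for child in url.findall(f"{{{SITEMAP_NS}}}{tag}"):
                url.remove(child)
    return ET.tostring(root, encoding="unicode")


sample = (
    '<?xml-stylesheet type="text/xsl" href="sitemap.xsl"?>'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://www.example.com/</loc><lastmod>2010-01-01</lastmod>"
    "<changefreq>daily</changefreq><priority>1.0</priority></url>"
    "</urlset>"
)
slim = slim_sitemap(sample)
print(len(sample), "characters before,", len(slim), "characters after")
print(slim)
```

Keep in mind that the plugin will overwrite the file on the next rebuild, so a script like this would have to run after every sitemap generation.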

Once you have done the above steps, your sitemap is optimized for its physical size and content.
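
To make steps 2 and 3 concrete, here is a rough sketch that builds a bare-bones sitemap (only loc and priority, homepage at 1.0 and everything else at 0.5), compresses it with gzip and prints the line you would add to robots.txt. The function names and the extra URLs are made up for illustration. With the plugin you would normally do all of this from its settings page instead.

```python
import gzip
from xml.sax.saxutils import escape


def build_sitemap(home_url, other_urls):
    """Build a minimal sitemap: homepage at 1.0, all other URLs at 0.5."""
    entries = [(home_url, "1.0")] + [(url, "0.5") for url in other_urls]
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url, priority in entries:
        # escape() takes care of characters such as & in URLs
        lines.append(
            f"<url><loc>{escape(url)}</loc><priority>{priority}</priority></url>"
        )
    lines.append("</urlset>")
    return "\n".join(lines)


def gzip_sitemap(xml_text):
    """Return the gzip-compressed bytes you would save as sitemap.xml.gz."""
    return gzip.compress(xml_text.encode("utf-8"))


# Hypothetical non-WordPress pages added by hand
extra_pages = [
    "http://www.YourDomain.com/tools/first-tool/",
    "http://www.YourDomain.com/tools/second-tool/",
]
xml_text = build_sitemap("http://www.YourDomain.com/", extra_pages)
compressed = gzip_sitemap(xml_text)
print(len(xml_text.encode("utf-8")), "bytes raw,", len(compressed), "bytes gzipped")

# Line to add to robots.txt so that bots can find the compressed sitemap
print("Sitemap: http://www.YourDomain.com/sitemap.xml.gz")
```

On a sitemap with only three URLs the saving from gzip is small. It becomes noticeable once the sitemap lists hundreds of posts.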

Let me know if you have any queries on this particular post and I will be glad to help you.

Part VI: Webmaster Tools Google Guidelines

Happy Site Optimization!

Comments

  1. anbu :

    Sir, I have a blog but we have not generated a sitemap. If we generate one, where should we upload the sitemap.xml file?

  2. I used to use the google xml sitemap plugin, but as it’s no longer updated and getting pretty out of date, I found a script on the web somewhere and basically produce a really simple sitemap every time I produce a new post.

    Hadn’t realised that the change frequency and last change info was irrelevant to google – it makes sense to remove that then.

    Good info, thanks.

  3. For every time generating sitemap. we can update our google webmaster as new sitemap. please suggest some ideas.

  4. Anuj :

    Thanks for the great tips! I have been building websites for fun for quite some time but just recently started trying to optimize search results for my latest masterpiece. I’ve learned several good ideas, thank you.

  5. Ajit, Any priority values you suggest for WordPress users .. using the xml-sitemap generator plugin.

    • 1.0 for home page and every sub domain that you have. 0.50 for posts (min and max) and static pages. Avoid auto prioritizing based on comments etc. This is what I am doing (and Google’s own sitemap has only 1.0 and 0.50 values)

  6. sahil :

    Hey Ajit,
    It has been over a week but Google has still not accepted my sitemap. It is showing me a clock (the one that appears when we add a fresh sitemap), with no red or green sign. What should I do? I have tried resubmitting the sitemap as well.

  7. I’ve submitted my site many times via webmaster tools for a few days now but it keeps showing Submitted: 8 and Indexed: 6. I’ve checked and there’s no “noindex”. How do I get the other 2 pages indexed by google.

    Thanks

  8. Thank you. You are sharp. I didn’t notice it even though I checked ample times. I’ve done the correction and posted the sitemap again. Hopefully it will get indexed soon. Thanks
