How can I get my site on Google search?
I have answered that question at least 10 times on Yahoo Answers and several times over email. Instead of responding to yet another question individually, I thought of creating this reference post (and, time permitting, a whole series) here on this blog.
How do people discover your blog or website?
Before looking into the technical details of what it takes to increase the search visibility of your blog, you should understand how people discover your website content or blog posts on the web. Unless somebody already knows about your website, there are only three major ways for a total stranger to find you on the web:
#1 Advertising: Online/offline advertising, Exchange advertisement programs, PPC (Pay-Per-Click) advertising, SEM (Search Engine Marketing), Article marketing etc
#2 Referral links: Blogrolls, Directory listing, Forums, RSS/Feed directories, Social bookmarks, Social network links etc
#3 Search Engines: Via Google, Yahoo, MSN, Ask etc.
Of the above three, the most popular method people use to find information on the web is search engines. It is probably also the most sustainable way to drive traffic to your website or blog over the long term.
Now the process of optimizing your site content to make it search engine friendly and easier to spot by the search engine bots is what is known as Search Engine Optimization (SEO).
How do search engines find information?
The World Wide Web (WWW) is, in principle, a huge repository of millions of interconnected computers and the information stored on them. So, in order to get your blog or website content probed and indexed by the search engines, they need to know about your site’s existence first. This can be done via search engine submission for newer sites, or by letting the search engines find your site automatically via reference links to it from other sites. In either case, the search engine spider (a piece of software that examines your site for its content) ultimately needs to crawl your site and index its content for the use of search users. There are two entities on your site that help these spiders probe it effectively. They are:
#1 Robots.txt: Once your website is submitted to the search sites, the search engine spiders (also known as crawlers or bots) will attempt to navigate your site to understand its content. A robots.txt file maintained on your website tells the spiders which folders and modules on your web server they are allowed to crawl. (The format and composition of robots.txt is not in the scope of this post.)
#2 Sitemap.xml: Once robots.txt allows spiders to crawl certain areas of your server folders, the sitemap pitches in to provide more details. The sitemap.xml file lists all your pages, posts, links to tools and web applications etc. that need to be made visible to public users (and hence indexed by the search engines). Sitemap.xml also contains more details on these pages, such as when each was last changed and its priority, which help spiders decide when to index them.
Basically, robots.txt provides folder/file level access details and sitemap.xml provides detailed information on pages. These two files are essential for any website that is search engine friendly, and they are usually maintained in the root folder of your website.
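To make the roles of the two files a little more concrete, here is a minimal sketch in Python using only the standard library. It assumes the file is named robots.txt (the name crawlers conventionally look for), and the site www.example.com, the rules, and the page list are all hypothetical, made up purely for illustration. It is not how any particular search engine works internally, just one way to see what the two files express.

```python
import urllib.robotparser
import xml.etree.ElementTree as ET

# Hypothetical robots.txt content for an example site (an assumption).
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Allow: /
"""


def build_robots_parser(robots_text):
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def build_sitemap(pages):
    """Build a sitemap.xml document (as a string) from a list of page dicts."""
    namespace = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=namespace)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        ET.SubElement(url, "lastmod").text = page["lastmod"]
        ET.SubElement(url, "priority").text = page["priority"]
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages; the dates and priorities are illustrative only.
PAGES = [
    {"loc": "https://www.example.com/", "lastmod": "2010-01-15", "priority": "1.0"},
    {"loc": "https://www.example.com/blog/hello", "lastmod": "2010-01-10", "priority": "0.8"},
]

robots = build_robots_parser(ROBOTS_TXT)
sitemap_xml = build_sitemap(PAGES)

if __name__ == "__main__":
    # Folder-level access: may a crawler fetch these URLs?
    print(robots.can_fetch("*", "https://www.example.com/blog/hello"))
    print(robots.can_fetch("*", "https://www.example.com/admin/settings"))
    # Page-level details handed to the crawler.
    print(sitemap_xml)
```

The first part answers the "may I crawl this folder?" question, while the second part produces the per-page details (location, last change, priority) described above.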
Page & link level instructions to the spiders
In addition to the above two files, there are page and link level instructions that help spiders decide whether to index a page, or the target page that a link (URL) on the page points to. These are page meta tags and link properties, and I shall talk about them in another post soon.
- Sitemap.xml is usually submitted to search engines via tools such as Google Webmaster Tools. Robots.txt can also be analyzed, and a sample robots.txt created, there
- Submitting your website or blog to search engines can be done either manually per search engine or via bulk submission services (paid as well as free) offered by certain sites
- Submitting your website does not guarantee that your site will appear in search results immediately. Other factors determine how much weight the search engines give to your site's content
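As a small preview of what such instructions look like, here is a rough sketch in Python that scans a page for two widely used ones: the robots meta tag (for example content="noindex") and the rel="nofollow" link attribute. The sample HTML and the class name RobotsDirectiveParser are hypothetical, and real crawlers certainly do much more than this.

```python
from html.parser import HTMLParser


class RobotsDirectiveParser(HTMLParser):
    """Collects page-level robots directives and links marked nofollow."""

    def __init__(self):
        super().__init__()
        self.page_directives = set()
        self.nofollow_links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            # Page-level instruction, e.g. content="noindex, follow"
            content = attributes.get("content") or ""
            for part in content.split(","):
                if part.strip():
                    self.page_directives.add(part.strip().lower())
        elif tag == "a":
            # Link-level instruction, e.g. rel="nofollow"
            rel_values = (attributes.get("rel") or "").lower().split()
            if "nofollow" in rel_values and attributes.get("href"):
                self.nofollow_links.append(attributes["href"])


# Hypothetical page used only for illustration.
SAMPLE_HTML = """
<html>
  <head><meta name="robots" content="noindex, follow"></head>
  <body>
    <a href="https://www.example.com/about">About</a>
    <a href="https://www.example.com/login" rel="nofollow">Login</a>
  </body>
</html>
"""

directive_parser = RobotsDirectiveParser()
directive_parser.feed(SAMPLE_HTML)

if __name__ == "__main__":
    print(sorted(directive_parser.page_directives))
    print(directive_parser.nofollow_links)
```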