What are Sitemaps?
As defined by sitemaps.org, the organization that standardized the Sitemaps protocol, a sitemap is one of several ways to inform search engines about the pages on your site. Sitemaps can come in various formats, such as HTML, PHP or, most commonly, XML. XML sitemaps are a widely accepted standard and are used on small websites as well as very large ones with thousands of links listed. An XML sitemap lists the URLs of a site along with other metadata: the date each page was last modified, how frequently the page is likely to change, and its importance relative to the other pages of the site. Search engines such as Google, Yahoo, Bing and Ask can read this information to learn which pages exist and which have changed.
Using sitemaps does not guarantee that your pages get listed in all these search engines; it is just a way to remind them about your pages and whether they have been added or updated.
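To make the metadata described above concrete, here is a minimal sketch that reads those fields back from a small XML sitemap using Python's standard library. The sample URL and values are made up for illustration, and the helper name read_entries is hypothetical, not part of any sitemap tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample sitemap; the URL and values are invented for illustration.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2009-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>"""

# Namespace declared by the Sitemaps protocol, mapped to a local prefix.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def read_entries(xml_text):
    """Return one dict per <url> entry; optional tags come back as None."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for url in root.findall("sm:url", NS):
        entries.append({
            "loc": url.findtext("sm:loc", namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
            "changefreq": url.findtext("sm:changefreq", namespaces=NS),
            "priority": url.findtext("sm:priority", namespaces=NS),
        })
    return entries


for entry in read_entries(SAMPLE):
    print(entry)
```

Only loc is required for each entry; the other three fields are optional hints, so a reader of a sitemap should be prepared for them to be missing.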
Why is the Sitemaps standard needed?
The Sitemaps standard, or protocol, provides a basic communication channel between a search engine's spider/crawler and a website's sitemap. Before the protocol, HTML sitemaps were used, and they were not up to the mark as far as performance was concerned. In theory, search engines need less processing power when they have an overview of the site available: they can find all the URLs of a website with ease, and the risk of the crawler leaving the site before it has been crawled completely is reduced.
Sometimes spiders left important pages behind and indexed useless ones. This was the biggest reason behind the Sitemaps protocol, which was first developed and adopted by Google and then adopted by other search engines too.
XML Sitemaps Format:
The Sitemaps protocol consists of XML tags that form the basic structure of the sitemap file. All values (data) in the sitemap must be entity-escaped, and the sitemap file must be encoded in UTF-8. There are a few other must-haves to keep in mind while writing your sitemap or creating a tool or plugin that generates sitemaps automatically for a CMS like Joomla or Drupal, or a blogging platform like WordPress. They are listed below, as described on the Sitemaps protocol site:
- A sitemap must begin with an opening <urlset> tag and end with a closing </urlset> tag.
- You need to specify the namespace (protocol standard) within the <urlset> tag.
- You must include a <url> entry for each URL, as a parent XML tag, and include a <loc> child entry for each <url> parent tag.
- All other tags are optional, and support for them varies between search engines such as Yahoo and Google.
- All the URLs listed in a particular sitemap must be from a single host, such as www.example.com.
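The requirements above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference implementation: the function name build_sitemap and the shape of the pages list are hypothetical, and the example relies on the standard library's ElementTree (Python 3.8 or later) to do the entity escaping and UTF-8 encoding.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace that the protocol requires on the <urlset> tag.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a sitemap as UTF-8 bytes.

    `pages` is a hypothetical input format: a list of dicts with a required
    'loc' key and optional 'lastmod', 'changefreq' and 'priority' keys.
    """
    # All URLs in one sitemap must come from a single host.
    hosts = {urlparse(page["loc"]).netloc for page in pages}
    if len(hosts) > 1:
        raise ValueError("URLs from more than one host: %s" % sorted(hosts))

    # Opening <urlset> tag carrying the namespace.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        # One <url> parent per page, with a required <loc> child.
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        # Optional tags are written only when a value was supplied.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])

    # ElementTree escapes &, < and > in text while serializing.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


pages = [
    {"loc": "http://www.example.com/"},
    {"loc": "http://www.example.com/view?id=1&page=2", "changefreq": "weekly"},
]
print(build_sitemap(pages).decode("utf-8"))
```

Note that the ampersand in the second URL is written out as &amp; in the generated file, which is what the entity-escaping rule asks for.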
For more information about the Sitemaps protocol, sample sitemaps and the XML tags, visit sitemaps.org.
Sitemaps standard and Search Engines:
By early 2007 all the major search engines had agreed to follow the XML sitemap standard adopted by Google, and the website sitemaps.org had been created to give more information about the standard and the protocol used for XML sitemaps. Yahoo now accepts sitemaps in XML format alongside its .txt format in Yahoo Site Explorer. Bing has also started accepting sitemaps in XML format after some initial testing. Although a sitemap can help search engines find hidden content, it should not be created and used for this sole purpose, because it may send the wrong message to search engines and your site may even get banned from search results.
Before the Sitemaps protocol was invented, it was difficult for both webmasters and search engine spiders/crawlers to keep track of the various pages and links on a website. The protocol has made it very simple for site owners to create and update their sitemaps, and various plugins are available, such as Joomap for Joomla and Google XML Sitemaps for self-hosted WordPress blogs. This is a win-win situation for both webmasters and search engines, as fewer resources are needed to index a website and all of its links with the help of this protocol.