Today, let’s look more closely at sitemaps. Every webmaster should have a sitemap ready for their site and submit it to Google to help get all of its pages listed. Sitemaps come in two types: the HTML sitemap that visitors use to navigate a site, and the XML sitemap that helps crawlers crawl the pages more effectively.
Why are they necessary?
Sitemaps are not necessary. (Yep, I said that.) Even if you don’t have a sitemap, the crawlers will crawl your pages and find the content. But it is like letting them crawl in a dark room. What if you had a well-lit room with signs and helpers to take them to each room? It would be more effective, right? Sitemaps serve this purpose.
A sitemap lays out the site structure and tells the crawlers which folders and files are important and which are not, which should be visited frequently, and which need to be visited only once. This helps the crawlers understand your site more effectively.
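To make this concrete, here is a minimal sketch in Python of the kind of file a crawler-oriented sitemap contains. It assumes the sitemaps.org 0.9 format, where each page gets a location, a change frequency hint and a priority between 0.0 and 1.0. The build_sitemap helper and the example URLs are made up for illustration; a real generator would also handle things like last-modified dates.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML for a list of (url, changefreq, priority) tuples.

    build_sitemap is a hypothetical helper written for this example.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, changefreq, priority in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # Hint for how often the page changes: always, hourly, daily,
        # weekly, monthly, yearly or never.
        ET.SubElement(url, "changefreq").text = changefreq
        # Relative importance within your own site, from 0.0 to 1.0.
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


# Example pages (assumed): a busy home page and a rarely changing archive page.
example_entries = [
    ("http://yourdomain.com/", "daily", 1.0),
    ("http://yourdomain.com/archive/2008.html", "never", 0.3),
]
sitemap_xml = build_sitemap(example_entries)
print(sitemap_xml)
```

The home page is marked as important and frequently changing, while the archive page tells the crawler it can be visited once and left alone.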
Now, how do you build a sitemap for Blogger?
It’s very simple in Blogger. You only need to go to Google Webmaster Central and add your site feed, and the sitemap is created automatically. You can get detailed instructions on this here. Make sure that you submit your full feed and not a partial one.
Which is the best sitemap generator program around?
There are lots of free online and downloadable sitemap generators.
Here’s a simplified listing of what is best.
1 - Python Script - This is the most difficult one to install. But if you are familiar with Python, then this is the best one around. It’s automated and requires no additional support. I don’t recommend it for a beginner, as it requires technical knowledge.
2 - Online sitemaps - This is best for small websites. It’s easy, simple and online. Just go to this site and submit your URL. Fill in some basic details like time and priority settings for the files and click go! The whole sitemap will be generated online. You will get both an ROR file and the Google sitemap XML file. If you are interested only in Google, use the XML sitemap. The format follows the Google sitemap protocol and is faultless.
Best choice for beginners and small websites of less than 500 pages.
3 - Gsite Crawler - This is a downloadable application. If your website is a bit large, you have time to tweak some settings, and you are serious about sitemaps, then I would recommend this one.
You give it the website URL and select the types of files to be scanned; priority settings are detected automatically, and you can create both a Google sitemap and a Yahoo URL list.
It also generates reports that give you an idea of how many URLs were crawled, which links are broken, and so on. This is very useful when handling large sites.
How to make a sitemap for large sites?
If you have a really large website, for instance one with a million pages, then creating a sitemap is going to be tough. In practice this is possible with the Python script, but if you are not comfortable with the technical stuff then you have to depend on sitemap generator programs. (If you don’t have a really large website, the following information may not help you.)
Step 1 - Download a free sitemap generator program like Gsite crawler.
Step 2 - Use it to crawl each folder of your website as a separate project. Make sure that you create a new database each time a new project is opened.
Step 3 - Now you have separate sitemaps for each folder.
Ex:- yourdomain.com/folder1 has a sitemap called folder1.xml and yourdomain.com/folder2 has a sitemap called folder2.xml
Step 4 - Download this simple index generator program.
Step 5 - Copy all the folders (containing the sitemaps) from the projects folder of Gsite Crawler (C:\Program Files…) and paste them into one single folder.
Step 6 - Run the index generator program against this parent folder.
Step 7 - A sitemap index is now created with links to all the child sitemaps. There is one problem, though: in Gsite Crawler’s projects folder (C:\Program Files) each crawled folder is named with an underscore replacing the forward slash.
Ex:- yourdomain.com/folder will be named as yourdomain.com_folder
Therefore the links in the sitemap index will be written this way too.
Step 8 - Use Notepad/WordPad to open the sitemap index file. Find and replace all the underscores with forward slashes.
Step 9 - Upload the child sitemaps in the respective folders online.
Ex:- yourdomain.com/folder1, yourdomain.com/folder2, etc.
Step 10 - Upload the sitemap index file to the root folder and submit it to Google.
Bingo! There you go: you have now created a sitemap index and child sitemaps for a large website. Now submit it through the Webmaster Central window and keep waiting!
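If you would rather script Steps 6 to 8, the idea can be sketched in Python. This is not the index generator program mentioned in Step 4, only a hypothetical stand-in. It assumes the project folders are named like yourdomain.com_folder1, that each child sitemap is named after its folder (as in the example under Step 3), and that the site is served over http. It also shares the weakness of a blind find and replace: a folder whose real name contains an underscore would be converted wrongly, so check the output before uploading.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def folder_to_url(folder_name):
    """Turn a project folder name such as 'yourdomain.com_folder1'
    into the URL of its child sitemap.

    Assumption: every underscore stands for a forward slash, and the
    sitemap file is named after the last folder (folder1.xml).
    """
    path = folder_name.replace("_", "/")
    last_segment = path.rsplit("/", 1)[-1]
    return f"http://{path}/{last_segment}.xml"


def build_sitemap_index(folder_names):
    """Return sitemap index XML linking to one child sitemap per folder."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for name in folder_names:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = folder_to_url(name)
    return ET.tostring(index, encoding="unicode")


# Example project folder names (assumed), as they would appear on disk.
index_xml = build_sitemap_index(
    ["yourdomain.com_folder1", "yourdomain.com_folder2"]
)
print(index_xml)
```

The printed index points at yourdomain.com/folder1/folder1.xml and yourdomain.com/folder2/folder2.xml, which matches where Step 9 uploads the child sitemaps.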
Related SEO Tips and Articles:
[...] - one for human eyes and the other for crawlers and search engines. We’ve already discussed how to create a sitemap for search engines.Now, let’s create a sitemap page in HTML for user’s visual [...]
[...] How to build a sitemap for large websites and blogs - Building a sitemap for a small blog or site is easy. But if you have a large website, it turns messy. Learn how you can still get a great sitemap ready without mess. [...]
[...] do with your permalink structure. All you have to do is create a sitemap and submit it to google. You can find the instructions here.Within a few days (according to your crawl frequency) all of your pages will be indexed on [...]
What if it’s a dynamically updating site and we want a sitemap? That is, a site which is updated, say, 20 times a day, but is not a blog, just a custom-designed website. Is there an automated script available? Or what steps should we keep in mind while making such an automated sitemap script?
Mani Karthik Replied:
Mohan, for a dynamically updating site, be it PHP, ASP or anything else, using mod_rewrite to make SEO-friendly pages is itself half the task done. Then, depending upon which pages carry important information, we can automate the sitemap generation process. For this Google suggests a Python script that can be installed on your server. It carries the risk of slowing down your server because of the automation, but if you want to, give it a try.
https://www.google.com/webmasters/tools/docs/en/sitemap-generator.html#download
[...] by paying me a visit often and commenting on every post I make? He was suggesting about themes, sitemaps, permalink structures, traffic….TRAFFIC! And do I have traffic in here? Or in here? It seems that [...]
This and other topics here have helped me a lot. Thanks. There’s still a lot to improve with my blog but I’m slowly learning. And there’s still a lot to learn.
thanks
Thanks for the tips on sitemaps.
[...] SitemapNow, assuming that all the above steps are taken and optimized without overdoing it, you are ready to submit your site to Google. For this create a sitemap. [...]
Thanks for that. What if you have a giant website where pages are auto-generated based on users’ queries, and you have millions of these pages? What will be the size of the sitemap file? And no matter what sitemap you create, it still won’t be complete, because the pages are dynamic.
@Peter, dynamic pages with “non-indexable” URLs are clearly out of the running for indexing. In such cases, we’ll have to use mod_rewrite to convert those dynamically generated pages to an indexable format, and then run the sitemap generator program.
Once the number of pages gets huge, you’ll have to process them in batches and link the program’s DB to an SQL database, or the program won’t be able to hold it all.
No matter how many pages you’ve got, it’s possible to get them all indexed in batches. At Alamy.com we’ve got millions of pages indexed with the same logic.
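As a rough illustration of working in batches: the sitemap protocol caps a single sitemap file at 50,000 URLs, so a very large URL list has to be split across several files that a sitemap index then points to. The Python sketch below assumes the URLs are already available as an iterable (for example, read from a database). The function and file names are hypothetical, and this is not the setup used at Alamy.com.

```python
from itertools import islice

# Limit set by the sitemaps.org protocol for a single sitemap file.
MAX_URLS_PER_SITEMAP = 50_000


def batched(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield successive lists of at most `size` URLs from any iterable."""
    iterator = iter(urls)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def plan_sitemap_files(urls, base="sitemap", size=MAX_URLS_PER_SITEMAP):
    """Map hypothetical file names (sitemap-1.xml, ...) to their URL batches."""
    return {
        f"{base}-{number}.xml": batch
        for number, batch in enumerate(batched(urls, size), start=1)
    }


# Example: 120,000 made-up URLs end up in three child sitemaps.
example_urls = (f"http://yourdomain.com/page{i}" for i in range(120_000))
plan = plan_sitemap_files(example_urls)
for file_name, batch in plan.items():
    print(file_name, len(batch))
```

Because the URLs are consumed lazily, only one batch is pulled from the source at a time, although this sketch then keeps every batch in the returned dictionary. For millions of pages you would write each batch to its file as it is produced instead.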
Very, very useful tips and information. Thanks a lot.
Mani, the link on number 2 is not working.
Mani Karthik Replied:
Thanks Ryan. Fixed now.
thanks too mani, built my site map already. nice guide.
Thank you for this!
Have to try now. This post seems the better one to build a sitemap.
[...] have been suggesting time and again that providing a sitemap with the proper information is probably the best method to ensure that all the pages in your site are indexed on Google. There are many ways to create a sitemap too. Here is a [...]
Thanks for the thorough and detailed information.
Thanks for the tips. I’ve been facing difficulties in creating and installing a sitemap for my Blogger account. It’s unlike WordPress, where it’s easy and simple and there are plugins for this purpose.
[...] a sitemap for your site using either of these methods 1, 2 and [...]
I can’t begin to tell you how much putting a sitemap on our site helped with SEO - it made a massive leap in keyword search rankings.