SEO 101: Is Your Website Getting Crawled Properly?


We all want our site to be the highest ranking website for our keywords, and we try everything possible to improve our site’s online performance.

Most of the techniques we use are centered on creating exciting and valuable content for the online user, which is largely as it should be.

But even the finest online content can fail to improve a website’s online performance if the search engine robots can’t effectively crawl and index your website.


So how can you ensure that your website is easily crawled and indexed by Google, Bing, and other search engines?

How do you make sure that your stellar content is not being missed or ignored by the search bots?

By creating a robot-friendly XML sitemap, that’s how.

What is an XML Sitemap?


An XML sitemap is simply a data file that contains a list of all of the pages within a website. Search engine robots use this sitemap to identify pages so that they can be more easily indexed for search results.
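To make that concrete, here is a minimal sketch of what such a file contains, built with Python's standard library. The page URLs and the example.com domain are placeholders I made up for illustration; a real sitemap would list your own pages, and the tools covered later will generate it for you.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per page."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages on a placeholder domain.
pages = [
    "https://www.example.com/",
    "https://www.example.com/about/",
]
print(build_sitemap(pages))
```

Each `<loc>` element holds one page address, and that list is what the search bot reads.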

While the use of a sitemap does not guarantee better page rankings, it does make it easier for the search bots to find and crawl the pages on your website.

If the search engine robots can’t crawl your site, or if it takes too long for them to crawl and index your site, your overall online performance will suffer.

So let’s look at some basic tips for creating a robot-friendly XML sitemap for your website.

How the Heck Do I Create an XML Sitemap?

There are basically two ways you can create your sitemap…

  1. Use Software – In this category, there are two options that are at the top of my list. The first is XML Sitemaps and the other is Screaming Frog. Either one will create a proper XML sitemap for up to 500 pages with the free version. If you just need the sitemap, go with XML Sitemaps, but Screaming Frog will go far beyond and show you site errors.
  2. Use a Plugin – If you’re using WordPress, I like the Google XML Sitemaps plugin best. Most WP sites already have Yoast running for SEO, and it can handle your XML sitemap needs, but I turn that feature off and run this plugin instead because it allows you to prioritize your pages for the search bots.

Split Your Sitemap into Categories

Each XML sitemap file is restricted to 50MB (uncompressed) and 50,000 URLs; older versions of the protocol capped the size at 10MB. While the search bot can read files of this size, it cannot do so quickly.

The longer it takes a search engine robot to crawl your site, the longer it will take to index its content and return a page ranking for a given search.

If your website is fairly large, you can create several smaller sitemaps, which makes the site easier for the search bots to crawl.

You might create a general sitemap for all of your top pages, a secondary sitemap for your internal pages, and another for your post pages. The Google XML Sitemaps plugin will do this for you.

This not only makes the site easier for the search bots to crawl, but also makes each sitemap quicker and easier to update when any changes are made to your website.
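When a site has several sitemaps, the sitemap protocol lets you list them in a single sitemap index file so the bots can find them all. The sketch below shows roughly what that index looks like. The file names (sitemap-pages.xml, sitemap-posts.xml) and the domain are assumptions for illustration only, and a plugin may name its files differently.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index XML string pointing at each child sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(index, encoding="unicode")


# Hypothetical child sitemaps, one per category of content.
child_sitemaps = [
    "https://www.example.com/sitemap-pages.xml",
    "https://www.example.com/sitemap-posts.xml",
]
print(build_sitemap_index(child_sitemaps))
```

With this layout, a change to your posts only requires regenerating the posts sitemap, and the index stays the same.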

The Robots.txt File

The robots.txt file goes hand-in-hand with the crawling process of the search engines, and it essentially tells the search engine bots what pages to crawl and what pages to ignore.

The information you include in your robots.txt file helps the search bot crawl your website efficiently. You can also use the robots.txt file to instruct the search bot not to crawl specific areas of your website. Keep in mind that this blocks crawling, not indexing: a disallowed page can still appear in search results if other pages link to it, so use a noindex meta tag on any page you want kept out of the index entirely. Take a look at the official site for more details.

Be sure to include links to all of your sitemaps in your robots.txt file to ensure that the most important areas of your website will be accurately crawled and indexed.

A typical WordPress robots.txt file might look like this:

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
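If you want to check how a crawler would read a file like this once sitemap links are added, Python's built-in `urllib.robotparser` module can parse it for you. This is only a rough sketch: the two `Sitemap:` lines and the example.com URLs are placeholders I added for illustration, not part of the file above, and `site_maps()` needs Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser

# The rules shown above, plus two hypothetical Sitemap lines.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/

Sitemap: https://www.example.com/sitemap-pages.xml
Sitemap: https://www.example.com/sitemap-posts.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Ask whether any bot ("*") may crawl an admin URL and a normal post URL.
admin_allowed = parser.can_fetch("*", "https://www.example.com/wp-admin/options.php")
post_allowed = parser.can_fetch("*", "https://www.example.com/blog/my-post/")
print("admin page crawlable:", admin_allowed)
print("blog post crawlable:", post_allowed)

# List the sitemap URLs declared in the file.
print("sitemaps:", parser.site_maps())
```

Running a quick check like this before you upload the file is a cheap way to catch a rule that blocks more than you intended.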

If you’re not on WordPress, you can create the file manually and place it at

Removing Broken Links

There are many sitemap generating tools that can scan your website and automatically produce an XML sitemap. These tools are easy to use, and can save a lot of time. The added benefit is that a tool like Screaming Frog will show you broken pages and links that result in a 404 “page not found” error.

When these error pages end up in your XML sitemap they can throw off the search bot, and significantly slow down its progress.

Error pages are also something that Google and other search engines dislike, so it is important to identify these broken links and remove them as soon as possible.

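Once you have generated a sitemap, take note of the broken pages and set up a redirect for each broken link: a 301 if the page has moved permanently, or a 302 if the move is only temporary. Then remove the dead URLs from the sitemap so that it lists only working pages.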
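If you would rather script a quick check than rely on a tool, the idea can be sketched in a few lines of Python. This is a minimal illustration and not how Screaming Frog works: the helper names `fetch_status` and `find_broken` are my own, the URLs are placeholders, and the demo uses a canned lookup table instead of real network requests.

```python
import urllib.error
import urllib.request


def fetch_status(url):
    """Return the HTTP status code for a URL, using a HEAD request."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises HTTPError for 4xx and 5xx responses.
        return error.code


def find_broken(urls, get_status=fetch_status):
    """Return the URLs whose status code is 400 or higher."""
    return [url for url in urls if get_status(url) >= 400]


# Demo with canned status codes for hypothetical pages, so nothing is fetched.
canned = {
    "https://www.example.com/": 200,
    "https://www.example.com/old-page/": 404,
}
print(find_broken(list(canned), get_status=canned.get))
```

To check a live site, call `find_broken(urls)` with the URLs from your sitemap and leave out `get_status`. Note that `urlopen` follows redirects on its own and that connection failures are not handled in this sketch.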

I Have My XML Sitemap, Now What?

Once your sitemap is complete, store it in your website root folder, just like the robots.txt file.

You will also want to submit it to Google Search Console (formerly Google Webmaster Tools) using the sitemaps section of your dashboard. You can re-submit your sitemap at any time to coincide with any changes to your website.

Creating an XML sitemap for your website makes it easier for the search engine bots to crawl and index your content. While this won’t guarantee a boost in your site’s page rankings, it is the first step to getting noticed by Google and other search engines.

Remember, if the search bots find it difficult to crawl your site, even the best content in the world will not help your page rankings.

