One key way to help your business and website rank well in Google’s search results is to make sure your pages are optimized for crawling. Google uses Googlebot to crawl pages and categorize their content. However, with millions of new pages hitting the web every day, even Google is hard-pressed to index them all. So how can you make sure that Google properly crawls and indexes your pages?
On Monday, January 16, 2017, Google published a blog post clarifying a term that SEO professionals use a lot: crawl budget. When we refer to a “crawl budget,” we are generally talking about the number of pages Googlebot will crawl on your site in a given day. If your site only has a handful of pages, this probably isn’t a pressing concern, though you should still keep an eye on it to avoid crawl errors. If you’re an online retailer with thousands of product pages, on the other hand, making sure those pages stay up to date in search results matters a great deal. That’s why it’s vital that your website is optimized for crawling. Here are three quick tips to help you do that.
Ensuring Your Site is Optimized for Crawling
1. Remove Low Value URLs
A low-value-add URL is a page that wastes crawl budget without improving how your site is indexed. These include soft 404 errors, infinite spaces, and duplicate pages caused by faceted navigation. These sorts of URLs waste the crawler’s time and use up visits that could have gone to pages you actually want in search results, which in turn can prevent your site from being indexed properly.
A soft 404 error occurs when a page returns a 200 response code for a non-existent page instead of a proper 404 page-not-found error. Such a page has no real content and adds no value to your SEO ranking, but it still consumes crawl budget.
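One simple way to spot this is to request a page that definitely should not exist on your site and see what status code comes back. Below is a minimal sketch using the third-party requests library; the domain and the made-up path are placeholders for your own site.

```python
import requests  # third-party: pip install requests

def probe_for_soft_404(base_url):
    """Request a page that should not exist and report the status code.

    A healthy server answers 404; a soft 404 answers 200 with a thin
    "not found" page that still eats into crawl budget.
    """
    bogus_url = base_url.rstrip("/") + "/this-page-should-not-exist-12345"
    response = requests.get(bogus_url, timeout=10)
    if response.status_code == 200:
        print(f"Possible soft 404: {bogus_url} returned 200")
    else:
        print(f"{bogus_url} returned {response.status_code} (404 is what you want)")

probe_for_soft_404("https://www.example.com")  # replace with your own domain
```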
Faceted navigation happens when an eCommerce site lets shoppers filter or sort products by attributes like size or color, and each combination of filters generates its own URL. It’s easy to end up with dozens of URLs that all show essentially the same product listing. These duplicate pages add no value of their own, yet each one still consumes crawl budget.
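To see how facet parameters multiply URLs, and how they collapse back to a single page once those parameters are ignored, here is a rough sketch. The parameter names (color, size, sort) are assumptions; substitute whatever your platform actually emits, and in practice you would signal the preferred version with a rel="canonical" tag or robots rules rather than a script.

```python
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

# Facet parameters that only filter or re-sort an existing listing.
# These names are examples; use whatever your platform actually emits.
FACET_PARAMS = {"color", "size", "sort"}

def canonical_form(url):
    """Strip facet parameters so duplicate listings collapse to one URL."""
    parts = urlparse(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in FACET_PARAMS]
    return urlunparse(parts._replace(query=urlencode(kept)))

urls = [
    "https://www.example.com/shoes?color=red&size=9",
    "https://www.example.com/shoes?size=9&color=blue",
    "https://www.example.com/shoes",
]
# All three variations point at the same underlying listing page.
print({canonical_form(u) for u in urls})  # {'https://www.example.com/shoes'}
```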
Infinite space pages are pages that have no original content of their own but instead just link to more pages, which link to still more pages. A classic example is a calendar whose “next month” link can be followed forever. Googlebot will follow each of these links, consuming bandwidth and crawl budget while adding little value.
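If you have a list of URLs from your server logs or a crawl export, a quick heuristic pass can flag likely infinite spaces before Googlebot gets lost in them. This is only a sketch; the thresholds and patterns below (path depth, repeated segments, far-future calendar years) are assumptions to adapt to your own site.

```python
import re
from urllib.parse import urlparse

def looks_like_crawler_trap(url, max_depth=8):
    """Flag URLs with the usual signs of an infinite space: very deep
    paths, repeated path segments, or far-future calendar pages."""
    path = urlparse(url).path
    segments = [s for s in path.split("/") if s]
    too_deep = len(segments) > max_depth
    repeated = len(segments) != len(set(segments))
    far_future_calendar = bool(re.search(r"/20[4-9]\d(/|$)", path))
    return too_deep or repeated or far_future_calendar

print(looks_like_crawler_trap("https://www.example.com/calendar/2049/05/"))  # True
print(looks_like_crawler_trap("https://www.example.com/products/widgets/"))  # False
```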
2. Adjust Your Site’s robots.txt File
This text file is one of the most powerful tools in your arsenal for controlling how your site is crawled. When crawlers access your site, this file tells them which parts of your site they should not visit, keeping pages that don’t add value out of the crawl. You can also set up different rules for different crawlers, such as mobile versus desktop user agents, making your crawl budget more efficient.
Googlebot and other crawlers don’t act on calls to action such as “Log In” or “Add to Cart.” So by using this file to block login screens, cart pages, and checkout pages, you free up crawl budget for the pages you actually want visited and indexed. Keep in mind that, despite what you may have heard, a “crawl-delay” directive in your robots.txt file is not processed by Googlebot.
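Here is a minimal sketch of what such rules might look like, checked with Python’s built-in urllib.robotparser. The paths (/cart/, /checkout/, /login/) and the domain are assumptions; adjust them to your own site structure, and you can add separate User-agent groups if you want different rules for specific crawlers.

```python
from urllib.robotparser import RobotFileParser  # Python standard library

# Hypothetical rules: block pages crawlers can't use, keep products open.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /checkout/
Disallow: /login/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

for path in ("/products/blue-widget", "/checkout/step-1", "/login/"):
    allowed = parser.can_fetch("Googlebot", "https://www.example.com" + path)
    print(f"{path}: {'crawlable' if allowed else 'blocked'}")
```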
3. Clean Up Your Existing URLs
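Over time, sites accumulate messy URLs: tracking parameters, session IDs, and old pages that bounce through several redirects before reaching their current home. Every extra hop and redundant variation spends crawl budget that could have gone to a page you care about, so it pays to consolidate each old URL down to a single 301 redirect pointing at its final destination. As a rough illustration, the sketch below (using the third-party requests library and a hypothetical legacy URL) follows a URL hop by hop so you can spot long redirect chains worth flattening.

```python
import requests  # third-party: pip install requests
from urllib.parse import urljoin

def redirect_chain(url, max_hops=10):
    """Follow a URL one hop at a time and return every redirect in the chain.

    Long chains (and loops) waste crawl budget; ideally each old URL should
    reach its final destination through a single 301.
    """
    chain = [url]
    for _ in range(max_hops):
        response = requests.get(chain[-1], allow_redirects=False, timeout=10)
        if response.status_code in (301, 302, 307, 308):
            chain.append(urljoin(chain[-1], response.headers["Location"]))
        else:
            break
    return chain

# Hypothetical legacy URL that has been moved more than once:
print(redirect_chain("https://www.example.com/old-product-page"))
```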
Handling how your website is optimized for crawling is an important part of both web design and SEO management. With the right structure, your site will be crawled and indexed quickly and efficiently, even if it spans thousands of unique URLs. If you haven’t reviewed your site’s design in a while, these three simple tips alone could improve your search rankings.
Here at Darxe, we specialize in helping businesses make their web presence more efficient. Let us review your site and suggest improvements that benefit both customer accessibility and reach. With our code and design team behind you, your business can perform better than ever.