How do you deal with Crawl Budget?

What is Crawl Budget?

Crawl Budget is the total number of pages on your site that search engine crawlers can crawl and index in a given amount of time. The higher your crawl budget, the more of your content can rank in the search engine results pages (SERPs). Content-rich websites with lots of pages can take advantage of this, while sparsely populated websites have less to gain.

In the world of search engine optimisation (SEO), if the search engines cannot crawl your content, it won't be indexed and therefore won't appear in the Google SERPs. So if Google doesn't spend enough time crawling your site, it is less likely to rank well for certain keywords.

Each crawl of your site uses part of the crawl budget for that day. The number of pages crawled per day is proportional to your site's importance to Google, and crawls are done in priority order, so if your website is not very popular, a search engine may not crawl it very often.

Should I worry about my crawl budget?

If Googlebot crawls a lot of your pages on a daily basis, you don't have to worry too much about your crawl budget. But if many of your pages are not being crawled or updated as often as you'd like, you need to do something to make sure your site can support additional crawling and to increase your site's crawl demand.

To quickly check whether you have a crawl budget issue or not do the following:

  1. Find out how many pages you have on your website – probably the easiest way to do that is to check your sitemap.
  2. In Google Search Console, open the Crawl Stats report (under 'Settings') to determine how many pages are crawled on average every day.
  3. Divide the number of your pages (point 1) by the average number of pages crawled per day (point 2). If the result is higher than 10, yes, you have a crawl budget problem. If it is lower than 3, you don't have to think about your crawl budget.
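The three steps above can be sketched as a small function (the thresholds of 10 and 3 come straight from the text; the page counts in the usage line are made-up examples):

```python
def crawl_budget_check(total_pages: int, avg_crawled_per_day: float) -> str:
    """Rough crawl-budget health check: pages on site / pages crawled per day."""
    ratio = total_pages / avg_crawled_per_day
    if ratio > 10:
        return "crawl budget problem"
    if ratio < 3:
        return "no crawl budget concern"
    return "borderline - keep monitoring"

# e.g. a site with 5,000 pages in its sitemap and ~250 pages crawled per day
print(crawl_budget_check(5000, 250))  # ratio of 20 -> "crawl budget problem"
```

The zone between 3 and 10 is left as "borderline" here, since the talk only gives explicit guidance for the two extremes.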

How to optimise your site for a better crawl budget

In practical terms, crawl budget is the amount of time Googlebot spends crawling your website, and it affects how quickly your site can rank for certain keywords. The more of your pages that make it into the search engine's index, the more keywords you have a chance of ranking for.

The first thing to know about crawl budget is that it's not something you can control directly. You can't pay for a higher crawl budget, and there's no way to ask Google for a specific number of crawls per day. That said, there are certain things you can optimise in order to encourage Googlebot to crawl more of your website's pages.

  • Make sure all your important pages are crawlable and not blocked in your robots.txt file. Conversely, use robots.txt to block pages that shouldn't be crawled, so they don't waste your crawl budget. The same goes for non-essential resources like GIFs, videos, and images that take up a lot of bandwidth but are often used for decoration or entertainment and may not be important for understanding the content of the page. To disallow an individual resource by name in your robots.txt file, simply use: User-agent: * Disallow: /images/filename.jpg
  • Whenever possible, build your pages using plain HTML. Data-heavy JavaScript websites use more of your crawl budget than plain HTML sites. If your site relies on JavaScript, prerender your pages into static HTML where possible, so Google can easily see and read them while using far less of your budget.
  • Always keep your Sitemap up to date: sitemaps help the search engines easily understand the structure of your website and which pages are meant to be indexed. 
  • It goes without saying that reducing your redirect chains will greatly improve your crawl budget. 301 and 302 redirects are sometimes unavoidable, but try not to chain more than two in a row, as some bots will stop following redirects altogether if they encounter an unreasonable number of 301s or 302s in sequence.
  • Increase your page speed, as slow pages eat up valuable Googlebot time. The quicker your pages load, the more of them Googlebot can visit and index.
  • Link to all your important pages internally as the Googlebot prioritises pages that have lots of internal links pointing to them.
  • Last but not least, avoid duplicate content: Google doesn't want to waste time and resources indexing multiple pages with the same content.
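Pulling the robots.txt advice from the list above together, a minimal file might look like this (the disallowed paths and the sitemap URL are hypothetical examples, not from the talk):

```
User-agent: *
# Keep crawlers out of sections that waste crawl budget
Disallow: /admin/
Disallow: /internal-search/
# Block an individual non-essential resource by name
Disallow: /images/filename.jpg

# Point crawlers at an up-to-date sitemap so new pages are found quickly
Sitemap: https://www.example.com/sitemap.xml
```

Important pages should simply not appear in any Disallow rule, leaving them crawlable by default.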

Who is Joanna Beech?

Joanna Beech has almost ten years of experience working specifically in the arena of Digital Marketing with a strong focus on SEO. She has the insight of working both in-house and agency side, responsible for creating and launching SEO strategies across a broad range of industries and company sizes. Joanna has gained extensive hands-on experience in SEO whilst also incorporating all PPC channels, user experiences, online tracking, and online behaviour analysis.

In this talk, Joanna Beech (Digital Marketing Manager – SEO, Deloitte) addresses the crawl budget issue for medium and larger websites that publish rapidly changing content, where new content is not being crawled or indexed. In this session, Joanna gives a clear explanation of what crawl budget is, along with techniques to ensure every crawl is optimised and makes the most of each visit.

Dixon Jones with Natalie Arney and Joanna Beech at BrightonSEO

About BrightonSEO 

BrightonSEO is a major search marketing event in the UK and one of our favourite events of the year. It is a superb conference for search marketing professionals, whether novice or expert: a chance to learn from some of the best minds in search, and then rub shoulders with them at one of the friendliest, and largest, gatherings of digital marketers in Europe.
