How to Optimize Your Website’s Crawl Budget
Nick Eubanks
10 Mins Read
We’ve run over 100 technical audits this year.
Through this we’ve gained deep insights into how technical structure impacts a website’s performance in search.
This article highlights the most common technical SEO issues we encounter, and the ones that have the largest impact on organic traffic when corrected.
404 errors happen quite a bit on eCommerce sites. When a product is removed or expires, it's easily forgotten and the page "404s".
Although 404 errors can erode your crawl budget, they won’t necessarily kill your SEO. Google understands that sometimes you HAVE to delete pages on your site.
However, 404 pages can be a problem when they:
The best practice is to set up a 301 redirect from the deleted page to another relevant page on your site. This preserves link equity and makes sure users can still navigate seamlessly.
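Once a redirect is in place, it's worth confirming that the old URL really returns a single 301 and lands on the page you intended. Here's a minimal sketch of that check, assuming the third-party requests library; the URLs are placeholders:

```python
# Sketch: verify that a removed URL 301-redirects to the intended target.
# Assumes the third-party "requests" library; URLs below are placeholders.
import requests

def check_redirect(old_url, expected_target):
    response = requests.get(old_url, allow_redirects=True, timeout=10)
    # response.history holds every hop Google (and users) must follow.
    for hop in response.history:
        print(f"{hop.status_code} -> {hop.url}")
    print(f"Final: {response.status_code} -> {response.url}")

    if not response.history:
        print("No redirect in place - the old URL still resolves (or 404s).")
    elif response.history[0].status_code != 301:
        print("First hop is not a 301 - link equity may not be preserved.")
    elif response.url.rstrip("/") != expected_target.rstrip("/"):
        print("Redirect does not land on the expected page.")
    elif len(response.history) > 1:
        print("Redirect chain detected - point the old URL straight at the target.")
    else:
        print("OK: single 301 to the expected page.")

check_redirect("https://example.com/old-product", "https://example.com/new-category")
```

If the old URL bounces through several hops, flatten the chain into a single 301 pointing straight at the final destination.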
When launching a new website, a redesign, or new pages, there are a number of technical aspects that should be addressed ahead of time.
Google has confirmed that website speed is a ranking factor, and it expects pages to load in two seconds or less. More importantly, website visitors won't wait around for a page to load.
In other words, slow websites don’t make money.
Optimizing for website speed will require the help of a developer, as the most common issues slowing down websites are:
Google's index is officially mobile-first, which means the algorithm looks at the mobile version of your site first when ranking it for queries.
That said, don't neglect the desktop experience (UX), and don't strip the mobile experience down significantly compared to the desktop version.
Google has said it wants both experiences to be the same. It has also stated that sites using responsive design or dynamically served pages should not be affected by the change.
An XML sitemap lists out the URLs on your site that you want search engines to crawl and index. You're allowed to include information about when a page was last modified, how often it changes, and how important it is relative to the rest of your site.
While Google admittedly ignores a lot of this information, it’s still important to optimize properly, particularly on large websites with complicated architectures.
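For illustration, the sketch below uses Python's standard library to build a one-URL sitemap with those optional fields; the URL, date and values are placeholders:

```python
# Sketch: build a minimal XML sitemap with the optional <lastmod>,
# <changefreq> and <priority> fields. The URL and values are placeholders.
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
urlset = ET.Element("urlset", xmlns=NS)

url = ET.SubElement(urlset, "url")
ET.SubElement(url, "loc").text = "https://example.com/blue-widgets/"
ET.SubElement(url, "lastmod").text = "2020-06-01"    # when the page last changed
ET.SubElement(url, "changefreq").text = "monthly"    # how often it tends to change
ET.SubElement(url, "priority").text = "0.8"          # relative importance on your site

ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)
print(open("sitemap.xml").read())
```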
Sitemaps are particularly beneficial on websites where:
As your website grows, it's easy to lose track of URL structures and hierarchies. Poor structures make it difficult for both users and bots to navigate, which will negatively impact your rankings.
A robots.txt file controls how search engines access your website. It's a commonly misunderstood file that can crush your website's indexation if misused.
Most problems with robots.txt arise from not updating it when you move from your development environment to the live site, or from miscoding the syntax.
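A quick safeguard is to test the live robots.txt against the URLs you actually want crawled, before and after a deployment. A minimal sketch using Python's standard library (the domain and paths are placeholders):

```python
# Sketch: confirm that robots.txt still allows your important URLs after a deploy.
# Domain and paths are placeholders.
from urllib.robotparser import RobotFileParser

parser = RobotFileParser("https://example.com/robots.txt")
parser.read()

important_paths = ["/", "/products/", "/blog/", "/category/blue-widgets/"]
for path in important_paths:
    allowed = parser.can_fetch("Googlebot", "https://example.com" + path)
    flag = "OK" if allowed else "BLOCKED - check for a stray Disallow rule"
    print(f"{path}: {flag}")
```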
It’s no longer a good idea to crank out pages for “SEO” purposes. Google wants to rank pages that are deep, informative and provide value.
Having too much "thin" content (i.e. fewer than 500 words, no media, no clear purpose) can negatively impact your SEO. Some of the reasons:
In addition to "thin" pages, you want to make sure your content is relevant. Irrelevant pages that don't help the user can also detract from the good content you have on the site.
This is particularly important if you have a small, less authoritative website. Google crawls smaller websites less often than more authoritative ones. You want to make sure you're only serving Google your best content, to build that trust, authority and crawl budget.
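To find candidates for consolidation or removal, a rough word-count pass is often enough. The sketch below strips tags with Python's standard library and flags pages under the 500-word mark mentioned above; the URLs are placeholders:

```python
# Sketch: flag potentially "thin" pages by visible word count.
# Standard library only; URLs are placeholders, and the 500-word threshold
# is the rule of thumb from this article, not a Google number.
import re
import urllib.request
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_skipped = False
        self.words = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.in_skipped = True

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.in_skipped = False

    def handle_data(self, data):
        if not self.in_skipped:
            self.words += len(re.findall(r"\w+", data))

for url in ["https://example.com/", "https://example.com/blog/some-post/"]:
    html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "ignore")
    extractor = TextExtractor()
    extractor.feed(html)
    status = "thin?" if extractor.words < 500 else "ok"
    print(f"{url}: ~{extractor.words} words ({status})")
```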
A canonical tag (aka "rel=canonical") is a piece of HTML that helps search engines sort out duplicate pages. If you have two pages that are the same (or very similar), you can use this tag to tell search engines which page you want to show in search results.
If your website runs on a CMS like WordPress or Shopify, you can easily set canonical tags using a plugin (we like Yoast).
We often find websites that misuse canonical tags in a number of ways:
This is significant, as you’re telling search engines to focus on the wrong pages on your website. This can cause massive indexation and ranking issues. The good news is, it’s an easy fix.
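One way to catch these mistakes is to pull the canonical from each key page and compare it to the URL you expect it to declare. A minimal standard-library sketch (the URL map is a placeholder, and the regex assumes rel appears before href in the tag):

```python
# Sketch: check that each page's rel=canonical points where you expect.
# Standard library only; URLs are placeholders.
import re
import urllib.request

# page URL -> the canonical you intend it to declare
expected = {
    "https://example.com/blue-widgets/": "https://example.com/blue-widgets/",
    "https://example.com/blue-widgets/?sort=price": "https://example.com/blue-widgets/",
}

# Assumes rel="canonical" is written before href= in the <link> tag.
canonical_pattern = re.compile(
    r'<link[^>]+rel=["\']canonical["\'][^>]+href=["\']([^"\']+)["\']', re.I)

for page, intended in expected.items():
    html = urllib.request.urlopen(page, timeout=10).read().decode("utf-8", "ignore")
    found = canonical_pattern.findall(html)
    if not found:
        print(f"{page}: no canonical tag")
    elif len(found) > 1:
        print(f"{page}: multiple canonical tags {found}")
    elif found[0].rstrip("/") != intended.rstrip("/"):
        print(f"{page}: canonical points to {found[0]}, expected {intended}")
    else:
        print(f"{page}: OK")
```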
In addition to your robots.txt file, there are robots meta tags that can be used in your page's header code. We see a lot of potential issues with these, both at the file level and on individual pages. In some cases, we have seen multiple robots tags on the same page.
Google will struggle with this, and it can prevent a good, optimized page from ranking.
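For illustration, here's a small sketch that surfaces conflicting robots meta tags in a page's head; the HTML is a made-up example of the problem:

```python
# Sketch: detect conflicting robots directives in a page's <head>.
# The HTML below is a made-up example of the "multiple robots tags" problem.
import re

html = """
<head>
  <meta name="robots" content="index, follow">
  <meta name="robots" content="noindex">
</head>
"""

directives = re.findall(
    r'<meta[^>]+name=["\']robots["\'][^>]+content=["\']([^"\']+)["\']', html, re.I)

print("robots tags found:", directives)
if len(directives) > 1:
    # Google has said it generally honors the most restrictive directive,
    # so a stray noindex would win here.
    print("Conflict: the most restrictive directive (noindex) is likely to win.")
```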
It's a challenge for Google to crawl all the content on the internet. To save resources, Googlebot allocates a crawl budget to each site based on a number of factors.
A more authoritative site gets a bigger crawl budget (more content crawled and indexed) than a lower-authority site, which gets fewer pages crawled and fewer visits. Google itself defines crawl budget as "Prioritizing what to crawl, when and how much resource the server hosting the site can allocate to crawling."
Check out our detailed guide on how to improve crawl budget
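As a rough, practical starting point, you can see where Googlebot actually spends its crawl budget by tallying its requests in your server access logs. The sketch below assumes a standard combined log format and a placeholder log path:

```python
# Sketch: tally which URLs Googlebot hits most in an access log.
# Assumes a standard combined log format; the file path is a placeholder.
# Note: user-agent strings can be spoofed - verify Googlebot via reverse DNS
# before drawing conclusions in a real audit.
import re
from collections import Counter

log_path = "/var/log/nginx/access.log"   # placeholder - adjust to your server
line_pattern = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[^"]*" \d{3}')

hits = Counter()
with open(log_path, encoding="utf-8", errors="ignore") as log:
    for line in log:
        if "Googlebot" not in line:
            continue
        match = line_pattern.search(line)
        if match:
            hits[match.group(1)] += 1

print("Top URLs by Googlebot requests:")
for url, count in hits.most_common(20):
    print(f"{count:6d}  {url}")
```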
Internal links help to distribute "equity" across a website. Lots of sites, especially those with thin or irrelevant content, tend to have relatively little cross-linking within their content.
Cross-linking articles and posts helps both Google and your visitors move around your website. The added value from a technical SEO perspective is that you pass equity across the site, which helps improve keyword rankings.
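To get a feel for how equity flows, you can count the internal versus external links on a page. A minimal sketch with Python's standard library; the site and page URLs are placeholders:

```python
# Sketch: count internal vs. external links on a page.
# Standard library only; the site and page URLs are placeholders.
import re
import urllib.request
from urllib.parse import urljoin, urlparse

site = "https://example.com"
page = "https://example.com/blog/some-post/"

html = urllib.request.urlopen(page, timeout=10).read().decode("utf-8", "ignore")
hrefs = re.findall(r'<a[^>]+href=["\']([^"\']+)["\']', html, re.I)

internal, external = set(), set()
for href in hrefs:
    absolute = urljoin(page, href)          # resolve relative links
    if urlparse(absolute).netloc == urlparse(site).netloc:
        internal.add(absolute)
    else:
        external.add(absolute)

print(f"{page}: {len(internal)} internal links, {len(external)} external links")
for url in sorted(internal):
    print("  ", url)
```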
Title tags and metadata are some of the most abused code on websites, and have been for as long as Google has been crawling them. Perhaps as a result, many site owners have all but forgotten the relevance and importance of title tags and metadata.
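A simple audit step is to flag missing or duplicate title tags across a set of URLs. A rough standard-library sketch (the URL list is a placeholder for your own crawl):

```python
# Sketch: find missing or duplicate <title> tags across a set of URLs.
# Standard library only; the URL list is a placeholder.
import re
import urllib.request
from collections import defaultdict

urls = [
    "https://example.com/",
    "https://example.com/blue-widgets/",
    "https://example.com/red-widgets/",
]

titles = defaultdict(list)
for url in urls:
    html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "ignore")
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.I | re.S)
    title = match.group(1).strip() if match else ""
    if not title:
        print(f"{url}: missing <title>")
    titles[title].append(url)

for title, pages in titles.items():
    if title and len(pages) > 1:
        print(f'Duplicate title "{title}" on: {", ".join(pages)}')
```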
With Google becoming more sophisticated and offering webmasters the ability to add different types of markup that display in different places, it is easy to see how schema markup can get messy.
Markup can break a website, or simply get overlooked while the focus is elsewhere, yet the correct schema markup can in effect allow you to dominate the on-screen elements of a SERP.
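As a simple illustration, structured data is usually added as a JSON-LD block in the page's head. The sketch below builds a basic schema.org Product example with Python's json module; the product details are made-up placeholders, and you can validate the output with Google's Rich Results Test:

```python
# Sketch: generate a basic JSON-LD Product block to paste into a page's <head>.
# The product details are made-up placeholders.
import json

product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Blue Widget",
    "description": "A sturdy blue widget for everyday use.",
    "sku": "BW-001",
    "offers": {
        "@type": "Offer",
        "price": "19.99",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
}

print('<script type="application/ld+json">')
print(json.dumps(product, indent=2))
print("</script>")
```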
As search engine algorithms continue to advance, so does the need for technical SEO.
If your website needs an audit, consulting or improvements, contact us directly for more help.
Couldn't agree more. Technical SEO, if done right, saves Google crawl time and storage (i.e. page load speed and caching of irrelevant pages). So if you clean this up, Google will reward you for doing so!
Agreed – thanks for commenting James!
What would be the implications of redirect chains? Does one 301 redirect pass more juice than three 301 redirects?
Google does NOT crawl a redirect if it’s in a chain of 5 or more. In other words, important pages can get lost if they’re redirected multiple times.
Great post Ryan! I’ll have to bookmark this for later.
Please do 🙂 Thanks for commenting
Awesome article Ryan. Bookmarked for our agency.
Thank you Louis!
Always a pleasure to read these posts!
Thank you Gerald!
Nice post! Particularly like the internal linking fix instructions. Will make sure my team follows this post.
Please use it 🙂
Great article Ryan!
Thank you Mark!
Thank you for sharing this, Ryan!
Love your writing style as well – condensed and straight to the point.
I hope you are having a nice time with your new endeavors 😉
Everything is Gucci Artem, thanks for the comment and the kind words!
As someone who does technical audits often, I completely agree with all of these. Great list and some even helpful to me for future use. Thanks!
Thank you Shannon! Glad we’re on the same page 🙂
Thanks for the awesome post Ryan. Bookmarked! You won't be writing on Webris anymore?
Both – I am going to be a busy man :/
Excellent article Ryan! Definitely a great reference post. Always appreciate your writing.
Thank you Bryan!
Great stuff! Adding a handful of these to my agency. Thanks Ryan
You’re welcome Travis
Thank you, nice article. I have a question: is using a keyword with variations (or, you could say, a pillar keyword) in content still effective? Some gurus are saying this technique is outdated now. What do you say about it? :)
Check out this post by Nick: https://ftf.agency/keyword-research-now/
Great stuff, and it's a coincidence that I had just discussed 11 of these 14 points with my team before I came across this article. I'd love to hear more from this website, FTF.
Fantastic article Ryan, thanks for sharing!
I've been looking for a tool like Botify; that should save a lot of time.
Talking about crawl budget, what do you use to actually tell Google "crawl these pages more often and ignore xxx"?
Appreciate it!
Nemanja
Amazing post with some great nuggets, Ryan!
Da bomb! Nice bonus with the structured data. It always confuses me how to fit content and data into a structured data schema. I'll check out that plugin suggestion.
Thank you Ken!
Great article, thanks for sharing. Good stuff in there. I'm trying Deepcrawl at the moment, since it's new to me.
Good work 🙂
Kkasper
Awesome, thanks for the suggestions. I fixed my SEO issues, though I used Website Auditor by SEO PowerSuite.
I've now fixed 90% of my errors, with more still to fix.
Thanks for the awesome guide.
Nice article as always Ryan. Great info about schema markup tools that I was not aware of.
Spot on man… great to see a guy writing to the point and without useless marketing gimmicks.
Thanks Razvan! Glad you enjoyed it
Thanks a lot Ryan for sharing this awesome guide on technical SEO issues. We're currently working on a portal where many technical things have gone wrong, especially the redirection part. I noticed most of our redirects were done with 302s, which is not good, so we changed them to 301s.
I noticed another issue: at the domain level, most developers don't use 301 redirects. They keep http://domain/index.html, https://domain/index.html, http://www.domain/index.html and https://www.domain/index.html. And the funny part is, all of them exist and none of them redirect to a single one.
I have a question for you: in Google webmaster tools, how many domain variations do you prefer to include? I added 4 versions: http://domain, https://domain, http://www.domain and https://www.domain.
You can have GSC set up on all domain variations, but you should really only use the one that runs on the proper version of the site.
Many thanks Ryan, really useful.
You’re welcome!
Solid breakdown as always Ryan. Addressing these basics and advanced techniques is game-changing for any website / business.
Great post!
Your vision of technical SEO is awesome.
I would like to add one more thing here: broken links.
They make for a bad user experience and heavily impact SEO rankings.
Thanks, Ryan.
Thanks for the great post Ryan. Everything you mentioned is spot on and well worth the time spent reading it. Thanks again.
Ryan, just stumbled upon this! Brilliant stuff. Definitely bookmarking the site 😉
You will see me more often from now on!
All the best and have a great day.
Thanks again for the great piece.
Great article, some things here have been troubling me :) Thanks
Hey Ryan,
Thank you for the info, definitely going to study this one!
Hello, I have a real quick question: is it necessary to also upload the XML sitemap to the website? And if I didn't, will it affect my website a lot? Thanks
Nice article! When auditing websites, I often find improper use of structured markup that’s riddled with warnings and errors. I love using Google’s Structured Data Testing Tool to check and correct these issues, but unfortunately, Google is deprecating their tool in favor of the rich results tester.
I talk about this in my latest post, but outside of Google’s Structured Data Testing Tool, Merkle’s Schema Tester is my second favorite for making sure my schema markup is set up properly: https://brandonlazovic.com/best-structured-data-testing-tools-for-schema-markup