SEO Only Works on Crawlable Websites. Here’s How.

Technical SEO isn't rocket science: once you understand how search engines work, your website's ranking can jump.

The first blog in our series on technical SEO covers one of the most important functions of a search engine: crawling.

We'll walk through how crawling works, the factors that affect your SEO, and some tips for speeding the whole thing up.

Note: Since about 92.4% of internet users have Google as their default search engine, this series (crawling, indexing, and ranking) focuses on Google.

What Is Web Crawling and Why Is It Important for SEO?

I knew nothing about crawling at the beginning of my SEO journey. I just couldn’t wrap my head around it, probably because most of the content I found about it was filled with complicated technical terms that I shouldn’t even need to know.

However, when I started my blog, I came to understand what turned out to be quite a simple concept (at least for us content writers).

So what is web crawling?
It is the process by which search engines discover pages on the internet so they can be indexed.
If a URL (a website, page, blog post, ...) is not crawled, it cannot be indexed, and no SEO tactic will help you.

The Crawling Process

According to Siteefy, 252,000 new websites are created every day: that’s 10,500 every hour, or 175 every minute. And that’s just websites, not counting individual blog posts and pages. The question is: how does Google discover them all?

Obviously, URLs don't appear in search results by themselves; Google must first discover them. To do that, Google sends spiders (bots) to large, well-known sites (the ones that normally appear on the first results page). The spiders collect the links on those pages and crawl them, then collect the links they find there and crawl those too. This cycle goes on and on ...
The web crawling process
Once Google collects and stores millions of URLs, it then indexes them.
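The cycle above can be sketched in a few lines of code. To be clear, this is a toy, not Googlebot: the in-memory "web" and the `fetch_links` helper are made up for the demo, standing in for real page downloads.

```python
from collections import deque

# A toy "web": each URL maps to the list of URLs it links to.
TOY_WEB = {
    "https://seed.example/": ["https://a.example/", "https://b.example/"],
    "https://a.example/": ["https://c.example/"],
    "https://b.example/": ["https://a.example/"],
    "https://c.example/": [],
}

def fetch_links(url):
    """Stand-in for downloading a page and extracting its links."""
    return TOY_WEB.get(url, [])

def crawl(seed):
    """Breadth-first crawl: discover a page, collect its links, repeat."""
    discovered = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        for link in fetch_links(url):
            if link not in discovered:  # skip URLs we've already seen
                discovered.add(link)
                queue.append(link)
    return discovered

print(sorted(crawl("https://seed.example/")))
```

Notice that a page is only discovered if some already-crawled page links to it; that's exactly why orphan pages are invisible to search engines.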

Before we go on, you might wonder how search engines decide what websites are to be “trusted”... Does Google trust every website appearing on the first results page?

It doesn't really work like that: a brand-new website ranking for a ridiculously easy keyword doesn't make it trustworthy.
Consider the following metrics when determining a website's trustworthiness:

Domain Authority (DA)

Domain authority is a metric on a scale of 1 to 100 used to predict how well a website will rank. The scale is based on the quality and quantity of links to that website.

If websites were people, DA would be their fame: how many people follow them on social media, and who are these people?

Here are some examples using Mozbar:
  • Amazon
With a domain authority of 96/100, Amazon is one of the most powerful websites on the internet.
Amazon DA

  • Jumia
Jumia's domain authority

Page Authority (PA)

Just like DA, page authority is a metric on a scale of 1 to 100 used to predict how well a single webpage will rank. It is based on the quality and quantity of both internal and external links to that webpage.

Since both metrics are logarithmic, the higher a website sits on the scale, the harder further progress becomes. So it would be much easier to take your DA from 10 to 20 than from 60 to 62.
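To see what "logarithmic" means in practice, here's a quick toy calculation. The formula DA ≈ 20·log10(links + 1) is an invented illustration, not Moz's actual algorithm; the point is only the shape of the curve.

```python
def links_needed(da):
    """Invert the toy model DA = 20 * log10(links + 1).
    Illustrative only -- not Moz's real calculation."""
    return 10 ** (da / 20) - 1

jump_low = links_needed(20) - links_needed(10)   # extra links for DA 10 -> 20
jump_high = links_needed(62) - links_needed(60)  # extra links for DA 60 -> 62

print(f"DA 10->20 needs ~{jump_low:.0f} more links")
print(f"DA 60->62 needs ~{jump_high:.0f} more links")
```

Under any logarithmic model, a tiny jump near the top of the scale costs far more links than a big jump near the bottom.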


The 4 Factors That Make SEO Dependent on Web Crawling

There are 4 major factors that influence the crawling process:

1. Crawl Budget

Crawling costs Google money, which is why they have a budget for it. This leads us to an important conclusion: there is a crawl limit.

Although the crawl budget mainly concerns huge websites with thousands of pages or blog posts, understanding it gives you insight into what will happen as your website grows.

As Google confirmed:
there is a crawl limit.

The crawl limit is determined by many factors, the main two are:
  • Crawl health: the performance of a website (page speed, mobile-friendliness, UX, …)
  • Crawl demand: the popularity of web pages and how often they’re searched for
To see all the factors, you can read Google’s Search Central blog.

2. Site Structure

A site's structure is its architecture. Good websites have well-designed architecture, whereas “inferior” ones either have a terrible structure or don’t have one at all.

Whatever structure you choose, it must meet the following requirements:
  • Logical
  • Clear
  • Easy to navigate
  • Based on assigned priorities

So if your important pages are buried “somewhere” on your website, with no internal links pointing to them, and your categories are almost absent from the blog section, that’s a problem that you should fix as soon as possible.

Here are a few structures of websites you probably know:
  • Amazon
Amazon's website structure
  • Netflix
Netflix's website structure

How to assess your site’s structure?

Several SEO crawling tools can help you assess and visualize your site’s architecture.
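You can also check one structural signal yourself: click depth, i.e. how many clicks a page is from the homepage. Once you have your internal link graph, it's a simple breadth-first search; the graph below is a made-up example.

```python
from collections import deque

# Hypothetical internal link graph: page -> pages it links to.
SITE = {
    "/": ["/blog", "/products"],
    "/blog": ["/blog/post-1"],
    "/products": ["/products/widget"],
    "/blog/post-1": ["/products/widget"],
    "/products/widget": [],
    "/orphan-page": [],  # no internal links point here
}

def click_depths(home):
    """Breadth-first search from the homepage.
    Pages with no path from the homepage get no depth at all."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for link in SITE.get(page, []):
            if link not in depths:
                depths[link] = depths[page] + 1
                queue.append(link)
    return depths

depths = click_depths("/")
for page in SITE:
    print(page, "->", depths.get(page, "unreachable (orphan!)"))
```

Important pages buried more than 3-4 clicks deep, or orphan pages with no depth at all, are exactly the structural problems mentioned above.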

3. Broken Links

Also known as 404 errors, broken links are SEO’s enemy, and thus our enemy.

First, how do broken links crop up on our sites?

It’s not a coincidence; these errors appear for various reasons, such as:
  • Changing permalinks without setting up a redirect
  • An external site you link to moving or shutting down
  • Moving or deleting linked content (PDFs, videos, …)
  • Broken elements within the page’s code
  • A firewall restricting outside access

If your site has 404 errors, they won’t only hurt your SEO score; they’ll also annoy your most loyal visitors.
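If you like the do-it-yourself route, a basic broken-link checker needs only two pieces: extract the links from a page, then test each one's HTTP status. Here's a minimal sketch using just the Python standard library; real checkers (and the tools below) do much more.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen
from urllib.error import HTTPError, URLError

class LinkExtractor(HTMLParser):
    """Collect every href found in <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links

def is_broken(url, timeout=10):
    """Return True if the URL answers with a 4xx/5xx or is unreachable."""
    try:
        req = Request(url, method="HEAD",
                      headers={"User-Agent": "toy-link-checker"})
        urlopen(req, timeout=timeout)
        return False
    except HTTPError as e:
        return e.code >= 400
    except URLError:
        return True

# Example: parse a page's HTML and list the candidate links to check.
sample = '<p><a href="https://example.com/">home</a> <a href="/old-post">old</a></p>'
print(extract_links(sample))
```

Relative links like `/old-post` would need to be resolved against the page's URL (e.g. with `urllib.parse.urljoin`) before checking them.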

SEOPressor did a Google poll on what annoys people the most when visiting a website, and here’s the result:
Broken links annoy users

How to assess broken links?

You can find broken links by using one of my favorite free SEO tools: Semrush

First, go to the Management section, open Projects, and choose your project (in this example, I’ll show you The Marketing Recipe’s dashboard)
How to assess broken links SEMRush


Go to the Site audit
SEMrush broken links

Click on Issues
SEMrush broken links

And if you have any internal broken links, they’ll appear. In our case, we have none :)
SEMrush broken links

Note: you need to upgrade to see external broken links.

How to fix broken links?

There are mainly 3 solutions for 404 errors:
  1. Removing the broken link
  2. Automatically redirecting users to a relevant page
  3. Creating a custom 404 page: a page that guides lost visitors back to your content gives them a good user experience and lets bots keep crawling your site.

Here’s an example of a 404 page on Disney’s site
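For option 2, the redirect should be a 301 (permanent), so that both users and bots are sent to the new URL and link equity is passed along. On an Apache server, for example, a single line in your `.htaccess` file does it (the paths here are placeholders):

```apache
# Permanently redirect the dead URL to a relevant live page
Redirect 301 /old-broken-page /new-relevant-page
```

Other platforms (Nginx, WordPress redirect plugins, etc.) offer the same thing with their own syntax.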

4. Non-Crawlable Content

Rumors about JavaScript have been circulating in the SEO community for the last decade. Although Google has denied that using JavaScript hurts your site's crawlability, in practice it still seems that it can.

The majority of SEO specialists, as well as most tools, recommend keeping JavaScript to a minimum. You can use Screaming Frog to identify pages that rely on it and determine whether it can be reduced.
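You can see the problem for yourself with a quick test: parse the raw HTML of a JavaScript-rendered page and check how much text exists before any script runs. The page below is made up for the demo; the parser mimics what a non-rendering crawler would see.

```python
from html.parser import HTMLParser

# Raw HTML of a hypothetical JavaScript-rendered page: the visible
# content only exists after the script runs in a browser.
RAW_HTML = """
<html><body>
  <div id="app"></div>
  <script>
    document.getElementById("app").textContent = "All my product descriptions!";
  </script>
</body></html>
"""

class TextExtractor(HTMLParser):
    """Collect the text a plain (non-rendering) crawler would see,
    skipping the contents of <script> tags."""
    def __init__(self):
        super().__init__()
        self.in_script = False
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if not self.in_script and data.strip():
            self.text.append(data.strip())

parser = TextExtractor()
parser.feed(RAW_HTML)
print("Text visible without rendering JavaScript:", parser.text)
```

The extracted text is empty: everything this page "says" lives inside the script, so a crawler that doesn't (or doesn't yet) render JavaScript sees nothing worth indexing.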

How to Get Google to Crawl Your Website Faster

Waiting for Google bots to randomly stumble upon a dofollow link to your brand-new website can take forever (unless you’ve got your own trusted sites). To speed up the crawling process, you can do 2 things:

1. Submit Your Sitemap to Google Search Console

A sitemap is a list of all your website pages. It comes in various formats:
  • XML
  • RSS / ATOM File
  • Plain Text file
You can usually find yours at yourdomain.com/sitemap.xml.

If you happen to be on a Blogspot subdomain, you can use an XML sitemap generator.

Once you have your sitemap ready, submit it to Google Search Console. After that, each time you add a new page, Google will automatically detect the new URLs in your sitemap and crawl them.
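If your platform doesn't generate a sitemap for you, a minimal XML sitemap is simple enough to build yourself. The URLs below are placeholders; the namespace is the standard one from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Placeholder URLs -- swap in your real pages.
PAGES = [
    "https://www.example.com/",
    "https://www.example.com/blog/first-post",
    "https://www.example.com/about",
]

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
urlset = ET.Element("urlset", xmlns=NS)
for page in PAGES:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = page

sitemap = ET.tostring(urlset, encoding="unicode")
print(sitemap)
# To save it with a proper XML declaration:
# ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8",
#                              xml_declaration=True)
```

The resulting file is what you'd upload to your site's root and submit in Google Search Console.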

2. Get Backlinks from Trusted Websites

Another way to speed up the crawling process (and your long-awaited SEO results) is to get quality backlinks. For a new site this is quite difficult, but if you manage to earn a few quality backlinks, your chances of appearing on those first results pages improve dramatically!

Takeaways: Website Crawling & SEO

Let’s sum up the most important points:

  • Technical SEO is not rocket science (I had to say it again)
  • Google does not automatically detect new content
  • There are 4 factors to mind when it comes to web crawling: crawl budget, site structure, broken links, and non-crawlable content.
  • To speed up the process, you can either submit your sitemap to Google Search Console (which you should do anyway) or earn as many quality backlinks as you can.

Yasmine Jedidi

If I'm not writing, I'm drinking tea! Apart from being an introverted tea lover, I am also an SEO content writer✍️, a freelancer, and a BBA student. It is my humble intention to use this blog to share my knowledge and experience in the field of marketing and SEO. Ever since I started The Marketing Recipe, it has turned into my secret addiction. Without skipping a beat, I continually think of ways to enhance your knowledge and benefit you. Your feedback on our content is greatly appreciated, so don't be shy, drop a comment and I'll make sure to answer you.
