Technical SEO isn't rocket science. With a good understanding of how search engines work, your website's ranking can climb significantly.
The first blog in our series on technical SEO covers one of the most important functions of a search engine: crawling.
In addition to covering its process, we'll discuss the factors that affect
your SEO, along with some tips for speeding up the process.
Note: Since about 92.4% of internet users have Google as their default search engine, this series (crawling, indexing, and ranking) is going to focus on this search engine.
What Is Web Crawling and Why Is It Important for SEO?
I knew nothing about crawling at the beginning of my SEO journey. I just couldn’t wrap my head around it, probably because most of the content I found was filled with complicated technical terms that I shouldn’t even have to know.
However, when I started my blog, I came to understand what turned out to be quite a simple concept (at least for us content writers).
So what is web crawling?
It is the process search engines use to discover pages on the internet so they can later be indexed.
If a URL (website/page/blog post/...) is not crawled, it cannot be indexed, and thus no SEO tactic will help you.
The Crawling Process
According to Siteefy, 252,000 new websites are created every day: 10,500 every hour and 175 every minute. And that's just websites, not counting individual blog posts and pages. The question is: how does Google discover them?

Obviously, URLs don't appear in search results just by themselves. Google must first discover them. To do that, Google sends spiders (bots) to large, well-known sites (those that normally appear on the first search results page). Once they have collected the links on these pages, the spiders proceed to crawl them. There they collect linked pages and crawl those too. This cycle goes on and on ...
The web crawling process
Once Google collects and stores millions of URLs, it then indexes them.
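The discover-collect-crawl cycle described above is essentially a breadth-first traversal of the web's link graph. Here's a minimal sketch in Python, using a made-up link graph (the URLs and graph are hypothetical; a real crawler like Googlebot is vastly more complex):

```python
from collections import deque

# Hypothetical link graph: each URL maps to the links found on that page.
LINK_GRAPH = {
    "bigsite.com": ["bigsite.com/a", "smallblog.com"],
    "bigsite.com/a": ["smallblog.com/post"],
    "smallblog.com": ["smallblog.com/post"],
    "smallblog.com/post": [],
}

def crawl(seeds):
    """Discover URLs the way the article describes: start from large,
    well-known sites, collect their links, then crawl those in turn."""
    discovered = set(seeds)
    frontier = deque(seeds)
    while frontier:
        url = frontier.popleft()
        for link in LINK_GRAPH.get(url, []):
            if link not in discovered:
                discovered.add(link)
                frontier.append(link)
    return discovered

print(sorted(crawl(["bigsite.com"])))
```

Starting from the single seed `bigsite.com`, the bot ends up discovering every page in this toy graph, which is exactly why links from already-crawled sites matter so much for discovery.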
Before we go on, you might wonder how search engines decide what websites are to be “trusted”... Does Google trust every website appearing on the first results page?
It doesn't really work like that. A brand-new website ranking for a ridiculously easy keyword doesn't make it trustworthy.
Consider the following metrics when determining a website's trustworthiness:
Domain Authority (DA)
Domain authority is a metric on a scale of 1 to 100 used to predict how well a website will rank. The scale is based on the quality and quantity of links to that website.

If websites were people, DA would be their fame: how many people follow them on social media, and who are these people?
Here are some examples using Mozbar:
- Amazon
With a domain authority of 96/100, Amazon is one of the most powerful
websites on the internet.
Page Authority (PA)
Just like DA, page authority is a metric on a scale of 1 to 100 used to predict how well a webpage will rank. The scale is based on the quality and quantity of both internal and external links to that webpage.

Since both metrics are logarithmic, the higher a website gets on the scale, the harder it is to progress. So it would be much easier to take your DA from 10 to 20 than from 60 to 62.

There are 4 major factors that influence the crawling process:
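To see what a logarithmic scale means in practice, here's a toy calculation. Moz's actual formula is proprietary, so the model below (DA growing with the logarithm of the number of linking domains, with an arbitrary scale factor) is purely illustrative:

```python
def links_needed(da, scale=20):
    """Toy model (NOT Moz's real, proprietary formula): assume DA grows
    logarithmically with the number of linking domains, so the number of
    links needed grows exponentially with DA."""
    return 10 ** (da / scale)

# Going from DA 10 to 20 takes far fewer new linking domains than 60 to 62:
jump_low = links_needed(20) - links_needed(10)    # ≈ 6.8 new links
jump_high = links_needed(62) - links_needed(60)   # ≈ 258.9 new links
```

Under this (hypothetical) model, a 10-point jump at the bottom of the scale costs a handful of links, while a mere 2-point jump near the top costs hundreds.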
1. Crawl Budget
Crawling costs Google money, which is why they have a budget for it. This leads us to an important conclusion: there is a crawl limit.

Although the crawl budget usually concerns huge websites with thousands of pages or blog posts, understanding it also gives you insight into what happens as your website grows.
As Google has confirmed, the crawl limit is determined by many factors; the two main ones are:
- Crawl health: the performance of a website (page speed, mobile-friendliness, UX, …)
- Crawl demand: the popularity of web pages and how often they’re searched for
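A rough way to picture these two factors together: a crawler with a fixed budget will favor pages that score well on both health and demand. The scoring and page data below are a made-up illustration, not Google's actual algorithm:

```python
def plan_crawl(pages, budget):
    """Toy crawl scheduler: spend a limited crawl budget on the pages with
    the best combination of crawl health and crawl demand.
    The multiplicative scoring is illustrative, not Google's real method."""
    ranked = sorted(pages, key=lambda p: p["health"] * p["demand"], reverse=True)
    return [p["url"] for p in ranked[:budget]]

# Hypothetical pages with health/demand scores between 0 and 1.
pages = [
    {"url": "/popular-post", "health": 0.9, "demand": 0.8},
    {"url": "/slow-page",    "health": 0.3, "demand": 0.7},
    {"url": "/old-archive",  "health": 0.8, "demand": 0.1},
]
print(plan_crawl(pages, budget=2))  # ['/popular-post', '/slow-page']
```

Notice that `/old-archive` is perfectly healthy but loses its slot to a less healthy page that people actually search for: demand matters as much as performance.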
2. Site Structure
A site's structure is its architecture: how its pages are organized and linked together. Good websites have well-designed architecture, whereas “inferior” ones either have a terrible structure or none at all.

Whatever structure you choose, it must meet the following requirements:

- Logical
- Clear
- Easy to navigate
- Based on assigned priorities
So if your important pages are buried “somewhere” on your website, with no internal links pointing to them, and your categories are almost absent from the blog section, that’s a problem that you should fix as soon as possible.
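You can spot buried and orphaned pages yourself with a breadth-first walk of your internal links. This sketch uses a hypothetical mini-site; in practice you'd feed it link data exported from a crawler:

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to internally.
site = {
    "/": ["/blog", "/about"],
    "/blog": ["/blog/post-1"],
    "/about": [],
    "/blog/post-1": [],
}
# The full page inventory; note /important-page has no inbound links.
all_pages = ["/", "/blog", "/about", "/blog/post-1", "/important-page"]

def click_depths(links, home="/"):
    """Breadth-first walk from the homepage: how many clicks away is each page?"""
    depths = {home: 0}
    frontier = deque([home])
    while frontier:
        page = frontier.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                frontier.append(target)
    return depths

depths = click_depths(site)
# Pages never reached have no internal links pointing to them: orphans.
orphans = [p for p in all_pages if p not in depths]
print(depths, orphans)
```

Here `/important-page` never shows up in the walk, which is precisely the "buried somewhere with no internal links" problem described above.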
Here are a few structures of websites you probably know:
- Amazon
- Netflix
How to assess your site’s structure?

Several tools can help you assess your site’s architecture, like:
- Screaming Frog
- Ahrefs
- SEMrush
- More free SEO tools
3. Broken Links
Also known as 404 errors, broken links are SEO’s enemy, and thus our enemy.

First, how do broken links crop up on our sites?
It’s not a coincidence; these errors appear for various reasons, such as:
- Changing permalinks without setting up a redirect
- An external site being moved or taken down
- Moving or deleting linked content (PDFs, videos, …)
- Broken elements within the page’s code
- Firewall restrictions blocking outside access
If your site has 404 errors, they won’t only hurt your SEO score; they’ll also annoy your most loyal visitors.
SEOPressor did a Google poll on what annoys people the most when visiting a website, and here’s the result:
How to assess broken links?
You can find broken links by using one of my favorite free SEO tools: Semrush
First, go to the management section, open Projects, and choose your project (in this example, I’ll use The Marketing Recipe’s dashboard).
Then open the Site Audit.
Click on Issues.
And if you have any internal broken links, they’ll appear. In our case, we have none :)
NOTE: you need to upgrade to see external broken links.
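If you'd rather check programmatically, the core of a broken-internal-link audit is simple: compare every internal link against the set of pages that actually exist. The page list and links below are hypothetical stand-ins for a real crawl of your site:

```python
def find_broken_links(existing_pages, outgoing_links):
    """Report internal links that point at pages which no longer exist.
    Returns (source_page, broken_target) pairs."""
    existing = set(existing_pages)
    broken = []
    for page, links in outgoing_links.items():
        for link in links:
            if link not in existing:
                broken.append((page, link))
    return broken

# Hypothetical crawl data for a small site.
pages = ["/", "/blog", "/blog/new-post"]
links = {
    "/": ["/blog"],
    "/blog": ["/blog/new-post", "/blog/old-post"],  # old-post was deleted
}
print(find_broken_links(pages, links))  # [('/blog', '/blog/old-post')]
```

This is exactly what tools like Semrush do at scale, with the added step of actually fetching each URL and checking its HTTP status code.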
How to fix broken links?
There are mainly 3 solutions for 404 errors:
- Removing the broken link
- Redirecting users automatically to a relevant page (a 301 redirect)
- Creating a custom 404 page: a page that points users in the right direction gives them a good user experience and allows bots to keep crawling your site.
Here’s an example of a 404 page on Disney’s site
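The last two fixes can be sketched as a tiny redirect map, the kind you'd wire into your server or CMS. The paths here are hypothetical, and a real server would of course serve existing pages before falling back to a 404:

```python
# Hypothetical redirect map: old permalinks -> their new homes.
REDIRECTS = {
    "/old-post": "/blog/new-post",
    "/2020/recipe": "/blog/recipe",
}

def resolve(path):
    """Return (status, location): a 301 redirect if the path was moved,
    otherwise a 404 that serves a custom error page."""
    if path in REDIRECTS:
        return 301, REDIRECTS[path]
    return 404, "/custom-404-page"

print(resolve("/old-post"))   # (301, '/blog/new-post')
print(resolve("/gone"))       # (404, '/custom-404-page')
```

A 301 tells both visitors and bots the move is permanent, so link equity from the old URL is passed along instead of being lost.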
4. Non-Crawlable Content
There have been rumors about JavaScript circulating in the SEO community for the last decade. Although Google has denied that using JavaScript affects your site's crawlability, it seems as if it could.
The majority of SEO specialists, as well as most tools, recommend avoiding
JavaScript as much as possible. You can use ScreamingFrog to identify
pages that contain it and determine if it's possible to reduce it.
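One rough heuristic you can script yourself (an assumption on my part, not an official Google rule): if a page's HTML contains far more script than visible text, its content is probably rendered client-side and may crawl poorly.

```python
from html.parser import HTMLParser

class TextVsScript(HTMLParser):
    """Count characters of visible text vs. characters inside <script> tags."""
    def __init__(self):
        super().__init__()
        self.in_script = False
        self.text_chars = 0
        self.script_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.script_chars += len(data.strip())
        else:
            self.text_chars += len(data.strip())

def looks_js_rendered(html):
    """Heuristic flag: more script than text suggests JS-rendered content."""
    parser = TextVsScript()
    parser.feed(html)
    return parser.text_chars < parser.script_chars

static_page = "<html><body><p>Lots of readable article text here.</p></body></html>"
js_page = "<html><body><div id='app'></div><script>renderApp(bigBundle)</script></body></html>"
```

Pages flagged this way are the ones worth double-checking in Google Search Console's URL Inspection tool to see what Googlebot actually renders.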
How to Get Google to Crawl Your Website Faster
Waiting for Google bots to randomly stumble upon a dofollow link to your brand-new website can take forever (unless you’ve got your own trusted sites). To speed up the crawling process, you can do 2 things:

1. Submit Your Sitemap to Google Search Console
A sitemap is a list of all your website pages. It comes in various formats:

- XML
- RSS / Atom feed
- Plain text file

If you happen to be on a Blogspot subdomain, you can use an XML sitemap generator.

Once you have your sitemap ready, you need to submit it to Google Search Console. Once set up, each time you add another page, the search engine will automatically detect new URLs and crawl them.
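Generating a minimal XML sitemap yourself takes only a few lines; the format is defined by the sitemaps.org protocol. The URLs below are placeholders:

```python
from xml.etree import ElementTree as ET

def build_sitemap(urls):
    """Build a minimal XML sitemap following the sitemaps.org protocol:
    a <urlset> root containing one <url><loc>...</loc></url> per page."""
    ns = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=ns)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

# example.com is a placeholder domain.
sitemap = build_sitemap([
    "https://example.com/",
    "https://example.com/blog",
])
print(sitemap)
```

Save the output as `sitemap.xml` at your site's root and submit that URL in Google Search Console under Sitemaps.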
2. Get Backlinks from Trusted Websites
Another way to speed up the crawling process (and those awaited SEO results) is to get quality backlinks. For a new site, this method is quite difficult, but if you manage to earn a few quality backlinks, you will certainly appear on those first results pages!

Takeaways: Website Crawling & SEO
Let’s sum up the most important points:

- Technical SEO is not rocket science (I had to say it again)
- Google does not automatically detect new content
- There are 4 factors to mind when speaking about web crawling: crawl budget, site structure, broken links, and non-crawlable content.
- To speed up the process, you can submit your sitemap to Google Search Console (which you should) and get as many quality backlinks as you can.