Log File Analysis for Better SEO

Who is Omi Sido?

Omi Sido (https://omisido.com/) is a seasoned international speaker, known in the industry for his humour and his ability to deliver actionable insights that audiences can start using immediately.

From SEO consulting with some of the world’s largest telecommunications and travel companies to managing in-house SEO at Daily Mail and Canon, Omi loves diving into complex data and finding the bright spots.

Slides: https://www.slideshare.net/omisido/log-file-analysis-for-better-seo

Transcript:

Hi, guys. Thank you for coming here today. My name is Omi Sido and as most of you know I am very active online and I love talking about Digital Marketing and SEO.

Server logs analysis

Today I’m gonna be talking about Server logs analysis and how understanding what the Googlebot does and what the Googlebot sees on your website can massively boost your digital marketing efforts.

I’m gonna start with a short story. A few months ago, and this is a real story by the way, a guy came to me and asked me to teach him the basics of SEO so he could collaborate with his SEO team. Of course, I said yes. So I gave him some books and some blogs to read and we agreed to meet again in two weeks. Hi, David. So in two weeks we went to Costa and of course, I asked him about his SEO knowledge. His answer: content is king, and currently he is reading a book about content creation and semantic search.

While he was talking I wrote a short poem, an SEO poem, on a piece of paper, and after he paused I simply asked him: Can you go and upload this poem to the Internet? He was confused, obviously: Omi, what do you mean? And I was like: Well, you’ve just told me that content is king. So upload this poem to the Internet and make me famous. I wanna be the King of the Internet. Of course, confused again, he said something like “But Omi, I need a website”. End of story.

A lot of people come to me, a lot of companies, and tell me: Omi, our website is beautiful. Our pictures are amazing. We publish content on a regular basis but somehow we are not ranking well. Why?

I’ve said it before and I will say it again: Yes, content is King. But every King needs a castle, a home to live in. Technical SEO is this castle and by analysing your server logs you basically know whether you’ve got a solid structure or not.

If every time the Googlebot comes to your website it can’t understand the structure of your website, if every time the Googlebot comes to your website it ignores strategic sections of your website, your content counts for nothing. Shall I repeat this one? Your content counts for nothing.

Sorry, I can see a lot of people tagging me.

The only way for the Googlebot to understand what’s on your website is to literally come and crawl all your pages. Now some of you may say ‘But what is a log file?’ This is a log file. I know it’s a bit confusing but in all honesty, you don’t have to be very technical, unless you wanna be an SEO geek, to benefit from the data, from the information, from the wisdom coming from the server logs.
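For readers without the slide in front of them, here is a minimal sketch of what one log entry looks like and how it can be read. It assumes the server writes the common "combined" access log format; the sample line, the URL in it and the helper names `parse_line` and `is_googlebot` are made up for illustration and are not taken from the talk.

```python
import re

# Assumed layout (combined log format):
# ip identity user [timestamp] "METHOD url PROTOCOL" status bytes "referer" "user agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

def is_googlebot(hit):
    # User-agent strings can be spoofed; a real audit would also verify
    # the requesting IP (for example with a reverse DNS lookup).
    return "Googlebot" in hit["agent"]

# Hypothetical sample line, invented for this example.
sample = (
    '66.249.66.1 - - [12/Mar/2019:06:25:24 +0000] '
    '"GET /products/camera-bag HTTP/1.1" 200 5123 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)

hit = parse_line(sample)
print(hit["url"], hit["status"], is_googlebot(hit))
# prints: /products/camera-bag 200 True
```

Each line answers three questions: who asked (the user agent), what they asked for (the URL) and what the server answered (the status code). Everything later in the talk is counting over those three fields.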

So to make all this very simple I will tell you another story.

Six months ago a company came to me and asked me to analyse their website and give them some recommendations for improving their rankings.
Obviously, I started crawling their website and the first thing I saw was half a million pages but only roughly 20 000 of them active pages. You all know this website. I’m not gonna mention the name because of the NDA but you’ve been there at least once in the last month.
Half a million pages but only twenty thousand of them active pages. By active pages, you know what I mean: pages that get organic traffic.
So I started analysing their website, I started crawling it and of course analysing the server logs, and this is what I see. OnCrawl.

154 000 orphan pages. Yes, 154 000 orphan pages. Some websites don’t even have 154 000 pages. Out of 154 000 pages only 3 000 of them were active pages.

Now, what are orphan pages?

Chelsea: Omi, what’s an active page? How do you qualify that?
Omi: A page getting organic visits.
3 000, remember this number.

So what are orphan pages?

Orphan pages are pages that are not linked from anywhere in your website structure.

This is the definition you see online.

What is my definition for orphan pages? Omi Sido’s definition for orphan pages. Stop hurting your SEO. Please.

How do we find orphan pages?

The only way to find orphan pages is to fully crawl a website, take all the log file data, combine the two and analyse them. In this case, out of 154 000 pages only 3 000 were active pages. I had no choice but to literally delete all inactive pages. And I know it sounds a little bit harsh.
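The combining step can be sketched roughly like this. It is a simplified illustration, assuming you already have one set of URLs from the site crawl, one set of URLs seen in the server logs and a dictionary of organic visits per URL; the function names and the sample URLs are hypothetical, and tools such as OnCrawl do this at a much larger scale.

```python
def find_orphans(crawled_urls, logged_urls):
    """URLs that show up in the logs but were never reached by following internal links."""
    return set(logged_urls) - set(crawled_urls)

def split_by_activity(urls, organic_visits):
    """Split URLs into active (at least one organic visit) and inactive."""
    active = {u for u in urls if organic_visits.get(u, 0) > 0}
    return active, set(urls) - active

# Hypothetical data, invented for this example.
crawled = {"/", "/cameras", "/cameras/eos"}
logged = {"/", "/cameras", "/old-promo", "/test-page", "/expired-lens"}
visits = {"/": 120, "/cameras": 40, "/old-promo": 7}

orphans = find_orphans(crawled, logged)
active, inactive = split_by_activity(orphans, visits)

print(sorted(orphans))   # ['/expired-lens', '/old-promo', '/test-page']
print(sorted(active))    # ['/old-promo']
print(sorted(inactive))  # ['/expired-lens', '/test-page']
```

The inactive orphans are the candidates for deletion; the active ones are the candidates for internal links.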
Then I continued analysing this website and I found that the bot was spending a lot of time, literally stuck, in a section full of non-compliant pages instead of crawling the sections with compliant pages. Remember what I said earlier about the strategic crawling of your website. I had no choice but to delete another big chunk of this website.

And then there is what I call duplicate URL crawling. By analysing this website I realised that the bot was spending a lot of its resources crawling pages with parameters even though they were properly canonicalized.
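One rough way to put a number on this is to count how many of the bot’s requests went to URLs carrying a query string. The sketch below assumes you already have the list of URLs requested by the Googlebot from the logs; the function name `parameter_crawl_share` and the sample URLs are hypothetical.

```python
from urllib.parse import urlsplit

def parameter_crawl_share(bot_urls):
    """Fraction of bot requests that hit a URL with query parameters."""
    if not bot_urls:
        return 0.0
    with_params = sum(1 for url in bot_urls if urlsplit(url).query)
    return with_params / len(bot_urls)

# Hypothetical bot requests, invented for this example.
bot_hits = ["/shoes", "/shoes?sort=price", "/shoes?sort=price&page=2", "/bags"]

share = parameter_crawl_share(bot_hits)
print(f"{share:.0%}")  # prints: 50%
```

A high share on pages that are already canonicalized elsewhere suggests the bot is spending requests on duplicates rather than on the pages you want in the index.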

We had to literally reshuffle the whole navigation and stuff like that.
And now have a look at this picture. Yeah, this graph.
Pages crawled and not crawled by depth against SEO visits distribution by depth.

As I told you earlier you don’t have to be very technical to understand the importance of analysing your log files.
Have a look at this section of the graph. Only 49% are crawled. Yet, this section gives the most SEO organic visits.
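The comparison behind a graph like this can be sketched in a few lines. This is only an illustration, assuming each page record carries its click depth, whether the bot crawled it and how many SEO visits it received; the field names, the function `crawl_report` and the numbers are invented and are not the client’s data.

```python
from collections import defaultdict

def crawl_report(pages):
    """Per depth: share of pages crawled by the bot and share of all SEO visits."""
    totals = defaultdict(lambda: {"pages": 0, "crawled": 0, "visits": 0})
    for page in pages:
        row = totals[page["depth"]]
        row["pages"] += 1
        row["crawled"] += 1 if page["crawled"] else 0
        row["visits"] += page["visits"]
    # Avoid dividing by zero when no page has any visits.
    all_visits = sum(r["visits"] for r in totals.values()) or 1
    return {
        depth: {
            "crawl_ratio": r["crawled"] / r["pages"],
            "visit_share": r["visits"] / all_visits,
        }
        for depth, r in sorted(totals.items())
    }

# Hypothetical page records, invented for this example.
pages = [
    {"depth": 1, "crawled": True, "visits": 10},
    {"depth": 2, "crawled": True, "visits": 30},
    {"depth": 2, "crawled": False, "visits": 0},
    {"depth": 3, "crawled": True, "visits": 60},
    {"depth": 3, "crawled": False, "visits": 0},
]

report = crawl_report(pages)
for depth, row in report.items():
    print(depth, f"{row['crawl_ratio']:.0%} crawled", f"{row['visit_share']:.0%} of visits")
```

A depth with a low crawl ratio and a high visit share is the kind of mismatch the graph makes visible: the pages earn their keep, yet the bot sees only part of them.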

By the way, I don’t like calling them organic. For me, they are just SEO visits.

But anyway.
Very strange: for this website page depth five is giving more organic visits than those two. So I had a lot of conversations. Basically the idea was: what’s gonna happen if I literally delete this group? What’s gonna happen if I force the bot to crawl this section more often and index more pages? What’s gonna happen if I actually combine those two without deleting this one, or I combine those two? Notice this is only 19%. I hope you can see it from far. Nineteen percent.

You have to really think how you wanna spend your Crawl Budget.

In all honesty, we deleted more than 60% of this website. It went to the bin.

Audience member: Just for clarity. These 60%, is any of this unique content, stuff like that? You are not suggesting

Omi: No, I am not suggesting that. Normally orphan pages that are not visited by anybody, even the bots, are.

So ok. So to explain. Let’s go back a little bit to orphan pages. Normally those are development mistakes, expired product pages. Do we agree on this one? Ok, thanks.

So we’ve deleted. It’s ok. No, no of course yeah. The point I am trying to make and thank you very much because many people don’t actually know what Orphan pages are. You are absolutely right.

We’ve deleted more than 60%. More than 60% of this website went to the bin. Yet, six months down the line this client sells more products than ever. I didn’t say visits, I said money. They sell more products than ever. Now the bot is allowed to crawl the good pages more often, resulting in more pages present in the SERPs and in better positions.

Some of you may say ‘Omi, this is a big website. They literally had room for deleting pages’ and in fact, I have a lot of clients coming to me telling me ‘Omi, I don’t care about analysing log files because I only have 10-20 thousand pages’. So let me give you a quick example of a relatively small website.

This website was about to be migrated 6-7 months ago and I was asked to analyse it.

22 000 pages in the structure. 8 000 orphan pages. 8 thousand. After finding this one they nearly fired the whole Digital Marketing team. Thank God they didn’t. I’ve got more followers on LinkedIn.
23% are bringing only 3% of organic visits. Literally. Three percent of organic visits. Is it worth keeping those pages?
On the other hand, in the previous example (sorry, I can’t find it now) the orphan pages were actually bringing 37% of organic visits. Why are you not linking to them?
Link to them internally: first, so your customers can find them when they come to your website, and second, so you can improve their SEO value and they bring even more visits in the future.

Guys, I hope I gave you, I’ve given you a good idea of how to.

Girl: I have a question.

Omi: Of course.

Girl: Have you done any analysis of incoming backlinks to those pages that you.

Omi: Yes, you have to do that. You have to do that.

Girl: How does that all feed into the whole process?

Omi: With the example I gave you there were literally no backlinks. But you have to do that, you know.

Guys, I hope I’ve given you enough information but by all means, ask questions.