SEO Testing 101 with Will Critchlow

Topics: seotesting

48 min well spent

Will Critchlow, founder and CEO of SearchPilot and organizer of SearchLove, talks about the ins and outs of SEO testing. We also cover how SEO has changed over the last 15 years and how curiosity is the key to lasting a long time in the game.

Timestamps

  • 0:00 Introduction
  • 0:50 How SearchPilot was born
  • 3:13 The basic steps of SEO experiments
  • 6:18 How to pick the right sample size and understand statistical significance
  • 9:04 The limitations of SEO testing
  • 13:21 How SearchPilot creates such great case studies
  • 20:53 The challenges of flawed SEO data
  • 23:15 How exactly Will + Team set SEO experiments up
  • 26:27 The biggest mistake people make with SEO experiments
  • 31:11 Finding good SEO experiment ideas
  • 34:16 How SEO has changed over the last 15 years
  • 40:12 A change of the Google guard
  • 45:54 What keeps Will in the game

Transcript

Will: [00:00:00] We got pretty geeky there didn’t we, hopefully the audience is up to, uh,

[00:00:03] Kevin: [00:00:03] that’s exactly what I want. The white dog is exactly what I want. Hey friends, welcome back to another episode of the Tech Bound podcast. In this one I speak to Will Critchlow, one of the most popular and prolific people in SEO. Will has been doing SEO for 15 years.

[00:00:17] He founded Distilled in 2005, is the organizer of SearchLove, and is one of the people who have been doing SEO for the longest time. In this episode, Will Critchlow and I speak about anything related to SEO A/B testing, testing frameworks, and what it means to be 15 years in the SEO industry. So make sure you listen to the end.

[00:00:37] Give me a thumbs up on YouTube and a five star rating wherever you listen to podcasts. Yeah.

[00:00:45] Three, two, one. Will, welcome to the show.

[00:00:50] Will: [00:00:50] Thanks for having me. Great to be here.

[00:00:52] Kevin: [00:00:52] It’s all my pleasure. Um, I want to start with the question about SearchPilot. Uh, how was SearchPilot born?

[00:01:00] Will: [00:01:00] So SearchPilot got started originally as a part of R&D at Distilled. So Distilled was the company that I founded with Duncan Morris in 2005.

[00:01:08] So, an SEO agency initially. And then in the last few years we’ve been building, uh, an R&D capability. And out of that group came what we originally called the Distilled ODN, the Optimization Delivery Network. And, yes, it started as an R&D project. But then, you know: can we do this? Is it valuable?

[00:01:27] Is it a good thing? And it became a business unit. So we ended up with it having its own P&L internally, uh, within Distilled, and its own revenues, its own customers. And then earlier this year, so in January 2020, we spun that business unit out as its own independent company and in the process rebranded it as SearchPilot. And, uh, as part of that transaction, um, we actually sold the rest of the business,

[00:01:54] so the consulting and conferences business, to Brainlabs.

[00:01:58] Kevin: [00:01:58] So experiments have become absolutely crucial nowadays, I would say, to understand really how SEO works, because rankings have become so much more fluid. Everything is so much more, um, uh, customized to the actual vertical you operate in, or the keyword.

[00:02:13] I think Google gets really good at not applying the same ranking signals to the same degree to every kind of keyword and ranking or category. So SearchPilot basically is a solution for you to run your own SEO experiments in-house. Is that fair to say?

[00:02:28] Will: [00:02:28] Yes, that’s right. We, we, we saw that some of the biggest tech companies in the world were building this capability themselves.

[00:02:34] And what we’ve tried to do is build a platform, uh, you know, a more mass-market approach, so that more companies can do this and many more websites can run their own, uh, their own tests and their own experiments. And we also help them do that. So in some cases it’s kind of self-service.

[00:02:50] So, you know, the advanced in-house SEO teams at some of our customers are running their own tests completely. And then we also have a professional services team at our end, um, building on our years of consulting experience. Um, we have folks, uh, specialists in-house at our end, who are helping some of our customers, uh, run tests,

[00:03:09] and in some cases actually just building and designing tests for them.

[00:03:13] Kevin: [00:03:13] Can you walk us through the very basic steps of how to think about SEO experiments and how to set them up?

[00:03:20] Will: [00:03:20] So I think a lot of people are probably familiar with the idea of testing, A/B testing, as it applies in conversion rate optimization or user experience testing, where we take the audience and separate the audience into, into groups and show some people one version and some people a different version and see how they convert and so forth.

[00:03:38] The idea of SEO A/B testing is somewhat similar, uh, and certainly relies on some of the same statistics, uh, and kind of mathematical approach. But the key insight we have to realize is that there’s essentially, uh, you know, if we’re testing for Google’s preferences, only one, uh, Google. And so we can’t separate our audience in the same way.

[00:03:57] So what we do is we separate pages. So instead of separating the audience and making changes, we separate a group of similar pages. And so the kind of testing we run, the kind of SEO A/B testing we run, typically runs on a site section rather than just on an individual page. And so we take a site section of similar pages, think, for example, a group of product pages on an e-commerce

[00:04:17] website, which is a classic example, and we, um, we make changes to some of the pages and keep other pages unchanged, and those changes are for everyone. So there is no cloaking; this is not being shown only to Google or any of these kinds of things. Um, these changes are made to some pages, and the others are kept as the control.

[00:04:36] And then we compare the performance of those, uh, of those pages with some statistical techniques, looking at their history and all those kinds of things. And that takes account of all kinds of other confounding variables, like, uh, seasonality, competitor changes, Google algorithm updates, you know, all of these other things, and enables us to get to a scientific, uh, evaluation of whether any change in performance is a result of the changes that we made

[00:05:03] to those pages. And that’s kind of the basic fundamentals. We then get into more advanced details, where we start thinking about things like full funnel testing, which is where we consider, um, not only the SEO performance of those pages, but also the conversion rate at the same time. And, uh, this is kind of my favorite way of thinking about this stuff, where we’re really trying to get to the root of: is there any conflict between user experience and Google preferences, search engine preferences? And to the extent that there is,

[00:05:31] how do we thread that needle? Right? How do we find the best possible combination of user experience and search performance? And, uh, that’s what I find kind of really exciting. And the great thing about running full funnel tests is that they combine the test I’ve talked about with some cookie

[00:05:47] technology, which means that any individual user sees a consistent site-wide experience. So depending on whether they came in on a control page or a variant page, they then see a consistent site-wide experience. So, uh, it’s actually kind of the best version of SEO testing, if you like.
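
Editor's sketch: the control-versus-variant comparison Will describes earlier in this answer can be illustrated with a simple difference-in-differences calculation. This is not SearchPilot's model (which is not detailed in the conversation); it is a deliberately simplified stand-in, and the function name and the traffic numbers are made up for illustration. The idea is that whatever happened to the control pages (seasonality, algorithm updates) is assumed to have happened to the variant pages too, so only the extra movement is attributed to the change.

```python
from statistics import mean


def diff_in_diff_uplift(control_pre, control_post, variant_pre, variant_post):
    """Estimate the relative uplift on variant pages, using control pages as the baseline.

    Each argument is a list of daily organic sessions for that group and period.
    Hypothetical helper for illustration only.
    """
    # How much did the untouched pages move between the two periods?
    control_ratio = mean(control_post) / mean(control_pre)
    # What the variant pages would have done if they had moved like the control.
    expected_variant_post = mean(variant_pre) * control_ratio
    # Anything beyond that is attributed to the change under test.
    return mean(variant_post) / expected_variant_post - 1.0


# Made-up numbers: both groups grow 10% seasonally, the variant gains a further 10%.
control_pre = [1000, 1020, 980, 1000]
control_post = [1100, 1122, 1078, 1100]
variant_pre = [900, 918, 882, 900]
variant_post = [1089, 1110, 1068, 1089]

uplift = diff_in_diff_uplift(control_pre, control_post, variant_pre, variant_post)
print(f"Estimated uplift attributable to the change: {uplift:.1%}")
```

A naive before-and-after comparison on the variant pages alone would report roughly 21% here, because it cannot separate the seasonal growth from the effect of the change.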

[00:06:01] Kevin: [00:06:01] You mentioned the point of, um, statistical aspects, or just, uh, the, um, statistical component of, um, SEO testing, which I think is something that, um, needs a little bit of attention.

[00:06:12] And, uh, I think it’s also something that some SEOs, um, shy away from. So can you tell us a little bit about how, um, SearchPilot helps SEOs pick the right sample size and understand whether the result is statistically significant?

[00:06:30] Will: [00:06:30] Yeah, absolutely. So we focus on, um, uh, first of all, building the

[00:06:37] test that is most likely to be, um, significant. And the way we do that is at the outset. So I mentioned that we test on a site section. What we then have to do, obviously, before we start, is pick which pages in that site section are going to be the control and which pages are going to be the, uh, the variant pages.

[00:06:52] And we do that with a form of stratified sampling. So what we’re trying to do is essentially pick two groups of pages that are as statistically similar at the outset, before we start, as possible. So they have the same kinds of, um, metrics in terms of, uh, traffic levels, in terms of variance in traffic, uh, in terms of seasonality, all of these kinds of things, because that helps us identify, uh, you know, that any changes are a result of our changes.

[00:07:20] And it’s really important because if you just randomly, um, distribute a large, um, a large site section, you can run into kind of troubles, because the, um, the kind of power-law nature of the web means that some pages will dominate that sample. And so it’s kind of important to stratify it and try and make sure that you start with, um, groups that are as similar as possible.
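
Editor's sketch: one simple way to picture the stratified split described above is matched pairs. Pages are ranked by traffic, neighbours in the ranking are paired up, and one page from each pair goes to each group at random. This is an assumption-laden illustration, not SearchPilot's actual bucketing: it stratifies on average traffic only (not variance or seasonality), and `stratified_split` and the example URLs are hypothetical.

```python
import random


def stratified_split(page_traffic, seed=0):
    """Split pages into control and variant groups with similar traffic profiles.

    page_traffic maps a URL to its average daily organic sessions.
    Pages are ranked by traffic and paired with their neighbour, then each
    pair is split at random, so neither group ends up with all the big pages.
    """
    rng = random.Random(seed)
    ranked = sorted(page_traffic, key=page_traffic.get, reverse=True)
    control, variant = [], []
    for i in range(0, len(ranked), 2):
        pair = ranked[i:i + 2]
        rng.shuffle(pair)
        control.append(pair[0])
        if len(pair) == 2:  # an odd page out stays in control
            variant.append(pair[1])
    return control, variant


# Made-up, power-law-like traffic for 20 product pages.
pages = {f"/product/{i}": 10000 // i for i in range(1, 21)}
control, variant = stratified_split(pages)

print("control sessions:", sum(pages[p] for p in control))
print("variant sessions:", sum(pages[p] for p in variant))
```

With a purely random split, the two biggest pages can easily land in the same group; pairing guarantees they are separated, which is the point Will makes about the head of the distribution dominating the sample.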

[00:07:40] And so then we, uh, we run a form of, um, it’s based on some neural network technology and it’s a fundamentally kind of Bayesian approach, where we’re able to look, uh, look regularly and see how the test is going and make the call about whether it’s reached significance or not, which is not the case with certain kinds of, uh, traditional statistical approaches, where you just have to set an end date and you need to wait until then to make the call.

[00:08:04] And so our system spits out a current, um, uplift confidence interval. And, um, essentially, when that confidence interval gets above zero, so the entire confidence interval is, um, is positive, we can then call that test with statistical confidence and say, you know, we believe that there’s an X percent chance that this is a positive uplift, with a most likely uplift of, um, however many sessions, uh, you know, a day or a month.

[00:08:33] This is kind of a tough area to get into, because what we tried to do is take some really quite complicated mathematical, statistical approaches and make the output as simple as possible, and make it so that, yes, obviously there’s complexity under the hood, but how can we make it so that what we’re seeing in the analysis is a chart that is, uh, easy to interpret, works in the way that you’d kind of intuitively expect, and, um, is, uh, as accessible as possible to a broad range, a broad audience.
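
Editor's sketch: the decision rule Will describes (call the test once the whole uplift interval sits above zero) can be shown with a much simpler technique than the neural-network-based Bayesian model he mentions. The code below swaps in a plain bootstrap percentile interval over daily uplift figures. That is a different, frequentist method, and unlike the approach described in the conversation it does not come with guarantees about checking repeatedly while a test runs. The function names and the daily numbers are hypothetical.

```python
import random
from statistics import mean


def uplift_interval(daily_uplift, n_resamples=2000, level=0.95, seed=0):
    """Bootstrap percentile interval for the mean daily uplift (in sessions).

    daily_uplift holds, per day, actual variant sessions minus the sessions
    expected had no change been made. How that expectation is produced is
    outside the scope of this sketch.
    """
    rng = random.Random(seed)
    n = len(daily_uplift)
    # Resample the days with replacement and record the mean each time.
    means = sorted(mean(rng.choices(daily_uplift, k=n)) for _ in range(n_resamples))
    low_idx = int((1 - level) / 2 * n_resamples)
    high_idx = int((1 + level) / 2 * n_resamples) - 1
    return means[low_idx], means[high_idx]


def call_test(low, high):
    """Call the test only when the entire interval is on one side of zero."""
    if low > 0:
        return "positive"
    if high < 0:
        return "negative"
    return "inconclusive"


clear_win = [12, 18, 9, 15, 20, 11, 14, 17, 13, 16]  # made-up daily uplifts
noisy = [-5, 6, -4, 5, -6, 4, -5, 5, -4, 4]          # made-up, centred on zero

for name, data in [("clear_win", clear_win), ("noisy", noisy)]:
    low, high = uplift_interval(data)
    print(name, f"[{low:.1f}, {high:.1f}]", call_test(low, high))
```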

[00:09:04] Kevin: [00:09:04] Can you talk a little bit about the limitations of, um, SEO testing, and specifically who can test and who maybe isn’t ready? Because the way that I understand it is that the smaller the impact of a change or a treatment that you make, the more traffic you basically need. Right? So if you make some change or treatment to a very impactful,

[00:09:25] uh, ranking signal, say the meta title, then you might not need as much traffic or as large a sample size, but obviously there is a certain sweet spot. And so I’m just curious about your take and your experience about, like, who’s really eligible for SEO testing, who has to, and maybe who might not be ready.

[00:09:44] Will: [00:09:44] For sure. So there’s two key variables in my mind. One is the one you identified about, um, about traffic levels, and yes, of course, uh, you know, the, um, the more traffic to the site section that you’re talking about, the more subtle an effect you can detect.

[00:10:03] And that’s true of any kind of testing. I mean, that’s the same with conversion rate testing or any of the rest of it, where, you know, Google might be testing or Amazon might be testing subtle variations in color that would just not be detectable on a, um, on a normal-sized, uh, website. So that’s one element, but that’s not unique to SEO testing.

[00:10:19] I think the thing that is more unique, or more specific to SEO testing rather, is that, um, we do rely on there being a certain size of website in terms of number of pages as well. So the kind of testing that we’re running doesn’t allow you to test, for example, a home page. So it’s fundamentally not a good approach, or not even a possible approach, for something like a small, uh, B2B SaaS website.

[00:10:44] Right? In fact, SearchPilot’s own website is not amenable to, uh, to the kind of testing that we do, because we simply don’t have hundreds of thousands of pages with a similar layout in the page template, that kind of stuff. So I think either websites don’t have those pages, or their performance is dominated by a handful of pages, the homepage, uh, again, in B2B SaaS, for example, where your conversions may come from too small a set of pages to enable, uh, sensible testing.

[00:11:15] Um, our kind of rule of thumb is having a thousand organic sessions a day to the site section that you want to test on. And at that level, we can, uh, we can detect, you know, multiple-percentage-point uplifts and so forth. So, you know, we’re probably not able to detect a 0.1% uplift, but nor do we need it to be a 25% uplift before we can spot it, if that makes sense.
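
Editor's sketch: the relationship between traffic and the smallest detectable uplift can be approximated with a textbook power calculation. This is a back-of-the-envelope model, not SearchPilot's method: it assumes a 50/50 page split, sessions that behave like Poisson counts, a two-sided 5% significance level and 80% power. Real organic traffic is noisier than Poisson, so real thresholds will be larger. The function name and the 28-day test length are assumptions made for illustration.

```python
import math


def min_detectable_uplift(sessions_per_day, days, z_alpha=1.96, z_beta=0.84):
    """Rough smallest relative uplift a test could detect.

    Assumes sessions are split evenly between control and variant pages and
    that each group's total is Poisson-distributed, so a total of N sessions
    has a relative standard error of 1 / sqrt(N).
    """
    sessions_per_group = sessions_per_day * days / 2
    # Relative standard error of the difference between two such totals.
    relative_se = math.sqrt(2 / sessions_per_group)
    return (z_alpha + z_beta) * relative_se


for daily in (100, 1000, 10000):
    mde = min_detectable_uplift(daily, days=28)
    print(f"{daily:>6} sessions/day over 28 days: about {mde:.1%}")
```

Under these assumptions, a thousand sessions a day lands at a few percent, in line with the rule of thumb above, and every tenfold increase in traffic shrinks the detectable uplift by a factor of about three (the square root of ten).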

[00:11:42] So I think you can kind of work backwards from there and say, well, um, is it significant to the business to get a couple of percent uplift on this site section? Does it get at least a thousand organic visits a day? If both of those things are true, then it’s worth, uh, it’s worth running tests. Kind of stepping back from it,

[00:11:59] it’s like, so obviously that’s the way we’re going at SearchPilot. That’s the way that our SEO A/B testing methodology works. I’m obviously also interested in the broader ecosystem and how, um, how we as SEOs think about it, how we think about how Google works, how we, um, how we make this stuff accessible, like the insights that we find, how we make those accessible

[00:12:18] to smaller businesses or smaller websites as well. And I think there are things they can do. I think it is sometimes worth doing before-and-after tests on individual pages. I still think it’s worth being data-driven, even though you can’t perfectly capture seasonality, competitor updates, you know, the impact of Google algorithm updates. It’s still worth being data-driven

[00:12:43] uh, about that stuff. And then the other thing, and this is the area where I guess SearchPilot is trying to help, although it’s not a kind of commercial mission, is we’re trying to publish a lot of this data. So we’re trying to say, here’s what we’re finding when we’re running these tests on massive websites.

[00:12:59] And, you know, as you said at the outset, you can’t just take a result and kind of naively assume that that’s how Google would treat a page on a tiny website, for example. But we hope, as we build that library up, to show you things that often work and things that very rarely work, and kind of nudge the whole industry in a good direction towards the most beneficial changes.

[00:13:21] Kevin: [00:13:21] Let’s talk about these case studies for a moment, because since you started publishing them this year, I’ve read every single one. And I think they are some of the most valuable SEO content out there right now. Absolutely. They’re amazing. I think that’s exactly, um, that’s the most valuable stuff out there, period.

[00:13:38] You know, it’s sharing experiments. And I think in the earlier days of SEO, there was a lot of sharing, kind of anecdotal observations going on, and that was super useful. And then Google became a lot more, a lot smarter or a bit more complex to understand. And then that went away, and I feel like now it’s coming back with exactly those kind of case studies.

[00:13:58] What is kind of the, uh, what was the test that surprised you the most? So maybe the case study that surprised you the most, where, um, the result maybe went against your understanding of things or against your intuition?

[00:14:08] Will: [00:14:08] Well, I mean, there’ve been a lot that have gone against intuition, which I guess is a good thing.

[00:14:14] It shows that it is actually worth, uh, running these tests. And so the biggest surprise for me... I think there’s two different ways of looking at this: the biggest surprise and the one that I’ve learned the most from, I guess. So I think the biggest surprise, uh, was, so we’ve been able to, in some cases, and we’re gonna be talking more about this,

[00:14:31] I think we’ve got some case studies coming up, where we have tested entire, uh, new templates, entire site redesigns. So rather than just tweaking a particular HTML element or, uh, you know, adding or removing a particular piece of content, we have, um, been able to test an entirely different page, and, uh, obviously a group of pages, as I said, within a site section. And that’s thrown up some really interesting results.

[00:14:55] It’s obviously not as simple to draw lessons from it, because so much changes at once. It’s not controlled in the sense of, like, you know, we’re just making this little tweak, let’s see what that does. It’s kind of saying, oh, well, we have an entirely different version. If this outperforms, well, we don’t necessarily know which of these many things it was. But probably the most surprising for me was where we ran a test like that.

[00:15:15] It moved from pure HTML to a, um, client-side-rendered React, uh, template, and performance improved; organic search performance improved. Now, we’ve run plenty of other tests that have highlighted problems with JavaScript rendering and indexing and so forth. And my hypothesis, my suspicion, is that this is not that React was better.

[00:15:39] This is that other things about the design were significantly better than the old version. In particular, the actual final rendered DOM was a much cleaner HTML, a cleaner and faster HTML template. So I suspect it’s that. But nonetheless, you know, you’d never have got me betting, oh yeah, the React version will win,

[00:15:57] um, before that test, uh, which was really the most surprising. I think in terms of the ones where we’ve learned the most, it’s probably been, um, a repeated sequence of just how hard it is to write better title tags. Like, it sounds so simple, but, uh, you know, and I feel like, I mean, I’m sure you’ve been in the same boat.

[00:16:17] I feel like everybody who’s been in SEO for any period of time has made recommendations to clients or to, uh, you know, if you work in-house, to your own website, just like, hey, you know, we’ve done this keyword research, we should update these titles. And, you know, it’s a classic agency recommendation, right?

[00:16:32] You know, here’s a bunch of keyword research, uh, here’s a bunch of pages, we’re just gonna do the thing, right? Just simple. No. It’s really not simple to actually outperform the existing, um, title tags. And worse than that, they’ve been some of our biggest negative tests.

[00:16:51] So some of our worst outcomes have been apparently sensible title tag updates. You know, like we do some keyword research, you find a, uh, a way of phrasing something that seems better, you roll it out to a template, and it’s minus 20%, 25% organic traffic in some cases. And, uh, so yeah, my colleague Emily Potter gave a whole presentation; about half the presentation was just showing cases where essentially very, very experienced SEO consultants on our team

[00:17:21] were just failing to write title tags that were better than the ones that were already there. And so that’s the kind of embarrassing one, I think, where, uh, you know, people with 15 years of SEO experience... and I dread to think how many times I’ve written in a recommendation that we should incorporate keyword research into our title tags.

[00:17:41] Yeah,

[00:17:42] Kevin: [00:17:42] Yeah, 100%. Um, it’s funny because, um, we made the same observations. I made the same observations, uh, many times at different companies. Um, first of all, it’s hard to write good title tags. Second of all, the title that you think is best is not always the best from Google’s or the user’s perspective.

[00:17:58] And third of all, there’s way more potential in title tags than most people think. I think, especially with these scalable sites that you alluded to, um, I don’t think they make, uh, they use half of the potential that most title tags have. And it’s very interesting, because one very, very big website out there that I’m not going to name actually goes against all known parameters of optimal title tags.

[00:18:21] Their title tags are a hundred characters long and outperform anything else. They tested that systematically. Um, and so I think it kind of goes back to a point that I’m very bullish on, which is humbleness in the SEO industry. And I honestly think that, or I would love to see, um, an experiment, not kind of the actual, like, SEO experiment, but, um, a real-life experiment where SEOs would put money behind their bets, right?

[00:18:48] Like, how would SEOs make decisions if they had to bet a thousand dollars that their recommendation was the best one? I don’t think most people would. I think, you know, that’s kind of like a false arrogance or like a bias that a lot of SEOs suffer from, um, I certainly did, where you think that this is certainly the best recommendation. But these tests allow us to spot-check or to falsify that, right?

[00:19:11] That we don’t always know the best answer, and that instead of trying to have a hard stance on a specific recommendation, we should have a hard stance on testing certain things and then letting the data speak for itself, more or less.

[00:19:23] Will: [00:19:23] Yeah, for sure. And I don’t think it’s actually unique to title tags here.

[00:19:26] I think what we’re seeing with the title tag tests is just, um, as you said, title tags are so powerful that they’re such a single big thing where we can see that delta so easily, but this is there in every recommendation. I totally agree with you. And I think it’s not only the, um, that kind of, I guess, the arrogance of thinking you’ve got it

[00:19:46] right, but it’s just that the depth of complexity is so hidden from us. And I think it’s hidden from us by necessity, because we can’t just get our heads around the scale of the data. It’s hidden from us because the data just simply doesn’t exist, or at least not in a form that we can access.

[00:20:00] Uh, and I think that the classic one for me is just: it’s really hard to get your head around how long the long tail is. That’s just continually mind-blowing. And the thing that I think is the manifestation of that is, it’s almost an inevitable type of thing, we’re just human: we go and look at individual rankings.

[00:20:22] We go and look at some head terms. We go and look at, uh, how our rankings have changed in our rank tracker, and even if we kind of intellectually know that this is a tiny, tiny fraction, that this is the observable portion of the universe and the universe is immense, it’s really hard to get our heads around that scale.

[00:20:38] And I think that’s the other thing that’s happening here quite often, especially when you do keyword research and stuff: yeah, you’re optimizing for the ones you can think of without, uh, taking account of the extreme variation that actually exists across the universe.

[00:20:53] Kevin: [00:20:53] Absolutely. And it starts with having very flawed data itself, right? The idea of search volume is a very incomplete, very narrow, uh, view on what’s actually going on. How often has it happened that a business actually started creating content just based on intuition, editorial, you know, guidance, or journalistic approaches, without even looking at search volume, and then they do a really good job, um, at ranking for SEO?

[00:21:19] How often has it happened that you optimize for a keyword that has low, very low search volume, and then it turns out the page ranks for all sorts of, um, keyword variations and semantic keywords? There’s actually so much more traffic that’s coming through.

[00:21:31] Will: [00:21:31] Yup. And that’s why we, um, cause there’s also an interesting question when it comes to SEO testing.

[00:21:37] In a sense, if you’re trying to discover ranking factors, it makes more sense to look at rankings, right? So: we made these changes, did these rankings change? There is some validity to that. Unfortunately... so our approach is much more to look at... we look at rankings to explain things, but in terms of picking winners, we look at organic traffic.

[00:21:55] And the reason for that is because you can only ever look at a small subset of rankings, whereas organic traffic captures the entire, uh, the entire long tail. There’s a lot of complexity in there. And one of the things that we would love to be able to do, as we evolve our product, is think about

[00:22:11] the, so rankings can be a leading predictor of success, right? So if you move from ranking position 48 to position 45, you’re probably not gonna be able to detect that in your organic traffic. In theory, looking at rankings should be able to tell us that something is a good idea, even if it’s so marginal that we can’t,

[00:22:30] um, we can’t detect it in the traffic yet. And I’d love to build something that kind of captures all those things together, but, uh, at least at the current state of play, yeah, there’s no way of getting to the, uh, to the extreme long tail that you’d need to be able to be confident that, yeah, that 48 moving to a 45 wasn’t offset by a different, you know, 35 moving to a 39 or whatever it might be.

[00:22:53] So, um, Yeah, that that’s, that’s why we take that. Right.

[00:22:56] Kevin: [00:22:56] And I love that. I think in my mind, I’m super passionate about this. This is the kind of modern way to think about SEO in my mind. Right. Uh, and not a lot of people take that mindset out there, and I love that you push it forward. I think it’s absolutely crucial to be able to keep up with SEO and actually make good calls,

[00:23:13] and to advance the whole SEO community. So regarding the actual experiments, how do you set those up? I mean, one side of that is obviously you work with some of your closest clients, I assume. And, you know, they give you, um, uh, the permission to publish these case studies. But how does that work?

[00:23:29] I mean, you have an onboarding team or a services team that helps clients run these experiments, right? Um, do you then talk to them and you’re like, hey, could we maybe publish some of that on our blog? Like, how does all of that work?

[00:23:42] Will: [00:23:42] Yeah. So we’re, um, we have different kind of levels of, of, of permission needed as we go through different, uh, levels of detail.

[00:23:49] Right. And so it ranges from, there are some where we want to be able to put the, uh, the customer’s name on it and describe the exact change to the actual HTML template, all the way through to the other extreme of saying, hey, you know, we ran a test that focused on a particular kind of structure of data and, you know, it was positive or whatever. You know, there’s a big spectrum in there.

[00:24:12] And so we have everything ranging from, you know, some customers who say, uh, you know, you can write up the results, but just don’t share our traffic data and don’t share, um, you know, our brand name, uh, through to others that are happy to actually do co-branded full-on case studies. There’s quite a range there.

[00:24:30] And so yes, it comes out of, uh, ultimately out of that professional services team, who, uh, at the moment are involved in a lot of those tests, whether it’s, um, all the way from ideating them through to, uh, analyzing them, or in some cases just supporting, uh, you know, the customer-side teams that are doing that.

[00:24:47] But, uh, yeah, it’s one of the big flywheels that we’re trying to push, to try and say, actually we want to, um... for most of our full-service customers, the kind of bigger customers, we’re essentially writing everything up as a case study, even if it’s only for internal use, certainly for that customer.

[00:25:04] Um, and then they can use it in their internal reporting, they can use it in their, um, uh, you know, board decks or whatever they might need. And so we’re trying to produce these all the time. And so then it’s just a question of saying, well, which ones are interesting, which ones have a good story behind them?

[00:25:18] Which ones do we think the community can learn from? And, um, and then yeah, either go to the extreme anonymized end or the extreme, um, uh, you know, kind of client sign-off end, to get the logo and all the rest.

[00:25:30] Kevin: [00:25:30] I love those. Yeah. Please, please keep them coming for a long time because I,

[00:25:35] Will: [00:25:35] we, uh, we, we plan to, yeah, definitely.

[00:25:37] For sure. We see this as the, uh, kind of our unique content marketing advantage, right? So this is why it’s a flywheel: like, we feel that the more we can write these things up and have them kick around internally at our customers’ sites and externally on the internet, the more people get onside with that way of doing this,

[00:25:58] uh, that way of thinking about how you analyze, um, SEO performance. And yeah, it’s good for us, we think. And like you say, it’s also the resource I wish had existed before we had this tool. You know, so I would have subscribed to this email list in an instant, uh, when I was running an agency.

[00:26:16] And so once again, I feel like, uh, I mean, being the target market that I am, I know we’re producing something of value there.

[00:26:24] Kevin: [00:26:24] For sure, for sure. Um, certainly. Um, what would you say is the biggest thing that people are missing when it comes to setting those experiments up? Like, what is the thing that your services team has to help out with the most, or the biggest mistake that you see companies make, maybe ones not using SearchPilot?

[00:26:42] Will: [00:26:42] I don’t, I don’t know if there’s a particular thing. I would point to that our team has helped our customers with that. That tends to be, I think their biggest value out there is thinking of great things to test. Actually, it’s kind of more on the SEO side than it is on the testing. Side. So it’s trying to fight.

[00:27:00] And this is for my friends on the conversion rate optimization world. This is very similar to anybody running tests anywhere. It’s kind of, you have to encourage people to think a bit bigger, to think very creatively, to think differently when you’re making a test recommendation than a flat out recommendation.

[00:27:15] Right? If you’re just saying, “We think this is better, you should do this,” the level of certainty required, the level of confidence required, is very different to, “Here’s a crazy idea. Why don’t we just see if this is better?” And so it’s some of those mindsets and that coaching; that’s probably where we see the biggest value add.

[00:27:35] I think when you look outside of just our tests, at the broader SEO industry and other folks who are running these tests, the biggest thing is just the challenges on the statistical side of it. So, not even running controlled tests: what people call tests are in fact just changes.
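
The difference between a change and a controlled test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the page groups, the daily click numbers, and both function names are hypothetical, and subtracting the control group’s change is a deliberately simple stand-in for whatever statistical model a real testing tool uses.

```python
from statistics import mean


def relative_change(before, after):
    """Relative change in mean daily clicks between two periods."""
    return (mean(after) - mean(before)) / mean(before)


def controlled_uplift(control_before, control_after, variant_before, variant_after):
    """Variant change minus control change.

    Subtracting the control removes whatever moved both groups at once
    (seasonality, an algorithm update, and so on).
    """
    return relative_change(variant_before, variant_after) - relative_change(
        control_before, control_after
    )


# Hypothetical daily organic clicks; every number here is made up.
control_before = [100, 102, 98, 100]
control_after = [120, 118, 122, 120]
variant_before = [200, 198, 202, 200]
variant_after = [252, 248, 250, 250]

naive = relative_change(variant_before, variant_after)
adjusted = controlled_uplift(
    control_before, control_after, variant_before, variant_after
)
print(f"before/after on the changed pages: {naive:+.1%}")
print(f"after subtracting the control:     {adjusted:+.1%}")
```

In this made-up data most of the apparent gain on the changed pages also shows up on the untouched control pages, which is exactly what a plain before/after comparison cannot reveal.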

[00:27:57] Right. And, as I think I said earlier, I don’t want to knock that. I think it’s a very valuable thing to do, to make a change and then look and see what happens. I don’t want to say it’s a bad idea. But I also think it’s important, especially if you’re publishing the results of that, especially if you’re trying to educate the audience and the community about what you did and whether it was a good idea, that there are a lot of caveats

[00:28:22] that have to go with that. And actually we’re very sensitive to this in our case studies as well. You’ll notice a lot of us talking about why we think this might have been the case in this specific case, rather than necessarily trying to draw, you know, “always do this” lessons that are very broad-brush.

[00:28:40] And so I guess, if I had to sum it up as one thing, that’s the thing: people try and draw lessons that are broader than the data supports. And really what we’ve found is tests are valid in the context they were run in, on the site they were run on, that kind of stuff. And yes, we can try and draw broader recommendations from that, but we can’t necessarily draw firm scientific conclusions that will hold in other domains.

[00:29:06] And so, yeah, I think what I want to encourage more of is just that level of critical thinking. And it’s not just in testing; across SEO generally, it’s just stepping back. There’s a lot of misinformation out there, whether it’s on Twitter, in forums, on Facebook, on Reddit, at conferences, or coming from Google.

[00:29:27] There are many, many places where you can find things that are not quite true. What I try to coach our team on, and I would encourage everyone to do, is just apply a level of your own critical thought to that. Try and understand: why could that be true? Why might that not be true? Go back to first principles. Is there an information retrieval reason why that should be true?

[00:29:48] And we actually have guidelines internally about how we guide our customers to make different business decisions, even under the same statistical answer, depending on the strength of our hypothesis and what supports it. Right? So if we have a very, very strong reason to think something might be true, it’s information retrieval good practice and so on and so forth,

[00:30:13] and then you get a somewhat neutral result, you may still choose to roll that out because it’s good hygiene. It’s good in general; in principle, if you do this repeatedly, it’s going to be good for your website. In contrast, if you get a strong, statistically confident result that is a complete surprise, that goes against your intuition,

[00:30:36] that’s a great time to step back and, (a), try to understand it, (b), look for the confounding reasons why it might not be true, but, (c), also potentially rerun it, or run a variation of it, or sense-check it, and try to make sure that it holds together, because ultimately what we’re trying to drive is business results,

[00:30:54] not just exciting headlines on case studies.
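
The idea that the same statistical answer can lead to different business decisions can be written down as a small lookup table. This is a hypothetical sketch, not the internal guidelines mentioned above: the labels, the function name `rollout_decision`, and every row not marked as coming from the discussion are assumptions made for illustration.

```python
# Hypothetical decision table. Only the rows marked "from the discussion"
# reflect what was said; every other row is an illustrative assumption.
DECISIONS = {
    ("positive", "strong"): "roll out",
    ("neutral", "strong"): "roll out",  # from the discussion: good hygiene
    ("negative", "strong"): "investigate and rerun",  # from the discussion: a surprise
    ("positive", "weak"): "investigate and rerun",  # from the discussion: a surprise
    ("neutral", "weak"): "do not roll out",
    ("negative", "weak"): "do not roll out",
}


def rollout_decision(result: str, hypothesis_strength: str) -> str:
    """Map a test outcome plus the strength of the prior hypothesis to an action.

    result: "positive", "neutral", or "negative" (the statistical outcome).
    hypothesis_strength: "strong" or "weak" (how well first principles
    support the change).
    """
    try:
        return DECISIONS[(result, hypothesis_strength)]
    except KeyError:
        raise ValueError(
            f"unknown combination: {result!r}, {hypothesis_strength!r}"
        ) from None


print(rollout_decision("neutral", "strong"))
print(rollout_decision("positive", "weak"))
```

Keeping the table explicit makes the judgment calls visible and easy to argue about, which is the point of having guidelines rather than a single significance threshold.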

[00:30:58] Kevin: [00:30:58] Preach. There was so much truth in that statement, so thank you for that. I want to talk about your 15 years of experience in SEO, but I want to allow myself one last question about SEO testing.

[00:31:11] You mentioned that a lot of people have a hard time actually finding the thing to test. How do you best think about identifying which SEO experiments to actually run?

[00:31:22] Will: [00:31:22] I think we try to draw from multiple different pots of ideas. So this is also a great example of diverse teams outperforming

[00:31:35] individuals or homogeneous teams. Get ideas from different places; get ideas from not just your SEOs, but your content people and your technical people. A wide range of inputs can be really valuable. So we will look at the classic best-practice, technical-audit type recommendations, you know, checklist type stuff.

[00:32:00] We’ll look at content-driven ideas, particularly search-intent-focused ideas. So what are these pages for, and are they matching that intent as closely as they could? We’ll look for the more out-there ideas: if we’re going to do something different, what would that look like?

[00:32:23] And we’ll also do quite a lot of competitor type research and thinking. That’s one that I particularly value: when you put together a presentation that shows, hey, across this area of search there are three broad types, right? So in travel, there are the OTAs, there are the individual, whatever it be, airlines or hotels or whatever else.

[00:32:50] And there are the media companies, and each of them is producing different kinds of content to satisfy the same kind of intent in some cases, or at least the same kind of keywords. And sometimes those keywords have multiple intents. Sometimes people are doing more research, sometimes they’re more commercial, et cetera.

[00:33:04] So it can be fascinating. What can we learn from our search competitors who are not our direct competitors? If you’re an OTA, what can you learn from the operators or the media companies? And sometimes it can be worth asking, what can we learn from our direct competitors?

[00:33:21] Are we missing something? You mentioned the site you were talking about with the extremely long title tags. Well, if you see one of those in your industry performing really well, that’s a classic place to get an idea. So, yeah, I think there is a lot of art as well as science in this. But one of the things that we try and do, and I would recommend to organizations that are trying to get good at this, is build a swipe file as well.

[00:33:45] So documentation is your friend. Build a list of things you’ve tested on other pages, kinds of tests that are working for you, tests that you want to run. And we often have a combination of, I think, the things that we wish we knew: say Google releases FAQ schema, what we want to know is, is that likely to be a good thing on your website or not?
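
A swipe file like this can start as a very small structured list. The sketch below is an assumption-laden illustration: the `TestIdea` fields, the status labels, and all example entries (including their outcomes) are hypothetical, not records from any real test programme.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TestIdea:
    hypothesis: str
    source: str  # where the idea came from
    page_type: str
    status: str = "backlog"  # "backlog", "running", or "done"
    outcome: Optional[str] = None  # "positive", "neutral", or "negative" once done


# Hypothetical entries; the outcomes are made up for illustration.
swipe_file = [
    TestIdea("Add FAQ schema", "new search feature", "category pages"),
    TestIdea("Lengthen title tags", "competitor research", "product pages",
             status="done", outcome="neutral"),
    TestIdea("Match copy to search intent", "content team", "landing pages",
             status="running"),
]


def ideas_with_status(ideas: List[TestIdea], status: str) -> List[TestIdea]:
    """Return the ideas currently in the given status."""
    return [idea for idea in ideas if idea.status == status]


for idea in ideas_with_status(swipe_file, "backlog"):
    print(f"{idea.hypothesis} (from {idea.source}, on {idea.page_type})")
```

Recording the source of each idea alongside its outcome makes it possible to see, over time, which pots of ideas tend to produce winning tests.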

[00:34:08] So that kind of funnel of ideas can come from many places.

[00:34:12] Kevin: [00:34:12] Amazing. Thank you so much for that. Now, touching on your longstanding experience in SEO, I think it’s over 15 years by now. You recently wrote an article on Moz titled “15 years in SEO was a long time.” I’m sure most listeners have actually read that article, but if you had to

[00:34:38] point out what some of the biggest changes are in the last 15 years, and there are many, what are the ones that come to mind first? How do you think the industry, the craft, and the art have changed the most over the last 15 years?

[00:34:54] Will: [00:34:54] So while a lot has changed, there’s a surprising amount that is recognizable as similar, right?

[00:35:01] I think I’m constantly amazed by both bits. The thing that keeps me engaged and interested in working in this industry for this long is the change, the curiosity that we can delve into: what’s new, what works, what doesn’t, and so forth.

[00:35:19] But on the flip side, I probably gave a presentation in, I don’t know, 2006 or 2007 that talked about search intent, and whether Google can crawl and index your site effectively, and whether you were demonstrating the right authority; if you had those things, then you were set up pretty well.

[00:35:39] Right. And that’s true in 2020 as well. So it falls into eras for me. There was that early 2000s era: Google coming from nowhere, dominating the other search engines that existed before that. And that era was all about PageRank, right? PageRank was the innovation, and it meant that links were so incredibly powerful.

[00:36:06] That, for me, was the first era. Then the reaction to that, I think, was realizing that the unique value that PageRank brought wasn’t solely about ranking things better. It was about the fact that they could rank documents well on a much, much bigger web corpus than anyone else could.

[00:36:26] Right. So if you think about it, a human-curated search engine works incredibly well on a web that is a hundred pages, right? If there are only a hundred pages, ask a human. But when you get up to web scale, that was where Google’s advantage really shone, and that’s where PageRank really worked.

[00:36:43] And the next realization, I think (and we didn’t see this at the time, but as you look back on history), was that Google invested everything they had in making the web bigger. They literally funded the creation of a lot of the web through AdSense. They were paying people to write long-tail content.

[00:37:02] So: make the web bigger and get better at indexing it than anybody else. Google built a bigger index than anybody else had and, broadly speaking, put their revenue back into the engine that made the web bigger again. And that led to the incredible era of, I would say, 2007 to 2010, of literally: if you have a database that doesn’t exist on the internet, put it on the internet.

[00:37:31] And then, you know, question mark, profit, right? Or find a way to create a database that didn’t exist before. And it’s that era that produced everything from TripAdvisor to Zillow in the US and Zoopla in the UK, all of those kinds of people who were essentially creating or publishing databases that had never been published before.

[00:37:51] And it was enough to just have long-tail content. Then the backlash to that comes along, which is low-quality content. And you see the era of content farms, Demand Media, eHow, all of that kind of stuff: duplicate content, literal scraping of content. And if you go back and read Hacker News threads from that era, you’ll see people complaining about Stack Overflow being outranked by Stack Overflow scrapers in Google search results.

[00:38:17] And that, of course, is Panda, and then Penguin and so forth. And so we then start entering the era of content quality. So it goes links, content, content quality, and into link quality. Right? Obviously through that whole era they’re battling spam of all kinds, whether it’s content spam or link spam. The next wave, that we were late to, I think, was mobile: we all thought it was becoming a big thing

[00:38:47] two years after Google was all in, I think, realistically, because they’d seen it in their own data, and the same at Facebook. I feel like I’m probably missing something else, but anyway. And then we get into the eras of machine learning and artificial intelligence where, if you go back to 2010, you find presentations from Google

[00:39:06] talking about how they don’t use that stuff in organic search, because they wanted to understand their own algorithms. They were okay using it in paid search because they had a clear thing they were optimizing for, which was simply revenue. But they were against it in organic search.

[00:39:24] And then, of course, the need for it and the capability of it both increased simultaneously. And then there was a lot of people turnover. If you look at, I forget the exact timings, but 2014 onwards, you see a lot of leadership changes, and a lot of senior technical leadership changes.

[00:39:43] And the new guard are the machine learning guard, essentially. And clearly that’s where they’re all in right now. And so you get to the content understanding and intent understanding end of things. And this is why we’re all in on testing: I think the only way to operate sustainably, long-term, at scale in that environment is... you can’t just apply those cookie-cutter recommendations.

[00:40:09] You have to be.

[00:40:11] Kevin: [00:40:11] 100%. I think there was the change from the old guard that was led by Amit Singhal, then over to Ben Gomes and Jeff Dean and a couple of other people as well.

[00:40:21] Will: [00:40:21] Briefly John Giannandrea, I think; he wasn’t there very long, but he actually was an ML guy. And yeah, Jeff Dean, I think, is the underpinning, probably quietly behind the scenes through all of that.

[00:40:33] Of course, I mean, there were some scandals buried in there. It’s impossible to tell exactly who got pushed out for what reason. And so it’s not necessarily that the change of the guard was all technology-driven. They certainly parted ways with some folks that they wanted to part ways with.

[00:40:49] But yeah, you look at the folks who were in charge a decade ago and compare them to Jeff Dean’s team, for example, and you see that.

[00:41:04] Kevin: [00:41:04] Yeah. And it’s very interesting. I do remember an article on one of the big publishers, I don’t remember exactly which one it was, when RankBrain came out around 2016, and there was a very subtle mention where some Googler said that it was kind of this transition from an old guard to a new guard. The old guard was led by Amit Singhal, and basically they built the algorithm after his

[00:41:29] intuition or his understanding, which they said was amazing. They said that he just intuitively understood search on a deeper level than most people. But as you said, then they switched to a machine learning approach where the data basically makes the right calls. And I think in some cases, not in all cases, it might even be hard for Google to understand why the machine makes a certain change.

[00:41:49] But basically they say they feed the machine problems, and then the machine figures out what the best solution is.

[00:41:56] Will: [00:41:56] Yeah. So again, I would probably split that. I think there’s a third era in this. There’s the first era, which is: we know why we’ve tuned the machine to work this way.

[00:42:07] Right. So, like you say, Amit Singhal’s intuition. Definitely there’s that era before. And then machine learning is the end state. I think there’s an era in the middle, which is: it’s still deterministic algorithms that we understand, we’ve tuned the parameters, but we don’t know why this parameter is better than that parameter.

[00:42:29] So I think they did go through a phase in the middle where they were repeatedly testing different tweaks to those parameters. And in fact, there’s a great comment I found buried in an old Hacker News thread from a Googler who had left a few years prior. I think he was there in 2008, something like the late 2000s.

[00:42:46] He said he had an eye-opening moment where he wrote some code. He had a version that he thought was going to work, and he went to someone like Amit, who said that actually he wanted him to take the square root of one of these parameters. And he said, “Why? I don’t really get it.”

[00:43:09] “I don’t have any intuition about why the square root would be better.” And the answer wasn’t really based in information retrieval. It was like, “Oh, well, we just need this kind of performance. And then what we’ll do is we’ll test it. We’re going to see if these search results are better with the parameter like this.”

[00:43:28] “And if they are, we’ll use it. We don’t need a reason why. And maybe it’s not a square root. Maybe we take a log, right? All we need is to dampen it a bit.” And just that concept... So I think in that first era, they knew what the parameters were and why they were those values. In the middle era,

[00:43:48] they knew what the parameters were, but not why. And so they probably could say, “We don’t know why this produces that search result, but it does. And we know that this page outranks that one because this heavily weighted ranking factor outweighs all these others.” At the machine learning end, there are billions of parameters, right?

[00:44:09] They don’t even know, or you can’t even hold in a human brain, what the parameters are, never mind what their values are. And they don’t necessarily map to anything sensibly human. We don’t have language for these things. We don’t have words for these things. So you can’t just say this group of parameters relates to

[00:44:26] quality, this group relates to relevance, this group relates to authority. You could do that in the old eras, but in the machine learning era, there are just billions of parameters, and the machine is just spotting patterns that we can’t.
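
The square-root-or-log anecdote is about dampening: shrinking very large values of a signal so that one item does not dominate. A minimal sketch of that effect follows. The raw signal values are made up, and `top_share` is a hypothetical helper used only to show how much of the total the largest value holds after each transform; in the anecdote, the choice between transforms was settled by testing search results, which this sketch does not attempt.

```python
import math

# Candidate dampening functions: no dampening, square root, and log(1 + x).
TRANSFORMS = {
    "identity": lambda x: x,
    "sqrt": math.sqrt,
    "log1p": math.log1p,
}


def top_share(values, transform):
    """Share of the total transformed signal held by the largest value."""
    transformed = [transform(v) for v in values]
    return max(transformed) / sum(transformed)


# Hypothetical raw signal with one dominant value; the numbers are made up.
raw_signal = [1, 4, 9, 10_000]

for name, fn in TRANSFORMS.items():
    share = top_share(raw_signal, fn)
    print(f"{name:>8}: largest value holds {share:.1%} of the total")
```

Both transforms keep the ordering of the values intact while reducing the gap between them, and the log dampens more aggressively than the square root.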

[00:44:39] Kevin: [00:44:39] 100%. That’s why, to me, a way to make that more understandable are the quality rater questions, or the quality rater guidelines, and the Panda questions that they release from time to time.

[00:44:49] That, to me, kind of tries to bring to the point what these intangible, fluid types of signals are that the machine might be looking at. And I think,

[00:44:59] Will: [00:44:59] And what they’re optimizing for, isn’t it? I think that’s the thing: they haven’t yet built the oracle, right? Google’s machine is not perfectly representing Google’s intent.

[00:45:10] Whereas the quality rater guidelines actually show you what they’re trying to do. And I think, yes, actually I didn’t mention it earlier, but that is another source of our test ideation. So thinking, not specifically the rater guidelines, though that might be one of them, but that kind of, you know, what would we change if we were trying to impress the people who write the Google algorithm?

[00:45:33] Kevin: [00:45:33] Yeah.

[00:45:34] Will: [00:45:34] Right. You’re like, well, what do we think is the page they want to outperform here? And what would the test be that we think moves us closer to that ideal?

[00:45:45] Kevin: [00:45:45] 100%. I think we outlined quite a lot. But I think there’s one last question to wrap this conversation up, which is: looking at all these changes over the last 15 years of your career, the point where you’re at now, what keeps you in the game?

[00:46:03] What keeps you doing SEO after all these years, after all these eras, and after having done so much?

[00:46:10] Will: [00:46:10] Curiosity. I think fundamentally I love the fact that we’re working with something that’s meaningful, right? It’s not just an academic game. This drives business outcomes. It creates jobs.

[00:46:29] It improves all kinds of careers and that sort of thing. But actually the thing I get the kick from is the problem solving. It’s the, “Huh, that’s weird. Why might that be true?” Or, you know, what do we think might be true tomorrow?

[00:46:48] To the point of trying to extrapolate trends out of this stuff. And yeah, it’s a game that’s kept changing just fast enough to keep me engaged, I guess. It’s a great blend: technical enough, with enough human interest and enough change, that I can stay curious.

[00:47:07] Kevin: [00:47:07] Here’s to more changes in the future. Will, you’re certainly a master of the craft, and this conversation was sheer pleasure for me, and I’m sure for many listeners as well. Besides searchpilot.com, where can people find you and follow you?

[00:47:22] Will: [00:47:22] Best is Twitter. So I’m @willcritchlow on Twitter.

[00:47:25] That’s probably the platform I use most, and from there you can find any other way of contacting me. So yeah, start there, and let’s have a chat about everything from search algorithms and split testing to auction theory or basketball.

[00:47:38] Kevin: [00:47:38] Amazing. Thank you so much for coming on board.

[00:47:42] Will: [00:47:42] Thank you for having me.