How Google tests new content in the search results

This article is a case study that shows how Google ranks new content for different keywords to test its quality.

Nov 02, 2020

We know anecdotally that Google is continuously testing content and shuffling the search results, but it’s tough to find hard data about it. I used Ryte to pull all available data from Search Console and show, using three articles as examples, how Google tests the relevance and quality of content.

Google tests new content at different positions for all queries it deems relevant. Interestingly, its understanding of which queries are relevant is very imprecise at first and gets better over time. Content first ranks for many queries at lower positions, and then, if it is of high quality, for fewer queries at higher positions.

I looked at three articles from my blog and used Ryte to pull Search Console data for this small analysis. Ryte also tells me which outliers outperform the average CTR and how many URLs rank for the same keyword - super useful. Of course, take it with a grain of salt because n = 3 isn't much.

We see the same pattern over and over: a sharp spike followed by a drop and a gradual increase (if things go well). Note that the screenshots below show the number of keywords for the URL.
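To make the "number of keywords" metric concrete, here is a minimal sketch of how it could be computed from a Search Console export for a single URL. It assumes each row is a `(date, query, impressions)` tuple; that layout, the function name `keywords_per_day`, and the sample rows are illustrative assumptions, not Ryte's actual method or data from the screenshots.

```python
from collections import defaultdict


def keywords_per_day(rows):
    """Count distinct queries with at least one impression per date.

    rows: iterable of (date, query, impressions) tuples for one URL.
    This row layout is an assumption made for illustration.
    """
    queries_by_date = defaultdict(set)
    for date, query, impressions in rows:
        if impressions > 0:
            queries_by_date[date].add(query)
    # Sort by date so the result reads as a time series.
    return {date: len(queries) for date, queries in sorted(queries_by_date.items())}


# Made-up rows for illustration only, not data from the article.
sample_rows = [
    ("2020-10-01", "query a", 3),
    ("2020-10-01", "query b", 1),
    ("2020-10-01", "query c", 2),
    ("2020-10-02", "query a", 5),
    ("2020-10-03", "query a", 4),
    ("2020-10-03", "query b", 2),
]

print(keywords_per_day(sample_rows))
```

Plotting the resulting counts over time would give a curve comparable in spirit to the keyword charts in the examples.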

Example 1: Why I left Substack and the email renaissance

Total number of keywords for my Substack article in Ryte

Example 2: The impact of GPT 3 on Google Search, a complex adaptive system

Total number of keywords for my GPT 3 article in Ryte

Example 3: Internal linking - the full guide to internal link axioms

Total number of keywords for my internal linking article in Ryte

Natural SERP volatility

This behavior creates natural volatility in the search results. Google constantly shuffles the results, measures the impact, and then recalibrates. The higher a page ranks for a query, the lower the chance that it drops again, but it's not impossible.

Compare the rankings of the two keywords, "growth levers" and "land and expand saas", for example.

The ranking for "growth levers" is steady once it reaches the top position

The ranking for "land and expand saas" fluctuates heavily

As you can see, Google is confident that my article about "growth levers" qualifies for a top position. The ranking only fluctuates in the beginning. "land and expand saas" is a different beast, however. Here, Google constantly switches my content out for other pieces to see if they're better.
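One simple way to put a number on "steady" versus "fluctuates heavily" is the standard deviation of a keyword's daily positions. The sketch below is an assumption about how one might measure this, not something the analysis above relies on; the two position series are invented for illustration and are not the real rankings of either keyword.

```python
from statistics import pstdev


def position_volatility(positions):
    """Population standard deviation of a keyword's daily ranking positions.

    A value near 0 means the ranking barely moves; larger values mean
    the page is being shuffled around more.
    """
    return pstdev(positions)


# Invented daily positions for illustration only.
steady = [9, 4, 2, 1, 1, 1, 1, 1]           # settles at the top and stays there
fluctuating = [12, 5, 18, 7, 25, 6, 15, 9]  # keeps getting swapped in and out

print(position_volatility(steady) < position_volatility(fluctuating))
```

Computing the value over a rolling window instead of the whole series would separate the initial testing phase from later behavior.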

Impressions and # of keywords have a high correlation

Impressions closely follow the number of keywords. It makes sense: the more keywords you rank for, the more impressions you get.
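To check how closely two such series move together, one could compute the Pearson correlation coefficient between daily impressions and daily keyword counts. The following is a minimal sketch with a hand-written `pearson` helper; the function name and the sample numbers are assumptions for illustration and do not come from the Ryte data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Invented daily values for illustration only.
keyword_counts = [40, 85, 60, 35, 30, 38, 45]
impressions = [120, 260, 170, 100, 95, 110, 140]

print(round(pearson(keyword_counts, impressions), 3))
```

A coefficient close to 1 would support the reading that impressions rise and fall with the number of ranking keywords.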

At the same time, the average position across all queries runs counter to the number of keywords in the beginning, because new content tends to rank lower until Google has figured out where to place it.

Average ranking vs. impressions and # of keywords in Ryte

Keyword testing

Ryte also allows me to track and compare specific keywords that a page ranks or did rank for.

Keyword comparison for my Substack article

Google recognizes that my article about Substack and the email renaissance is very helpful for the query "mailchimp vs substack", even though the title doesn't contain that query. This is the principle of user intent at work. The article is not very relevant for "paid newsletter", though, which is why it ranks lower and lower for that keyword from about four weeks after publishing.

For some queries, Google can identify the relevance and authority right away and rank a page high. I call them "surefire" keywords (see an example above).

Based on its understanding of entities, Google "understands" that my article is relevant for keywords like "google gpt3" but not for "gpt-3" itself. For that keyword, the article never got into the top 10 and started tanking about one month after publishing (see above).

Competitive and high-volume queries often have a sort of "probation period". About two weeks after publishing, the article starts to rank for high-volume queries (see below). Sometimes the rankings start on page 10 ("internal links audit"), sometimes on page one ("internal links best practices").

Keyword ranking comparison between "internal linking best practices" and "internal links audit"

Once rankings have "grooved in", they don't fluctuate as much anymore. Google has made a decision about where the page fits in for a keyword and sticks to it until the page gains more links or someone else publishes a piece of content that competes with it (or Google updates its understanding of entities).

Conclusion: content needs to prove itself; nothing is guaranteed

I observed that it takes Google 3-4 days to figure out where content should rank initially. From there, Google keeps testing how the piece of content performs for different keywords throughout its lifecycle, one impression at a time.

In the beginning, the testing is very broad and "wild", then it becomes more refined. Notice in the screenshot above, for example, how Google ranked the article for "stratechery podcast" and then realized it's not a good result. What's happening is that Google adds new content to its index and then needs to determine its relevance for many queries. That takes a while.

The way I explain this to myself is that Google regularly updates its understanding of the relationship between entities and relevance of content for queries. This is also why we see almost monthly updates and often fail to understand what they did. I think about it as an update of the definition of relevance.

Growth Memo
