The hidden Force of Nonlinearity in Digital Marketing

Nonlinearity appears everywhere in Marketing but is hard to spot intuitively. Recognizing it helps you sort all kinds of problems. Here's how!

Humans thrive on structure and predictability - but the reality is that most things in life have an unclear relationship between cause and effect. They are nonlinear.

For example, the ...

… distribution of PageRank.

… relationship between ad spend and conversions.

… market share of the biggest tech companies (FAANG).

… number of clicks on a featured snippet versus the rest of the search results.

… increase of customer lifetime value in comparison to retention.

But our brains aren’t wired for Nonlinearity [1]. We can’t intuitively recognize it, mainly because nonlinear functions have no average. That creates a big problem: there is a powerful law at play that we don’t talk about. The consequence? We often face problems in Marketing that we can’t explain and make wrong choices based on our linear assumptions.

So, by learning more about the laws and principles behind Nonlinearity, you become a better marketer (maybe even a better human but the margin for error is too high for me to promise).

The most helpful ways to conceptualize Nonlinearity are the Pareto Principle and Power Laws. I’m going to use them throughout the article to show you how Nonlinearity occurs in Digital Marketing. If you haven’t heard of them or want to freshen up your understanding, you should read:

Power Laws and the Pareto Principle

The four basic types of power laws

Before we jump in, you need to get an understanding of the four basic types of Power Laws so you can recognize them.

1. Exponential growth

A graph showing exponential growth

You have probably encountered f(x) = x^b either in school as an exponential function or in economics as exponential growth. I didn’t like it either at the time but now understand how valuable it is.

All startups strive to grow their customer base and revenue by a multiple of what they put in*. Because most startups have low to zero marginal cost, that equation works out if they have product/market-fit. The ways to get there are Flywheels and Network Effects. Both have a very similar effect that’s deeply rooted in this Power Law: increasing returns with constant input. In plain terms: more bang for your buck.

*Note that innovation grows in s-curves, not exponential functions.

2. Concavity

The negative concavity graph; decreasing slowly, then suddenly

Concavity (the technically correct term is “negative concavity”) is usually not a good deal. It’s the opposite of exponential growth: concavity has no upside, only downside, whereas convexity (exponential growth) has upside.

3. Diminishing Returns

The diminishing Returns graph; climbing quickly, then tapering off

Diminishing Returns are the stagnation of growth. This power law is an inverse exponential function and increasingly comes up in the Marketing world as the internet accelerates the decay of tactics. Andrew Chen called it “the law of shitty click-throughs”, a.k.a. less bang for your buck.

4. Long-Tail

The long-tail graph; falling sharply, then slowly

Most marketers, especially SEOs, are familiar with the concept of “long-tail keywords”. In a nutshell, it reflects the idea that some networks are concentrated at a few nodes and widely spread over many others.

More about The Long-Tail under “Long-tail or Pareto Principle?”.

If you’re curious to dive deeper into Power Laws, check out Power Laws and the Pareto Principle - Powerful Ideas.

With that taken care of, let’s jump into the meat and bones!

How to apply power laws to marketing

In 2014, CB Insights set out to analyze 750+ research briefs to look for power laws [2].

They found that

  • 73% of articles generated less than 1000 page views
  • 10% of content led to 53% of page views
  • the top 20% articles got 69% of page views

CB Insights used these lessons to understand what topics, formats, visualizations, and promotional tactics worked best. As such, we can say that power laws help us understand what to focus on in Marketing. They’re a tool to create but also to analyze and understand.
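
If you want to run a similar check on your own content, here is a minimal Python sketch. The page-view numbers are made up for illustration, and top_share is a hypothetical helper of mine, not something CB Insights published.

```python
def top_share(values, top_fraction):
    """Return the share of the total contributed by the top `top_fraction` of items."""
    if not values:
        raise ValueError("values must not be empty")
    ranked = sorted(values, reverse=True)
    # Always count at least one item so tiny lists still work.
    top_n = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / sum(ranked)


# Hypothetical page views for ten articles (illustrative numbers only).
page_views = [12000, 4000, 1500, 900, 600, 400, 300, 150, 100, 50]

print(f"Top 10% of articles: {top_share(page_views, 0.10):.0%} of page views")
print(f"Top 20% of articles: {top_share(page_views, 0.20):.0%} of page views")
```

Swap in an export from your analytics tool. If the top 10-20% of articles carry most of the page views, you are looking at a power law.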

Let’s look at a couple of use-cases in which you can apply Power Laws.

Digital Marketing

A classic example you see all over the web when googling “Pareto Law in Marketing” is to focus on the 20% of customers who bring 80% of the revenue. While that’s not untrue, I’d take it a step further:

The 20% of the 20% of your most active users are your early adopters.

The 20% of those (one level deeper) are your power users.

Getting to the core of your most engaged users reveals opportunities to grow quickly in the beginning, strengthen Product-Market Fit, run a/b tests, and get a deeper understanding of your target audience.
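
To get a feel for how small these nested groups are, here is a quick sketch. It assumes a hypothetical base of 10,000 users and applies the same 20% cut three times. nested_segments is an illustrative helper, and the 20% ratio is a rule of thumb, not a measured value.

```python
def nested_segments(total_users, fraction=0.2, levels=3):
    """Apply the same cut repeatedly and return the size of each nested segment."""
    sizes = []
    remaining = total_users
    for _ in range(levels):
        remaining = round(remaining * fraction)
        sizes.append(remaining)
    return sizes


# Hypothetical user base of 10,000 (illustrative only).
most_active, early_adopters, power_users = nested_segments(10_000)

print(f"Most active users: {most_active}")
print(f"Early adopters:    {early_adopters}")
print(f"Power users:       {power_users}")
```

With these assumptions, the power users are less than 1% of the user base, which is a small enough group to talk to individually.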

That’s helpful in two kinds of companies:

  1. Early-stage startups that are looking for rapid scale.
  2. Mature companies that want to revert feature-bloat or create a new product line for their core audience.

Look for markets that carry most of your relevant customers. When you’re starting out (early-stage startups) or expanding (internationalization), focusing on the core markets that yield the highest returns is your best bet.

Depending on whether you’re selling to companies or consumers, markets can be

  • Cities
  • Countries
  • Consumer segments (personas)
  • Business sizes (SMB, mid-market, enterprise)

When expanding, for example, looking for the few cities or market segments that give you the best returns is the optimal way to start out and then move into less lucrative areas.

Focus on the few marketing channels that deliver the highest returns. Most startups scale through one channel, e.g. SEO or referral-marketing. But mature companies can benefit from that mindset as well.

The best metrics to measure the efficacy of a channel are users, leads, or revenue. When you’re short on resources, focus on what brings the highest ROI.

Lastly, when you’re working on an MVP (minimum viable product), build it for the 20% of your TAM (total addressable market) who are likely to adopt your product first. Maybe that’s less unique to marketing and more applicable to product development unless you consider distribution part of marketing.

According to Geoffrey Moore’s concept of the Chasm, you first need to land with early adopters before you can spread to other market segments. If you focus on serving the whole market too early, you might risk missing out on the early adopters and never get any traction. The key point here is that a product for early adopters looks way different than the one it evolves into when it targets the masses.


SEO is full of power laws and opportunities to leverage them. Nonlinearity everywhere!

The best use-case is probably the disproportionate distribution of clicks across the SERPs (search engine results pages). You get way more clicks when ranking at #1 than at #2, and you get hardly anything for ranking below #5. The curve is even steeper when the SERP shows a featured snippet or knowledge graph integration.

That’s not news.

The hidden Nonlinearity lies in the difference between the clicks you get for the first couple of positions. Because #1 sends you so much more traffic than #2, it’s often more lucrative to get keywords on positions #4-10 into the top 3 than keywords from the second to the first page on Google.

Read that again.

The common assumption trap is to try to bring keywords that rank on the second page on Google to the first page. It is easier than getting into the top 3 but also less lucrative.
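
A small sketch makes the difference tangible. The click-through rates below are assumptions I picked to resemble a typical steep CTR curve. They are not measured data, so replace them with the numbers from your own Search Console export.

```python
# ASSUMPTION: illustrative click-through rates per position, not measured data.
ASSUMED_CTR = {1: 0.30, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05,
               6: 0.04, 7: 0.03, 8: 0.025, 9: 0.02, 10: 0.018}
ASSUMED_PAGE_TWO_CTR = 0.005  # assumed flat rate for position 11 and beyond


def expected_clicks(monthly_searches, position):
    """Estimate monthly clicks for a keyword at a given position."""
    ctr = ASSUMED_CTR.get(position, ASSUMED_PAGE_TWO_CTR)
    return monthly_searches * ctr


def click_gain(monthly_searches, old_position, new_position):
    """Extra monthly clicks from moving a keyword to a better position."""
    return (expected_clicks(monthly_searches, new_position)
            - expected_clicks(monthly_searches, old_position))


searches = 10_000  # hypothetical monthly search volume
page_two_to_page_one = click_gain(searches, old_position=14, new_position=9)
mid_page_to_top_three = click_gain(searches, old_position=6, new_position=2)

print(f"#14 -> #9: {page_two_to_page_one:.0f} extra clicks")
print(f"#6  -> #2: {mid_page_to_top_three:.0f} extra clicks")
```

Under these assumed rates, the jump into the top 3 is worth several times more clicks than the jump from page two to page one, even though both moves cover a similar number of positions.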

Traffic and clicks for

No other example illustrates this better than when we went from position #3 to #1 for the keyword “devops” with our DevOps microsite at Atlassian (screenshot above). Notice in the screenshot how the ranking slowly increases (green line) and how organic traffic shoots up when position 1 is hit.

Search results for “devops”, 9/12/19

In the TIPR Model, I outline why internal link graphs of websites are usually nonlinear and tend to reflect power laws:

The first two factors hint at a general problem with internal link graphs: they’re inherently imbalanced. I mentioned this earlier in conjunction with SEO power laws.

Broadly, you should strive for a more equal distribution of PageRank, with a slight preference for conversion pages. I call that the “Robin Hood principle”: you take PageRank from the strong and give it to the poor.

You’ll often find that a few pages get way more internal and external links than others. It pays off to distribute that link power more evenly throughout the site, which should be followed by a higher crawl rate and organic rankings.

Slide from Tech SEO Boost 2018 on which I show that a few pages on get much more links than others

Also, a few pages are crawled more often than others. PageRank is one of the biggest drivers of crawl rate: because a few pages get the most backlinks, they’re also crawled more often. That’s not optimal and should be corrected, as I outline in the TIPR model. The screenshot below highlights how the majority of pages are being crawled once a week.

Slide from Tech SEO Boost 2018, where I presented the TIPR model and showed a crawl analysis of

20% of your organic keywords bring 80% of your traffic. When I mapped the clicks per keyword of a startup that gets ~2.5M monthly visits to its site, I saw a clear power law (see below). Note that I took out brand keywords because they attracted so many clicks that they made the graph impossible to read.

Mapping clicks per keyword for large survey site with millions of visitors

The same type of fat-tailed curve appeared for a large site in the health space:

Mapping clicks per keyword for large health site

I challenge you to try this for yourself. Export your top keywords, remove brand keywords and map against clicks on a scatter plot chart.

That’s the long-tail power law in action!
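
If you would rather check the numbers than eyeball a chart, the sketch below counts how many non-brand keywords it takes to reach 80% of clicks. The keyword list, the brand term "acme", and the helper name are all made up for illustration.

```python
def keywords_for_share(rows, brand_terms, target_share=0.8):
    """Return (keywords_needed, total_keywords) to reach `target_share` of non-brand clicks."""
    # Drop brand keywords first, as they would dominate the distribution.
    non_brand = [
        (keyword, clicks) for keyword, clicks in rows
        if not any(term in keyword.lower() for term in brand_terms)
    ]
    ranked = sorted(non_brand, key=lambda row: row[1], reverse=True)
    total = sum(clicks for _, clicks in ranked)
    running = 0
    for count, (_, clicks) in enumerate(ranked, start=1):
        running += clicks
        if running >= target_share * total:
            return count, len(ranked)
    return len(ranked), len(ranked)


# Hypothetical keyword export: (keyword, clicks). "acme" stands in for the brand.
rows = [
    ("acme login", 9000), ("survey maker", 5000), ("online survey", 2500),
    ("free survey tool", 900), ("survey templates", 600),
    ("nps survey questions", 400), ("survey examples", 300),
    ("likert scale", 150), ("poll maker", 100), ("quiz builder", 50),
]

needed, total = keywords_for_share(rows, brand_terms=["acme"])
print(f"{needed} of {total} non-brand keywords bring 80% of clicks")
```

In this toy data set, a third of the keywords carry 80% of the clicks. On a real site with thousands of keywords, expect the head to be a much smaller share of the list.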

Let me take this to the next level:

If it holds true that 20% of keywords bring 80% of traffic you can simply look at the strongest keywords of your competitor(s) to find their jewels.

Here’s how to do this with AHREFS’ organic keywords report

  1. Filter the keywords for position 1-11 to get only their best rankings.
  2. Exclude the brand from keywords (not URLs).
  3. Sort the list by traffic.

AHREFS organic keywords report. Notice how the traffic per keyword is much higher for the first few and then drops - another power-law distribution.

Boom, you got the top competitor keywords at the top of the list!
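
If you export the report and prefer to do this outside the tool, the three steps could be sketched like this. The field names ("keyword", "position", "traffic") and the sample rows are hypothetical, not the actual AHREFS export format, so map them to whatever your export contains.

```python
def top_competitor_keywords(rows, brand, max_position=11):
    """Keep rankings in positions 1..max_position, drop brand keywords, sort by traffic."""
    kept = [
        row for row in rows
        if 1 <= row["position"] <= max_position          # step 1: best rankings only
        and brand.lower() not in row["keyword"].lower()  # step 2: exclude the brand
    ]
    # Step 3: highest-traffic keywords first.
    return sorted(kept, key=lambda row: row["traffic"], reverse=True)


# Hypothetical export rows; "acme" stands in for the competitor's brand.
export_rows = [
    {"keyword": "acme pricing", "position": 1, "traffic": 8000},
    {"keyword": "sprint planning", "position": 9, "traffic": 300},
    {"keyword": "gantt chart template", "position": 15, "traffic": 700},
    {"keyword": "project management software", "position": 2, "traffic": 5200},
    {"keyword": "kanban board", "position": 4, "traffic": 1900},
]

for row in top_competitor_keywords(export_rows, brand="acme"):
    print(f'{row["traffic"]:>6}  #{row["position"]:<2}  {row["keyword"]}')
```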

Now you can reverse engineer how your competitor’s best URLs get the top 20% of their best links and systematically go after them.

Let’s turn this around: if 20% of keywords deliver 80% of traffic to your site, 20% of pages get 80% of traffic. That should help you prioritize which pages to monitor and optimize first, especially when you deal with a large number of pages.

Another way to look at it is to say 20% of pages generate 80% of revenue, especially for centralized sites (more under Long-tail or Pareto Principle?).


Just as described in the example at the beginning of this chapter, Power Laws can tell you what content is best optimized, linked and what your audience might be most interested in.

Sessions per landing page of

Looking at the traffic to my site, I see the common power law of a few URLs getting most traffic. To be fair, I haven’t normalized for the uptime of each URL, so there might be some noise in the data. But the point should be clear. I can now go ahead and further optimize the strongest pages or compare the 20% to the other 80% and see if I notice any patterns that would tell me why the 80% isn’t performing better.

Let’s look at a couple of statistics beyond that to understand the content marketing landscape at greater scale:

In 2018, BuzzSumo published a report that found 5% of content gets 343 shares on average - the rest gets nothing [3]. This illustrates a fine example of a Power Law (screenshot below).

BuzzSumo 2018 traffic report showing a power law for number of articles vs. number of shares.

AHREFS discovered that 91% of one billion pages get no traffic from Google - only 0.3% get more than 1,000 visits [4]!

Backlinko revealed that 94% of all blog posts get no links and only 2.2% of content gets links from multiple sites [5]! The report also validated the conclusion of the BuzzSumo report: 1.3% of articles get 75% of social shares.

In conclusion, your piece of content is either hit or miss. There is no room for mediocre content. It has to be outstanding. In the same vein, those stats should help you evaluate your own content success and the success of your competitors.


Ads, whether on Google or Facebook, reveal power laws as well.

20% of paid keywords bring 80% of traffic, just like the organic counterpart. Look for the few keywords that get the most clicks when optimizing budgets and prune the 80% that get the lowest conversions or clicks when budget is low or you want to optimize your campaigns.

According to the principle of diminishing returns, spending more money on ads will not scale indefinitely. At some point, the most relevant keywords are saturated and your bids yield fewer clicks.

The optimal way to find the sweet spot of keyword ad spend is to look at the incremental return from additional spend. In other words, look at how much more traffic or how many more conversions you get by increasing the money you bid on keywords. It takes a bit of trial and error but can save huge amounts of money.
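
Here is a minimal sketch of that calculation. The spend and conversion figures are invented for illustration, and incremental_returns is a hypothetical helper. It assumes you have tested a few strictly increasing budget levels and recorded the conversions at each.

```python
def incremental_returns(spend_levels, conversions):
    """Extra conversions per extra unit of spend between consecutive budget levels."""
    if len(spend_levels) != len(conversions):
        raise ValueError("need one conversion figure per spend level")
    returns = []
    for i in range(1, len(spend_levels)):
        extra_spend = spend_levels[i] - spend_levels[i - 1]
        if extra_spend <= 0:
            raise ValueError("spend levels must be strictly increasing")
        extra_conversions = conversions[i] - conversions[i - 1]
        returns.append(extra_conversions / extra_spend)
    return returns


# Hypothetical test results (illustrative only).
monthly_spend = [1000, 2000, 3000, 4000]
monthly_conversions = [50, 90, 110, 115]

marginal = incremental_returns(monthly_spend, monthly_conversions)
for spend, rate in zip(monthly_spend[1:], marginal):
    print(f"Up to ${spend}: {rate * 1000:.0f} extra conversions per extra $1,000")
```

In this made-up example, each additional $1,000 buys fewer conversions than the one before. The sweet spot is the last step where the extra conversions are still worth more than the extra spend.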


Power Laws appear in three counter-intuitive ways in Conversion Rate Optimization (CRO).

First, 10 user tests can tell you as much as 100. This is the principle of diminishing returns in action: after 10 user tests, the additional value you get from each additional test is so small that you don’t need more to find the big issues.
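
A rough way to see why is a simple discovery model in which every tester independently surfaces a fixed share of the usability issues. Both the model and the 31% hit rate are assumptions I’m adding for illustration, so plug in whatever fits your product.

```python
def share_of_issues_found(testers, hit_rate=0.31):
    """Share of issues found after `testers` sessions, assuming independent testers.

    ASSUMPTION: each tester uncovers `hit_rate` of all issues, independently of the others.
    """
    return 1 - (1 - hit_rate) ** testers


for n in (1, 5, 10, 100):
    print(f"{n:>3} tests: {share_of_issues_found(n):.1%} of issues found")
```

Under these assumptions, the first 10 tests already surface more than 97% of the issues, so the next 90 tests can add less than 3 percentage points.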

Second, focus 80% of experiments on the 20% of pages that generate the most revenue. Most pages on a site are not worth running experiments on because a potential improvement has no profound impact on revenue.

Third, only 1 in 10 experiments succeed. Expect most experiments to not have a statistically significant conclusion. But those that do move the needle. Be mindful of that when you fill your experiment backlog or when you doubt your approach after a series of unsuccessful experiments.
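
To put a number on those losing streaks, the sketch below takes the 1-in-10 success rate from above and assumes experiments are independent of each other, which is a simplification. chance_of_losing_streak is a hypothetical helper.

```python
def chance_of_losing_streak(experiments, win_rate=0.1):
    """Probability that `experiments` independent tests in a row all fail."""
    return (1 - win_rate) ** experiments


for n in (5, 10, 20):
    print(f"{n:>2} losers in a row: {chance_of_losing_streak(n):.0%} chance")
```

With a 10% win rate, ten unsuccessful experiments in a row happen roughly a third of the time, so a streak like that is no reason on its own to doubt your approach.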

There you have it. Those are the most common use-cases of Nonlinearity in Digital Marketing but I’m sure there are way more. If one comes to mind I’d love you to share it with me on Twitter!

I could end the article here - but there’s an interesting tension between two Power Laws that are at odds.

*cinematic trailer voice*

In a world of long-tail versus Pareto Principle...

Long-tail or Pareto Principle?

A study from 2011 found an interesting occurrence of the long-tail phenomenon in ecommerce [6].

In the experiment, the team:

  1. analyzed the product inventory of a medium-sized women’s clothing retailer that sells online and offline
  2. normalized their sample of online and offline catalogs for product availability and price
  3. measured the use of site search and recommendations with server log files to understand if they facilitate the sales of long-tail products

The results?

Internet sales were more evenly distributed across the product inventory than offline sales, even when normalized for price and availability. In other words, people buy way more niche products online than offline.

The researchers came up with two explanations:

  1. The internet can carry more products than offline retail (supply).
  2. The internet provides improved discovery through recommendation engines and personalization (demand).

Both hold true, in fact.

It’s no surprise that the internet makes more products accessible through recommendations, site search, and personalization than the offline world ever could. And that’s powerful because it gave birth to markets with a tail so fat that its sum is larger than the short-head. That fat tail is known as the long-tail.

In some markets, you can make more money from niche products than from mainstream products. And this is not even new. The long-tail phenomenon was outlined by Bryson in 1974 - “a heavy-tailed distribution has a tail that’s heavier than an exponential distribution, i.e. it goes slower to zero” - and in 2004 in a Wired article by Chris Anderson [7]:

What’s really amazing about the Long Tail is the sheer size of it. Combine enough nonhits on the Long Tail and you’ve got a market bigger than the hits. Take books: The average Barnes & Noble carries 130,000 titles. Yet more than half of Amazon’s book sales come from outside its top 130,000 titles. Consider the implication: If the Amazon statistics are any guide, the market for books that are not even sold in the average bookstore is larger than the market for those that are.

That article became so famous that Anderson turned it into a successful book.

Just like the Brynjolfsson study, Anderson’s article recommends that businesses that want to profit from the long-tail do three things:

  1. Offer users every product that’s on the market to harvest the power of the long-tail.
  2. Cut the price of long-tail products rigidly to pull users in.
  3. Help users find it.

Again, this doesn’t come as a surprise 15 years later. But what is fascinating is how this concept of the long-tail is the opposite of an exponential function.

In an environment of exponential growth, there can really be only one winner. In an environment that provides a long-tail, there’s a big opportunity to go broad with potentially multiple winners.

The key is choice. Fewer choices tend to create winner-takes-it-all markets. More, or near-infinite, choices set the foundation for a long-tail.

When consumers consider a wide variety of products, the market reflects the WTA outcome that we tend to associate with increasing returns. When consumers limit their choice, and thus hold smaller feasible sets, the market reflects the flat distribution of the traditional spatial model.[8]

The technical term for platforms with (near) unlimited choices is scale-free networks. They provide (near) infinite possibilities for a node to connect with other nodes. When possibilities are limited, connections tend to concentrate at one point.

Make no mistake, there is a lucrative short-head that comes with long-tail distributions. But, unlike in winner-takes-it-all environments or Pareto Distributions, the short-head is not the only way to high ROI.

Products on the internet, like songs on Spotify or movies on Netflix, have a long-tail. They provide so many options that they’re likely to serve niches. But when it comes to the best ride-hailing app or search engine, the market consolidates over time and one winner gets the majority of the market.

Conclusion: In case of doubt, follow the Barbell Strategy

Nonlinearity in Digital Marketing is a hidden but powerful concept. No matter which Marketing channel, Power Laws and Pareto Principles are everywhere. The awareness and understanding of them help you solve problems faster and recognize patterns.

I want to close this article with a look at when to disobey them and what to do instead. It’s not that I wrote 3,181 words just to say you should ignore Nonlinearity. It’s more that you must first learn the rules before you can break them.

Joseph Juran, who I mention in “An introduction to Power Laws and the Pareto Principle”, called the Pareto Principle the law of "The vital few and the useful many" to underline that the 80% is not always useless. Cutting them out is sometimes necessary but also a path to addiction.

There is a certain risk involved in betting on the 20% - not despite but because they hold 80% of the returns. If something goes wrong with the 20%, you lose. Say you decide to cut 80% of customers and focus on the 20% that yield the highest returns. If those 20% are just two or three accounts, you’re running a risky business.

So, what to do?

Follow a Barbell Strategy: Bet even stakes on the 20% and 80%. That’s what investors do in uncertain environments - they diversify.

Investing 50/50 with a Barbell Strategy

The Barbell Strategy is a common approach to dealing with uncertain environments, a.k.a. Nonlinearity. It has two high ends and nothing in the middle, i.e. you cut out the average.

If you have to thrive in uncertain environments, bet 50% on one channel and diversify the rest. In Pareto Terms, invest half of your input in the 20% that generate 80% of the outcome and the other half into the remaining 80%. If you can or must take risks, focus on the 20% that yield 80% of returns and cut the rest out.
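
As a rough sketch, a 50/50 barbell split across marketing channels could look like this. The channel names, the return figures, and the barbell_allocation helper are all hypothetical, and splitting the second half evenly is just one simple way to diversify.

```python
def barbell_allocation(budget, channel_returns, core_share=0.5):
    """Put `core_share` of the budget on the best channel and split the rest evenly."""
    if not channel_returns:
        raise ValueError("need at least one channel")
    best = max(channel_returns, key=channel_returns.get)
    others = [name for name in channel_returns if name != best]
    if not others:
        return {best: budget}
    allocation = {best: budget * core_share}
    per_other = budget * (1 - core_share) / len(others)
    for name in others:
        allocation[name] = per_other
    return allocation


# Hypothetical return per dollar spent for each channel (illustrative only).
returns_per_dollar = {"seo": 4.0, "paid": 1.5, "email": 1.2, "social": 0.8, "referral": 0.5}

plan = barbell_allocation(10_000, returns_per_dollar)
for channel, amount in plan.items():
    print(f"{channel:<9} ${amount:,.0f}")
```

If you can or must take more risk, raise core_share toward 1.0 and the split moves toward the pure 80/20 bet.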


  1. If you think recognizing Nonlinearity is not a problem for you, check out the wonderful examples and exercise in this HBR article from 2017:
  2. CB Insights investigation of 750+ research briefs and the Power Laws they found:
  3. BuzzSumo traffic trend report 2018:
  4. AHREFS search traffic study from 2018:
  5. Backlinko content study from 2019:
  6. Brynjolfsson et al. study from 2011 called “Goodbye Pareto Principle, Hello Long Tail: The Effect of Search Costs on the Concentration of Product Sales”:
  7. Wired article by Chris Anderson from 2004:
  8. “Winner-take-all or long tail? A behavioral model of markets with increasing returns”: