Information Gain Is the 2026 Ranking Lever. Generic AI Content Is Suppressed.

The winning move is not more pages or more keywords. It is one genuinely new, first-hand thing per page. Everything else gets quietly shelved.

By Aditya Sharma · Jun 1, 2026


When both Google and Bing independently park three-quarters of your pages and hold flat for a month, that is not a plumbing bug. It is a quality verdict on originality.

— AutoKaam field measurement, May 2026

What AutoKaam Thinks

Information Gain is the central 2026 signal. Google's patent, granted in 2024, scores a page on what it adds that the rest of the results for that query do not already have. A reword of the top ten…
The reported impact is one-directional. Proprietary first-hand data pages up 15 to 25 percent, templated rewrites down 30 to 50 percent, AI-farm content down 60 to 80 percent. Sites publishing at s…
My own receipt: a dev site I run sat at 25 percent indexation, 62 of 251 URLs, flat for 30-plus days on both Google and Bing. Verified indexed, crawled, sitemaps green. The block was originality, n…
The fix is a first-hand hook on every page. A real number you measured, a thing that broke and how you fixed it, a primary-source fact a competitor cannot copy. This is the one durable moat against…

-60 to 80%

Reported ranking impact on AI-farm content under Information Gain

Named stake: content operators and AI writers

The lever that ranks a page in 2026 is not its length, not its keyword density, and not how many pages you publish a week. It is Information Gain: whether the page carries something the rest of the results for that query do not already have. Everything I run, and everything I watch get suppressed, points at the same verdict. Originality is the signal now. Volume is a liability.

This is the operator read on why generic AI content is quietly dying in the index, with the receipt off a property I operate.

The evidence

Google's Information Gain patent was granted in 2024. The mechanic is blunt: a page is scored against the existing corpus of results for a query, and it earns a position by what it adds on top of what is already ranking. Proprietary data, a lived case, an original framework, a freshness hook, a named expert with real involvement, all of these raise the score. A reworded version of the current top ten adds nothing the model cannot already find, so it scores zero.

The reported impact is one-directional and harsh:

Content type	Reported ranking impact
Proprietary, first-hand data pages	+15 to 25%
Templated rewrites of the SERP top ten	-30 to 50%
AI-farm content, scaled, no value-add	-60 to 80%

It does not stop at the patent. The Quality Rater Guidelines update of September 2025 tells human raters to give the lowest rating to main content that is auto-generated with little added value, and to actively look for the tells: invented references, inconsistent sentences, mass-production patterns, no editorial review. Scaled Content Abuse is now codified policy, and the target is explicitly the behaviour, not the tool. Sites pushing hundreds or thousands of pages off one scaffold with no editorial oversight saw 50 to 80 percent traffic drops through early 2026.

The thing to internalise is that the penalty is behavioural, not anti-AI. Google has said authorship by a model is not itself a signal. What gets you is the pattern that correlates with lazy AI use: no first-hand element, a title shared with fifty other pages on the same site, an author with no verifiable bio, a date refreshed without a substantive edit. You can trip every one of those with a human writer too. The model just makes it cheap to trip all of them at once.
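One of those tells, a date refreshed without a substantive edit, is easy to catch on your own side before an engine does. A minimal sketch, assuming you keep the previous body text of each page. The function name and the 0.98 cutoff are illustrative choices of mine, not anything Google or Bing publishes.

```python
from difflib import SequenceMatcher

def is_cosmetic_refresh(old_body: str, new_body: str, threshold: float = 0.98) -> bool:
    """Return True when two versions of a page are nearly word-for-word identical.

    threshold is an assumed cutoff, not a documented value from any search engine.
    """
    # Compare word sequences so whitespace-only changes do not count as edits.
    matcher = SequenceMatcher(None, old_body.split(), new_body.split(), autojunk=False)
    return matcher.ratio() >= threshold

# Hypothetical page versions, for illustration only.
old = "Set the pool size to ten and restart the worker."
touched = "Set the pool size to ten and  restart the worker."
rewritten = "Pool size ten exhausted our connections in production, so we dropped it to four."

print(is_cosmetic_refresh(old, touched))    # True: only whitespace changed
print(is_cosmetic_refresh(old, rewritten))  # False: the text was actually rewritten
```

If the check returns True and the page's dateModified moved anyway, the refresh was cosmetic and the date should not have changed.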

My own receipt

I run a portfolio of content sites. One of the newer ones, a developer-tools site, became the cautionary half of this story.

The plumbing was textbook. Verified as indexed and crawled on both engines, all sitemaps returning success, IndexNow firing on every publish, robots clean, a valid dateModified on every page. By every dashboard a junior would open, the site was healthy.

It sat at 25 percent indexation. Sixty-two of 251 content URLs in the index, holding flat for more than 30 days, on Google and Bing at the same time. When two independent engines each decide to park three-quarters of your pages and then do not move for a month, you can stop hunting for a crawl bug. This is a quality verdict.
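The arithmetic behind that figure is worth keeping in a script so the number gets tracked week over week instead of eyeballed. A minimal sketch, assuming the indexed and submitted counts are read by hand from each engine's webmaster console. The function name is mine, not any tool's API.

```python
def indexation_rate(indexed: int, submitted: int) -> float:
    """Share of submitted content URLs that an engine reports as indexed."""
    if submitted <= 0:
        raise ValueError("submitted must be positive")
    return indexed / submitted

# Counts from the dev site described above.
rate = indexation_rate(62, 251)
print(f"{rate:.1%} indexed, {1 - rate:.1%} parked")  # 24.7% indexed, 75.3% parked
```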

The diagnosis was not technical and it was not length. The site's median article was over a thousand words. The problem was that the articles were generic AI rewrites of documentation for a stack I run in production every single day. Every page had zero first-hand element. Not one of them said anything you could not have written without ever touching the stack. To an Information Gain model, a page like this is a duplicate of the corpus it already has, so it gets crawled, indexed in dribs and drabs, and ranked nowhere. This is exactly what soft-suppression looks like from the operator's chair. There is no penalty notice. The pages just quietly do not count.

The fix that I am still rolling out is not a plugin and not a schema tweak. It is rewriting each weak page around something only I can say, mined from my own operating logs. The first rebuild swapped doc-paraphrase for the actual numbers and the undocumented gotcha I hit running that exact stack. This is the move. Everything else is theatre.

What Information Gain actually means

Strip the patent language and Information Gain reduces to one test you can run on any draft. Read the page and ask: what is on here that a reader could not get from the ten results already ranking for this query?

If the honest answer is nothing, the page is dead on arrival. A real hook is one of these:

A number you measured yourself. Not a stat you found and re-cited, a figure off your own property, your own test, your own ledger.
A failure you actually had. The thing that broke, the wrong turn, the cost you paid, and how you got out. Models cannot invent your specific scar tissue, and when they try, it reads fake.
A primary-source fact a competitor structurally cannot copy. Your own field data, an original framework you built, a defensible opinion with your name on it.

Notice what is not on that list. Word count is not a hook, and Google has said so directly: a thin page made longer is still a thin page. An About link is not a hook. Schema markup gets you SERP features, not a ranking. A title following the same pattern as fifty siblings is the opposite of a hook: it is the exact fingerprint of scaled abuse the raters are told to flag. The whole category of "rewrite the top result in your own words" is now a negative move, because the model can see it is a reword and discount it accordingly.
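A crude first pass at that test can be automated, with heavy caveats. The sketch below measures what share of a draft's three-word phrases appear in none of the pages already ranking. It is an illustration of the idea only, not the patent's scoring method, and the function names and the shingle size of three are my assumptions. It catches copied phrasing, not paraphrase, so a low score is strong evidence the page adds nothing, while a high score proves nothing by itself.

```python
import re

def shingles(text: str, n: int = 3) -> set[tuple[str, ...]]:
    """Lower-cased word n-grams of a text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def novelty_share(draft: str, ranking_pages: list[str], n: int = 3) -> float:
    """Fraction of the draft's n-grams that appear in none of the ranking pages."""
    draft_shingles = shingles(draft, n)
    if not draft_shingles:
        return 0.0
    seen: set[tuple[str, ...]] = set()
    for page in ranking_pages:
        seen |= shingles(page, n)
    return len(draft_shingles - seen) / len(draft_shingles)

# Hypothetical texts, for illustration only.
ranking = ["Restart the worker after changing the pool size."]
copied = "Restart the worker after changing the pool size."
hooked = copied + " At that setting our staging box ran out of connections."

print(novelty_share(copied, ranking))      # 0.0: every phrase is already ranking
print(novelty_share(hooked, ranking) > 0)  # True: the added sentence is new
```

The human question still decides. The script only tells you quickly when the answer is obviously "nothing".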

The operator playbook

The method I am applying across the portfolio is two-part, and this article is the method applied to itself.

First, audit for the absence of first-hand signal, not for length or keywords. On the dev site, the tell was simple: pages with one or zero genuine first-person observations were the ones stuck out of the index. So the rule became: no page ships without a first-hand hook, full stop. No hook, no publish. A pre-publish check scores each draft on the unhelpful markers (no original element, a templated title, an unverifiable author, a duplicate of the corpus) and hard-blocks anything that fails. The point is to make the lazy page impossible to ship, not to wish for better writing.
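A gate of that kind could be sketched as follows. This is an illustration under stated assumptions, not the check described above: the Draft fields, the three-word title prefix, and the sibling limit of five are all hypothetical, and the duplicate-of-the-corpus marker is left out because it needs the ranking pages as input.

```python
from dataclasses import dataclass, field

@dataclass
class Draft:
    title: str
    author_bio_url: str  # empty string when there is no verifiable bio
    first_hand_hooks: list[str] = field(default_factory=list)  # logged by the writer

def title_prefix(title: str, words: int = 3) -> str:
    """First few words of a title, lower-cased, as a rough template fingerprint."""
    return " ".join(title.lower().split()[:words])

def failed_checks(draft: Draft, sibling_titles: list[str], max_siblings: int = 5) -> list[str]:
    """Names of the markers this draft trips; an empty list means it may ship."""
    failures = []
    if not draft.first_hand_hooks:
        failures.append("no first-hand hook")
    prefix = title_prefix(draft.title)
    same_pattern = sum(1 for t in sibling_titles if title_prefix(t) == prefix)
    if same_pattern > max_siblings:  # assumed limit, tune per site
        failures.append("templated title")
    if not draft.author_bio_url:
        failures.append("unverifiable author")
    return failures

# Hypothetical site and drafts, for illustration only.
tools = ("Redis", "Nginx", "Postgres", "Docker", "Caddy", "Grafana")
siblings = [f"How to install {tool} on Ubuntu" for tool in tools]
lazy = Draft(title="How to install Valkey on Ubuntu", author_bio_url="")
fixed = Draft(
    title="What broke when we moved the queue to Valkey",
    author_bio_url="https://example.com/authors/me",
    first_hand_hooks=["the failure logged during the move"],
)

print(failed_checks(lazy, siblings))   # all three markers trip
print(failed_checks(fixed, siblings))  # []
```

Wiring failed_checks into the publish step as a hard stop, rather than a warning, is what turns the rule into a gate.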

Second, mine your own history for the specifics a model cannot fabricate. This is where an operator with real receipts beats an infinite AI content farm, and it is the entire moat. A content factory can generate ten thousand pages an hour, but it cannot generate your measured 25 percent indexation, the exact thing that broke in your production stack last Tuesday, or the cost you actually paid for a service. I keep a first-hand operator log precisely so that every article can be anchored to something true and unique, the same way I treat a git-versioned memory substrate as an asset competitors cannot replicate. The factory's scale is its weakness. It produces exactly the duplicate-of-the-corpus content the 2024 patent is built to discount, at volume, which is why scaled AI sites took the worst of the 60 to 80 percent drops.

The same discipline pays off on the AI-search side, not just classic ranking. Clean, original, well-formed pages are what get cited by ChatGPT Search, Copilot, and Perplexity, which read off the same indexes. Originality is now the shared currency for both blue links and answer-engine citations. Get the Bing-side plumbing right too, because that index is the pipe into AI search, but understand that plumbing only delivers pages worth citing if the content carries gain in the first place.

The verdict

In 2026 the winning lever is not more content and it is not more keywords. It is Information Gain, one genuinely new and first-hand thing per page, and generic AI rewrites get soft-suppressed precisely because a model can tell they add nothing. The numbers are not subtle. First-hand proprietary pages up 15 to 25 percent, templated rewrites down 30 to 50, AI-farm content down 60 to 80. I watched it happen on a property I run: 25 percent indexation, flat for a month, held down by a lack of originality and not by any pipe.

The instruction that follows is simple and most operators will still resist it because it does not scale on a button press. Before you publish another page, find the one true thing on it that no competitor can copy. If there is not one, do not ship it. The infinite-AI-content era has a floor, and the floor is whatever you have actually measured, broken, paid for, or proven yourself.

Topics

#SEO #Google

