Research | April 30, 2025 | 6 min read

Two studies gave opposite answers on ideal content length for AI citations. Here is how to read both.

BetterAISearch Editorial Team

BetterAISearch

AirOps analyzed 815,484 pages and found the citation sweet spot is 500 to 2,000 words. The Digital Bloom analyzed 30 million citations and found 10,000-word pages get 62 times more citations than shorter equivalents. Both studies are correct. They are measuring different things, and understanding the difference determines how you structure content for AI search.

62x more citations for 10,000-word pages than for shorter content, in raw count. But per-query citation rate peaks at 500 to 2,000 words. The gap is rate vs volume.

Study 1: AirOps, 815,484 pages

AirOps published an analysis in April 2026 covering 815,484 retrieved pages and 16,851 ChatGPT queries. The study measured citation rate: of the times a page was retrieved, the percentage of times it was cited in the final ChatGPT response.

The sweet spot: 500 to 2,000 words with 7 to 20 subheadings. Pages in this range were cited more consistently than pages outside it, even accounting for domain authority differences.

The study also identified what it called the "ultimate guide paradox." Long-form comprehensive pages with the highest word counts, the most headings, and the highest domain authority in the dataset were among the least reliable performers in terms of citation rate. They are retrieved often. They are cited at a lower rate once retrieved.

The interpretation: ChatGPT retrieves roughly 6 to 7 pages for every one it cites, so about 85% of retrieved content is discarded before the answer is written. It is harder for the model to extract a precise, targeted answer from a 10,000-word page that covers everything. A focused 1,200-word piece that answers one question cleanly is easier to cite for that question.
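The arithmetic behind the "about 85% discarded" figure can be checked with a short sketch. The 6-to-7 range comes from the text above; the function name and the 6.5 midpoint are illustrative assumptions, not part of the AirOps study.

```python
# Rough arithmetic behind the retrieval-to-citation ratio quoted above.
# The 6-to-7 range is from the article; 6.5 is an illustrative midpoint.

def cited_share(pages_retrieved_per_citation: float) -> float:
    """Fraction of retrieved pages that end up cited."""
    return 1.0 / pages_retrieved_per_citation

for ratio in (6.0, 6.5, 7.0):
    cited = cited_share(ratio)
    print(f"{ratio} retrieved per citation: {cited:.1%} cited, {1 - cited:.1%} discarded")
```

Across the whole 6-to-7 range the discarded share stays between roughly 83% and 86%, which is consistent with the 85% figure.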

Study 2: The Digital Bloom, 30 million citations

The Digital Bloom analyzed 30 million AI citations in December 2024 to assess content format performance. The study compared a 10,000-word piece with a Flesch Reading Ease score of approximately 55 against a shorter equivalent on the same topic.

Result: 187 total citations for the long-form piece versus 3 for the shorter equivalent. That is 62 times more citations in raw count.

This appears to contradict the AirOps finding. It does not, once you look at what each study is measuring.

The reconciliation: citation rate versus citation volume

AirOps measures citation rate per query: for any specific question, what percentage of retrieved pages get cited. This rewards focused, extractable content.

The Digital Bloom measures total citations accumulated over time across many queries. A 10,000-word comprehensive page on AI search optimization answers not just one query but potentially 20 or 30 distinct sub-questions. Each time ChatGPT, Perplexity, or Google AI Overviews generates a response on any one of those sub-topics, the long-form page is a candidate.

Metric	Shorter content (500-2,000 words)	Longer content (10,000+ words)
Citation rate per query	Higher	Lower
Total citations over time	Lower	Higher
Number of queries covered	Fewer	Many
Extractability per section	Higher	Variable
Domain authority signal	Neutral	Stronger over time

Source: AirOps (n=815,484), The Digital Bloom (n=30M+ citations)

A focused 1,200-word piece on "does author attribution affect ChatGPT citations" will win the per-query citation race for that specific question. A comprehensive guide on AI search optimisation will accumulate more total citations over six months by being retrievable across dozens of related sub-queries.

Both are valid. The choice depends on whether you are targeting a specific query (short, focused) or building topical authority (long, comprehensive and modular).
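The rate-versus-volume distinction can be made concrete with a toy model. Every number in this sketch is hypothetical and chosen only to show the mechanism; none of it comes from AirOps or The Digital Bloom, and the `Page` class and its fields are names invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Page:
    name: str
    queries_covered: int       # distinct sub-questions the page can be retrieved for
    retrievals_per_query: int  # times the page is retrieved per query over the period
    citation_rate: float       # chance of being cited once retrieved

    def expected_citations(self) -> float:
        """Total expected citations = queries x retrievals x per-retrieval rate."""
        return self.queries_covered * self.retrievals_per_query * self.citation_rate

# Hypothetical inputs: the focused page has the higher rate,
# the long guide is a candidate for far more queries.
focused = Page("focused 1,200-word piece", queries_covered=1,
               retrievals_per_query=20, citation_rate=0.30)
guide = Page("10,000-word guide", queries_covered=25,
             retrievals_per_query=20, citation_rate=0.10)

for page in (focused, guide):
    print(f"{page.name}: rate {page.citation_rate:.0%}, "
          f"expected citations {page.expected_citations():.1f}")
```

Under these assumed inputs the focused piece wins on rate while the guide wins on volume, which is the pattern the two studies describe when read together.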

The readability constraint both studies agree on

AirOps analyzed 353,799 pages for readability using Flesch-Kincaid grade level scoring. The citation peak was at grade level 16 to 17, achieving a 35.9% citation rate. This corresponds to college-level writing: complex enough to demonstrate expertise, structured enough to be extractable.

Pages with Flesch Reading Ease scores of 50 or higher appeared more frequently in ChatGPT citations. Scores between 50 and 70 correspond to "fairly difficult" to "standard" reading level on the Flesch scale, which maps to the kind of writing you find in industry publications and research summaries.
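The two readability measures named above are different formulas that run in opposite directions: a higher Flesch Reading Ease score means easier text, while a higher Flesch-Kincaid grade means harder text. The sketch below uses the standard published formulas for both. The syllable counter is a crude vowel-group heuristic added for illustration, so its scores will differ from those of dedicated readability tools, and neither study's exact tooling is known.

```python
import re

def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula (higher = easier)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid grade level formula (higher = harder)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def rough_syllables(word: str) -> int:
    """Crude heuristic: count runs of vowels. Real tools use better rules."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def text_counts(text: str) -> tuple[int, int, int]:
    """Return (words, sentences, syllables) for non-empty English text."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(rough_syllables(w) for w in words)
    return len(words), sentences, syllables

sample = "AI systems prefer precision. They cite specific claims stated clearly."
w, s, syl = text_counts(sample)
print(f"Reading Ease: {flesch_reading_ease(w, s, syl):.1f}")
print(f"Grade level:  {flesch_kincaid_grade(w, s, syl):.1f}")
```

Both formulas depend only on average sentence length and average syllables per word, so shortening sentences or swapping long words for short ones moves both scores at once.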

The finding cuts against the common advice to "write for a 6th-grade reading level" for content discovery. AI systems do not reward oversimplification. They prefer precision: specific terminology used correctly, sources cited explicitly, claims stated with appropriate qualification.

The format that works across both dimensions

A Growth Memo analysis of 21,482 ChatGPT citations found that 58% of cited URLs are cited only once. The top 4.8% of URLs, each cited 10 or more times, were all category-level pages covering multiple intents in a single URL: what the topic is, who uses it, how to choose, and what it costs, all in one place.

This is the format that reconciles both studies. Long-form, multi-intent coverage (which generates high total citation volume) structured as modular H2 sections (each independently extractable for per-query citation) outperforms both pure long-form and pure short-form approaches.

Microsoft documented this directly: strong descriptive headings are "signals that help AI know where a complete idea starts and ends." Each H2 section should be independently understandable without context from surrounding sections. Write each section as if it is the only thing an AI system will read, because for any given query, it may be.
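One way to audit section independence is to split a draft at its H2 headings and flag sections whose opening words lean on earlier context. This is a minimal sketch that assumes the draft is stored as Markdown with `## ` headings. The list of "dependent openers" is a hypothetical heuristic, not a rule from Microsoft or any of the studies cited here.

```python
def split_h2_sections(markdown: str) -> list[tuple[str, str]]:
    """Return (heading, body) pairs for each '## ' section in a Markdown string."""
    sections = []
    heading, lines = None, []
    for line in markdown.splitlines():
        if line.startswith("## "):
            if heading is not None:
                sections.append((heading, "\n".join(lines).strip()))
            heading, lines = line[3:].strip(), []
        elif heading is not None:
            lines.append(line)
    if heading is not None:
        sections.append((heading, "\n".join(lines).strip()))
    return sections

# Hypothetical openers that suggest a section relies on what came before it.
DEPENDENT_OPENERS = ("as mentioned", "as noted above", "this ", "that ", "these ", "it ")

def flag_dependent_sections(markdown: str) -> list[str]:
    """Headings of sections whose first words point back at earlier text."""
    return [
        heading
        for heading, body in split_h2_sections(markdown)
        if body.lower().startswith(DEPENDENT_OPENERS)
    ]

draft = (
    "# Guide\n\nIntro text.\n\n"
    "## What it is\nA citation tracker records which pages AI answers cite.\n\n"
    "## Pricing\nAs mentioned above, it costs more at scale.\n"
)
print(flag_dependent_sections(draft))
```

A flagged section is only a prompt for a human read-through; the check cannot tell whether a section actually makes sense on its own.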

What to implement

If targeting a specific query with a focused piece: 500 to 2,000 words, 7 to 12 subheadings, each section answering one clear sub-question. Prioritise extractability over comprehensiveness.

If building topical authority across a subject area: long-form with each H2 section written as an independently extractable unit. Target Flesch Reading Ease 50 or above. Include a visible publication date and at least one specific number in the first paragraph. Growth Memo found that DATE and NUMBER are the two strongest positive entity signals in a page's first 1,000 characters for AI citation selection.
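A quick pre-publish check for the date and number signals might look like the sketch below. It is a crude regex proxy: Growth Memo's actual entity detection method is not described in this article, and the patterns and function name here are assumptions that cover only two date formats.

```python
import re

MONTHS = (r"(?:January|February|March|April|May|June|July|August|"
          r"September|October|November|December)")
# Matches "April 30, 2025" or ISO-style "2025-04-30"; other formats are ignored.
DATE_PATTERN = re.compile(
    rf"\b{MONTHS}\s+\d{{1,2}},\s+\d{{4}}\b|\b\d{{4}}-\d{{2}}-\d{{2}}\b"
)
# Matches integers, comma-grouped numbers, decimals, and percentages.
NUMBER_PATTERN = re.compile(r"\b\d[\d,]*(?:\.\d+)?%?")

def opening_signals(text: str, window: int = 1000) -> dict[str, bool]:
    """Check the first `window` characters for a date and a standalone number."""
    opening = text[:window]
    # Remove dates first so their digits are not counted as a separate number.
    without_dates = DATE_PATTERN.sub(" ", opening)
    return {
        "has_date": bool(DATE_PATTERN.search(opening)),
        "has_number": bool(NUMBER_PATTERN.search(without_dates)),
    }

print(opening_signals("April 30, 2025. AirOps analyzed 815,484 pages."))
```

The 1,000-character window follows the figure quoted above; dates are stripped before the number check so that a page with a date but no statistic does not pass both tests.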

The mistake to avoid: writing long content without structural clarity. Long, unstructured content produces worse per-query citation rates than short, focused content. Length only delivers its citation volume advantage when the headings create genuine extraction points for AI systems.

Frequently asked questions

What is the ideal content length for AI citations?

It depends on what you are optimising for. For citation rate on a specific query, AirOps analysis of 815,484 pages found 500 to 2,000 words with 7 to 20 subheadings performed best. For total citation volume over time, longer comprehensive content accumulates more citations by covering multiple sub-queries. The most effective approach is long-form content structured as independently extractable sections.

Do longer articles get more AI citations?

In raw citation count, yes. The Digital Bloom found that 10,000-word pages with a Flesch score around 55 received 187 total citations versus 3 for shorter equivalents. But in per-query citation rate, shorter focused pages (500 to 2,000 words) outperform. Long-form content covers more queries, accumulating more total citations across many topics.

What word count performs best in ChatGPT citations?

AirOps analysis of 815,484 pages found that 500 to 2,000 words with 7 to 20 subheadings produced the highest consistent citation rate in ChatGPT. Notably, the study found that long-form "ultimate guide" pages with the highest word counts, most headings, and highest domain authority were among the least reliable performers despite their scale.

Does content readability affect AI citation rates?

Yes. AirOps analysis of 353,799 pages found that Flesch-Kincaid grade level 16 to 17 achieved a 35.9% citation rate, outperforming both simpler and more complex prose. Pages with Flesch Reading Ease scores of 50 or higher appeared more frequently in ChatGPT citations. College-level writing precision rather than simplified content is what AI systems prefer.

About the author

The BetterAISearch team synthesises peer-reviewed studies, platform documentation, and independent research into actionable, scored tactics.