JSON-LD, the format Google recommends for structured data, was read by none of the six major AI crawlers that Writesonic tested. If your AI optimisation strategy is built on schema markup, this study changes the calculation.
What the study tested
In March 2026, Writesonic published a systematic test of 62 webpage elements across six major AI crawlers: GPTBot (OpenAI), ClaudeBot (Anthropic), PerplexityBot, Gemini crawler (Google), Meta-ExternalAgent, and Applebot. For each element, they scored whether the crawler could reliably extract the information. The score was the number of crawlers out of six that read the element accurately.
The results are significantly different from what most AI optimisation advice assumes.
What AI crawlers cannot read
The worst-performing category was page metadata. JSON-LD, which lives in a script tag, scored zero. Open Graph tags, meta descriptions, and Twitter card tags, which live in meta tags, also scored zero. These are among the most commonly recommended technical SEO elements, and none of them appears to be an input that AI crawlers use.
JavaScript-rendered content scored poorly across the board. Pages where the body content is populated by React, Vue, or similar frameworks — and is empty in the HTML source — were largely unreadable. This has significant implications for single-page applications and any site that relies on client-side rendering for its primary content.
| Element type | AI crawler readability (out of 6) | Notes |
|---|---|---|
| Title tag | 5 / 6 | Only metadata element with strong cross-crawler readability |
| Visible body text | 6 / 6 | Primary input for all crawlers; must be in HTML source |
| H1 / H2 / H3 headings | 6 / 6 | Semantic heading structure strongly readable |
| Alt text (images) | 4 / 6 | Varies by crawler; GPTBot and PerplexityBot strongest |
| JSON-LD structured data | 0 / 6 | Not read by any tested crawler |
| Meta description | 0 / 6 | Not extracted by AI crawlers in tests |
| Open Graph tags | 0 / 6 | Not extracted by AI crawlers in tests |
| JS-rendered content | 1 / 6 | Most AI crawlers do not execute JavaScript |
Source: Writesonic, 62 webpage elements, 6 AI crawlers, March 2026 (abridged)
What AI crawlers do read
Visible body text scored six out of six. Every crawler tested could read the text that appears on the page for a human reader. This is the primary content channel for AI retrieval.
Semantic heading structure scored six out of six. H1, H2, and H3 tags were reliably parsed across all crawlers. This confirms what content structure research suggests: clear heading hierarchy is not just a user experience signal, it is a machine-readability signal for AI systems.
The title tag scored five out of six — the only metadata element with meaningful AI crawler readability. Meta-ExternalAgent (Meta's crawler) was the exception. For all other major crawlers, the title tag is the one metadata element you should ensure is accurate and descriptive.
Image alt text scored four out of six. GPTBot and PerplexityBot showed the strongest alt text readability. For image-heavy content, alt text remains a worthwhile signal — but not for all crawlers.
What a proper AI crawlability audit should check
A traditional technical SEO audit focuses on crawl errors, redirect chains, canonical tags, and structured data validation. An AI crawlability audit has a different checklist.
1. Robots.txt permissions for AI crawlers
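The first check is whether your robots.txt is blocking AI crawlers. The robots.txt user-agent tokens for the major AI systems are listed below, each with an explicit Allow rule. Any Disallow rule that matches one of these tokens prevents that system from fetching your content, which removes it from consideration for citations. Note that Google-Extended is a robots.txt control token rather than a separate crawler: it governs whether content fetched by Google's existing crawlers may be used for Gemini.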
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Meta-ExternalAgent
Allow: /
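One way to run this check programmatically is with Python's standard-library `urllib.robotparser`. The sketch below parses a robots.txt body supplied as a string and reports which of the tokens listed above are blocked for a given URL. The `blocked_ai_agents` helper, the sample rules, and the `example.com` URL are illustrative assumptions, not part of the Writesonic study.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the robots.txt list above.
AI_AGENTS = [
    "GPTBot", "ChatGPT-User", "ClaudeBot", "anthropic-ai",
    "PerplexityBot", "Google-Extended", "Meta-ExternalAgent",
]


def blocked_ai_agents(robots_txt, url="https://example.com/"):
    """Return the AI user-agent tokens that this robots.txt body blocks for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_AGENTS if not parser.can_fetch(agent, url)]


# Hypothetical robots.txt that blocks one AI crawler and allows everyone else.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_ai_agents(SAMPLE))  # ['GPTBot']
```

To audit a live site, you could instead call `set_url()` and `read()` on the parser. Keep in mind that `urllib.robotparser` applies its own matching rules, which may differ in edge cases from how an individual crawler interprets the file.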
2. Server-side vs client-side rendering
If your primary content is rendered by JavaScript, check whether AI crawlers can access it. The simplest test is to view the page source (Ctrl+U in Chrome) and search for your key body text. If the text is not present in the raw HTML, most AI crawlers are not reading it.
Next.js, Nuxt, and similar frameworks with server-side rendering (SSR) or static generation (SSG) produce HTML that AI crawlers can read. Client-side-only rendering (CSR) does not.
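The view-source test can be automated. The sketch below takes raw HTML as it would arrive before any JavaScript runs, drops the contents of script and style tags, and reports which key phrases are missing from the remaining text. The class and function names and the two sample pages are hypothetical; fetching the HTML is left out so the example stays self-contained.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text that sits outside <script> and <style> elements."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def missing_from_source(raw_html, phrases):
    """Return the phrases that do not appear in the un-rendered HTML text."""
    extractor = VisibleTextExtractor()
    extractor.feed(raw_html)
    extractor.close()
    # Normalise whitespace and case so line breaks do not cause false misses.
    text = " ".join(" ".join(extractor.chunks).split()).lower()
    return [p for p in phrases if " ".join(p.split()).lower() not in text]


# Hypothetical pages: one client-side rendered, one server-side rendered.
CSR_PAGE = ('<html><body><div id="root"></div>'
            '<script>render("Pricing starts at $10")</script></body></html>')
SSR_PAGE = ('<html><body><h1>Pricing</h1>'
            '<p>Pricing starts at $10</p></body></html>')

print(missing_from_source(CSR_PAGE, ["Pricing starts at $10"]))
print(missing_from_source(SSR_PAGE, ["Pricing starts at $10"]))
```

The first call reports the phrase as missing because it exists only inside a script; the second finds it in the markup. A phrase split across inline tags (for example, half of it inside a `<b>` element) would be reported as missing by this simple approach, so treat a miss as a prompt to look at the source rather than a verdict.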
3. Heading hierarchy and content structure
Each page should have exactly one H1 that describes the primary topic. H2s should represent major subtopics. H3s should represent subdivisions within H2 sections. Heading text should be descriptive and include relevant topic terms — AI crawlers use headings to understand content structure in the same way users do.
An AirOps analysis of 815,484 AI-cited pages found that pages with 7 to 20 subheadings achieved higher citation rates than pages with fewer or more. Over-structuring and under-structuring both reduce citation probability.
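These heading rules lend themselves to a quick automated check. The sketch below counts H1s, counts subheadings, and flags places where the hierarchy skips a level (for example, an H2 followed directly by an H4). It assumes "subheading" means H2 through H6, since the AirOps definition is not given here; the function name and sample markup are hypothetical.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Records the level (1-6) of every heading tag in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def audit_headings(raw_html):
    """Summarise heading structure against the checklist above."""
    collector = HeadingCollector()
    collector.feed(raw_html)
    collector.close()
    levels = collector.levels
    # Assumption: a "subheading" is any heading from H2 down to H6.
    subheadings = sum(1 for level in levels if level >= 2)
    # A skip is any step down the hierarchy of more than one level.
    skipped = [(a, b) for a, b in zip(levels, levels[1:]) if b > a + 1]
    return {
        "h1_count": levels.count(1),
        "subheading_count": subheadings,
        "in_7_to_20_range": 7 <= subheadings <= 20,
        "skipped_levels": skipped,
    }


# Hypothetical page with one H1, three subheadings, and an H2 -> H4 skip.
SAMPLE = "<h1>Guide</h1><h2>Setup</h2><h4>Details</h4><h2>Usage</h2>"
print(audit_headings(SAMPLE))
```

For the sample, the report shows one H1, three subheadings (below the 7 to 20 range), and one skipped level, `(2, 4)`.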
4. Crawl rate and server stability
Some AI crawlers honor the non-standard Crawl-delay directive in robots.txt, and well-behaved crawlers back off when a server returns a Retry-After header, but support varies, so check each operator's documentation. If your server responds slowly or intermittently, crawlers may skip pages or fail to complete a crawl. Check your server logs for GPTBot, ClaudeBot, and PerplexityBot access patterns to confirm they are completing successful crawls.
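A log check along these lines can be sketched in a few lines of Python. The example below assumes access logs in the common "combined" format and tallies response status classes per crawler by matching the bot name in the user-agent field. The regular expression, the function name, and the sample log lines (including their simplified user-agent strings) are illustrative assumptions; adjust the pattern to your server's actual log format.

```python
import re
from collections import Counter, defaultdict

BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")

# Assumes combined log format: "METHOD path protocol" status bytes "referer" "agent"
LINE_RE = re.compile(
    r'"[A-Z]+ \S+ [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def crawl_status_by_bot(log_lines):
    """Count responses per AI crawler, grouped by status class (2xx, 4xx, ...)."""
    counts = defaultdict(Counter)
    for line in log_lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        agent = match.group("agent").lower()
        for bot in BOTS:
            if bot.lower() in agent:
                counts[bot][match.group("status")[0] + "xx"] += 1
    return {bot: dict(counter) for bot, counter in counts.items()}


# Hypothetical log lines with simplified user-agent strings.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Mar/2026:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.5 - - [10/Mar/2026:10:00:02 +0000] "GET /blog HTTP/1.1" 503 0 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '198.51.100.7 - - [10/Mar/2026:10:01:00 +0000] "GET / HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '192.0.2.9 - - [10/Mar/2026:10:02:00 +0000] "GET / HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

print(crawl_status_by_bot(SAMPLE_LOG))
```

A high share of 5xx or 429 responses for a given bot suggests its crawls are not completing. User-agent strings can be spoofed, so matching on the name alone is a first pass rather than proof of who made the request.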
5. Content accessibility — no login walls, no paywalls
AI crawlers do not authenticate. Any content behind a login, paywall, or cookie consent gate that prevents page load is invisible to them. If you want AI systems to cite specific content, that content must be accessible without authentication.
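As a rough first pass, you can classify how a page responds to a request that carries no cookies or credentials. The sketch below is a heuristic under stated assumptions: the status codes treated as gates and the login path hints are hypothetical choices, and the function works on a status code and redirect target you have already collected rather than making the request itself.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumption: these statuses indicate an authentication or payment gate.
GATE_STATUSES = {401, 402, 403}
# Hypothetical path fragments that suggest a redirect to a login or paywall page.
LOGIN_PATH_HINTS = ("login", "signin", "sign-in", "subscribe")


def access_verdict(status: int, redirect_location: Optional[str] = None) -> str:
    """Classify an unauthenticated response as 'open', 'gated', or 'check'."""
    if status in GATE_STATUSES:
        return "gated"
    if 300 <= status < 400 and redirect_location:
        path = urlparse(redirect_location).path.lower()
        if any(hint in path for hint in LOGIN_PATH_HINTS):
            return "gated"
        return "check"  # redirect elsewhere: follow it and re-test
    return "open" if status == 200 else "check"


print(access_verdict(200))
print(access_verdict(302, "https://example.com/login?next=/report"))
print(access_verdict(403))
```

A 200 response is only a necessary condition. A page can return 200 and still hide its content behind a script-driven overlay, so pair this with a check that the body text is present in the raw HTML.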
What schema markup still does
JSON-LD not being read by AI crawlers does not mean schema markup is worthless. It means the value is in a different channel.
Schema markup benefits Google AI Overviews indirectly, because AI Overviews is built on Google Search infrastructure that does read structured data. For AI Overviews specifically, Article and Person schema remain relevant. For ChatGPT, Claude, and Perplexity, the evidence suggests schema is not a direct input to their retrieval systems.
The more productive reframe: visible content signals are the universal AI crawlability layer. Schema markup is a Google-specific amplifier. Both have value; they operate in different channels.
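One practical consequence of this split is that a fact stated only in JSON-LD may never reach crawlers that read visible text alone. The sketch below, offered as one possible check rather than anything the study prescribes, extracts JSON-LD blocks from a page and reports string values under selected keys that do not also appear in the visible text. The key list, class and function names, and the sample page are hypothetical.

```python
import json
from html.parser import HTMLParser


class PageParts(HTMLParser):
    """Separates JSON-LD script bodies from visible text."""

    def __init__(self):
        super().__init__()
        self._mode = None  # "jsonld", "skip", or None for visible text
        self.jsonld_blocks = []
        self.text_chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            script_type = (dict(attrs).get("type") or "").lower()
            if script_type == "application/ld+json":
                self._mode = "jsonld"
                self.jsonld_blocks.append("")
            else:
                self._mode = "skip"
        elif tag == "style":
            self._mode = "skip"

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._mode = None

    def handle_data(self, data):
        if self._mode == "jsonld":
            self.jsonld_blocks[-1] += data
        elif self._mode is None and data.strip():
            self.text_chunks.append(data.strip())


def _string_values(node, keys):
    """Yield string values stored under any of the given keys, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key in keys and isinstance(value, str):
                yield value
            else:
                yield from _string_values(value, keys)
    elif isinstance(node, list):
        for item in node:
            yield from _string_values(item, keys)


def schema_values_missing_from_text(raw_html, keys=("headline", "name")):
    """Return JSON-LD values under `keys` that are absent from the visible text."""
    parts = PageParts()
    parts.feed(raw_html)
    parts.close()
    text = " ".join(parts.text_chunks).lower()
    missing = []
    for block in parts.jsonld_blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # ignore malformed blocks in this sketch
        missing.extend(v for v in _string_values(data, keys) if v.lower() not in text)
    return missing


# Hypothetical page: the headline is visible, the author name is schema-only.
SAMPLE = (
    '<html><head><script type="application/ld+json">'
    '{"@type": "Article", "headline": "AI crawlability audit", '
    '"author": {"@type": "Person", "name": "Jane Doe"}}'
    '</script></head><body><h1>AI crawlability audit</h1>'
    '<p>Body text.</p></body></html>'
)

print(schema_values_missing_from_text(SAMPLE))  # ['Jane Doe']
```

Here the author name lives only in the schema, so adding a visible byline would expose it through both channels.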
The bottom line
AI crawlability is fundamentally different from traditional search crawlability. The technical elements that matter most are the ones visible to human readers: body text, heading structure, title tags, and clean HTML. The metadata layer that traditional SEO builds on (JSON-LD, Open Graph, meta descriptions) did not reach any of the six AI crawlers in the Writesonic test.
Run an AI crawlability audit that starts with what crawlers can actually read, not what traditional SEO tooling tells you to add. The two audits have significant overlap in outcomes — both reward structured, accessible, clearly attributed content — but the checklist is different.
