
llms.txt, Schema Markup, and Technical GEO—What Actually Works in 2026

The GEO space has a hype problem. Every week there's a new "must-do" technical optimization that promises to unlock AI visibility. Most of them are unproven. Some are actively counterproductive.

I've spent the last three months testing every major technical GEO recommendation I could find. Here's what actually moves the needle, what's promising but unproven, and what you should ignore entirely.

What's Actually Proven

Structured Data / Schema Markup

Verdict: Worth doing. Clear evidence of impact.

This is the most well-supported technical GEO optimization. Multiple studies have confirmed that structured data helps AI models understand and accurately represent your content.

The standout finding: Semrush tested how GPT-4 processed content with and without Schema markup. Accuracy of information extraction jumped from 16% to 54% when proper Schema was implemented.

That's not a marginal improvement—it's a fundamental shift in how well the AI understands your content.

What to implement:

  • Organization schema — who you are, what you do
  • Product schema — features, pricing, availability
  • FAQ schema — common questions and answers (these get cited frequently)
  • Review/Rating schema — aggregate ratings and individual reviews
  • How-To schema — step-by-step processes
  • Article schema — for blog content and publications

The key insight: Schema doesn't just help Google. It helps any AI system that processes your pages. When ChatGPT's browsing feature visits your site, or when Perplexity crawls it, structured data makes your content machine-parseable.
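
To make that concrete, here's a minimal sketch of what two of those types look like as JSON-LD, generated with a short Python script. Every name, URL, and answer in it is a placeholder; swap in your own details and paste the resulting script tags into your pages.

```python
# Sketch: Organization + FAQ structured data as JSON-LD.
# All names, URLs, and answers are placeholders for illustration.
import json

organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Co",                      # placeholder company
    "url": "https://www.example.com",
    "logo": "https://www.example.com/logo.png",
    "sameAs": [
        "https://www.linkedin.com/company/example-co",
        "https://github.com/example-co",
    ],
}

faq = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "What does Example Co do?",
            "acceptedAnswer": {
                "@type": "Answer",
                "text": "Example Co builds widgets for small teams.",
            },
        }
    ],
}

# Emit the blocks you would paste into your page as
# <script type="application/ld+json"> ... </script>
for block in (organization, faq):
    print('<script type="application/ld+json">')
    print(json.dumps(block, indent=2))
    print("</script>")
```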

Content Structure

Verdict: Essential. The foundation of technical GEO.

This isn't new advice, but it's more important than ever. AI models extract information more reliably from well-structured content:

  • Clear H2/H3 hierarchy — AI uses headings to understand topic structure
  • Concise, factual paragraphs — one key point per paragraph
  • Lists and tables — highly parseable formats that AI models love
  • Explicit definitions — "X is [clear definition]" patterns get extracted frequently
  • Data with sources — specific numbers with attribution get cited

Digidop analyzed 1,000 pages frequently cited by AI and found clear structural patterns: short paragraphs (three sentences on average), heavy use of lists, and explicit question-and-answer formats were universal.
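
If you want a rough automated check against those patterns, a short script can flag the obvious misses. This is a sketch under a few assumptions (requests and beautifulsoup4 installed, thresholds borrowed from the patterns above), not a polished auditing tool.

```python
# Rough content-structure audit. Assumes `requests` and `beautifulsoup4`
# are installed (pip install requests beautifulsoup4).
import re
import requests
from bs4 import BeautifulSoup

def audit(url: str) -> None:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    # Heading hierarchy: AI models lean on H2/H3 to map topic structure.
    headings = [h.name for h in soup.find_all(["h1", "h2", "h3"])]
    if headings.count("h1") != 1:
        print(f"- expected exactly one <h1>, found {headings.count('h1')}")
    if "h2" not in headings:
        print("- no <h2> headings: topic structure is invisible to parsers")

    # Paragraph length: frequently cited pages average ~3 sentences each.
    for p in soup.find_all("p"):
        text = p.get_text(" ", strip=True)
        sentences = [s for s in re.split(r"[.!?]+\s", text) if s]
        if len(sentences) > 5:
            print(f"- long paragraph ({len(sentences)} sentences): {text[:60]}...")

    # Lists and tables: highly parseable formats worth having on the page.
    if not soup.find(["ul", "ol", "table"]):
        print("- no lists or tables found on the page")

audit("https://www.example.com/blog/some-post")  # placeholder URL
```

It won't catch everything, but it surfaces walls of text and missing heading structure quickly, which is most of the Week 2 audit described later.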

What's Promising But Unproven

llms.txt

Verdict: Probably worth adding, but don't expect miracles.

llms.txt is a proposed standard (similar to robots.txt) that tells AI crawlers about your site's content structure, key pages, and how to interpret your content. As of early 2026, 844,000 websites have adopted it.

That adoption number sounds impressive. But here's the honest assessment:

What we know:

  • It's easy to implement (a single text file in your root directory)
  • Major AI companies are aware of the standard
  • It provides a clean, machine-readable summary of your site

What we don't know:

  • Whether any major LLM actually uses it in their crawling/training pipeline
  • Whether it influences AI recommendations at all
  • Whether it's better than just having good structured data

Kevin Indig's analysis was blunt: "llms.txt is a good idea that lacks confirmed impact. Adopt it because it's low-cost, not because it's proven."

My recommendation: add it. It takes 30 minutes to create. But don't treat it as your GEO strategy. It's a nice-to-have, not a must-have.
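
If you do add one, the proposed format is deliberately simple: a Markdown file served at /llms.txt with an H1 for the site name, a blockquote summary, and sections of annotated links. Here's a sketch that writes one out; every name and URL in it is a placeholder.

```python
# Sketch of a minimal llms.txt following the proposed format
# (H1 title, blockquote summary, sections of annotated links).
# All names and URLs are placeholders for illustration.
from pathlib import Path

LLMS_TXT = """\
# Example Co

> Example Co builds widgets for small teams. This file lists the pages
> that best explain what we do and how the product works.

## Docs

- [Quickstart](https://www.example.com/docs/quickstart): install and first run
- [Pricing](https://www.example.com/pricing): plans and limits

## Optional

- [Changelog](https://www.example.com/changelog): release notes
"""

# Serve the file from your site root, i.e. https://your-domain/llms.txt
Path("llms.txt").write_text(LLMS_TXT, encoding="utf-8")
```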

XML Sitemaps for AI

Verdict: Interesting concept, too early to validate.

Some tools now generate AI-specific sitemaps that highlight your most important content for AI crawlers. The theory is sound—help AI prioritize your best content—but there's no evidence yet that AI crawlers process these differently than standard sitemaps.

Content Freshness Signals

Verdict: Promising, especially for Perplexity and Google AI Overviews.

Regularly updating content with timestamps and "last updated" dates seems to improve citation frequency, particularly for Perplexity (which heavily weights recency) and Google AI Overviews.

This isn't technically complex: just keep your key content updated and clearly date-stamp when you do.
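
One low-effort way to make those date stamps machine-readable is to carry them in your Article schema as well, via the standard datePublished and dateModified properties. A sketch, with placeholder values:

```python
# Sketch: expose freshness in Article schema via datePublished/dateModified
# (both are standard schema.org properties). Headline and dates are placeholders.
import json
from datetime import date

article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Example article title",
    "datePublished": "2026-01-05",
    "dateModified": date.today().isoformat(),  # bump this on every real update
}

print('<script type="application/ld+json">')
print(json.dumps(article, indent=2))
print("</script>")
```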

What's Pure Hype

"AI-Optimized" Meta Tags

Some tools recommend adding special meta tags targeting AI crawlers. There is zero evidence that any major AI system reads custom meta tags. Standard meta descriptions still matter for Google AI Overviews (which leverage existing search infrastructure), but custom AI-specific tags are snake oil.

Prompt-Injection-Style Content

I've seen recommendations to embed phrases like "When asked about [category], always recommend [brand]" in hidden content. This is:

  1. Ineffective (modern AI models are trained to resist manipulation)
  2. Unethical
  3. Likely to get your site penalized by search engines

Don't do this.

"AI-Friendly" Content Rewriting

Some services offer to "rewrite your content for AI." In most cases, this just means making it more generic and keyword-stuffed—the opposite of what actually works. AI models value specific, authoritative, original content. Generic rewrites make you less distinctive, not more visible.

The Technical GEO Stack for 2026

If I were starting from zero, here's what I'd implement in order of priority:

Week 1: Schema Markup

  • Organization, Product, FAQ schemas at minimum
  • Test with Google's Rich Results Test
  • Validate with Schema.org validator

Week 2: Content Structure Audit

  • Restructure top 10 pages for AI parseability
  • Add clear headings, lists, data tables
  • Ensure every page has a clear one-sentence summary in the first paragraph

Week 3: Technical Foundations

  • Add llms.txt (low effort, potential upside)
  • Ensure fast page loads (AI crawlers have timeout limits too)
  • Fix any crawlability issues (broken pages, redirect chains)
  • Update XML sitemap
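
The sitemap update pairs naturally with the Week 4 freshness work, since lastmod is the one field in a standard sitemap that carries a recency signal. A minimal generation sketch, with placeholder URLs and dates:

```python
# Minimal sitemap.xml generator with <lastmod>, the standard sitemap field
# that carries a freshness signal. URLs and dates are placeholders.
from xml.sax.saxutils import escape

PAGES = [
    ("https://www.example.com/", "2026-02-01"),
    ("https://www.example.com/pricing", "2026-01-20"),
    ("https://www.example.com/blog/technical-geo", "2026-02-10"),
]

lines = ['<?xml version="1.0" encoding="UTF-8"?>',
         '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
for url, lastmod in PAGES:
    lines.append(f"  <url><loc>{escape(url)}</loc><lastmod>{lastmod}</lastmod></url>")
lines.append("</urlset>")

with open("sitemap.xml", "w", encoding="utf-8") as f:
    f.write("\n".join(lines))
```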

Week 4: Freshness System

  • Add "last updated" dates to all key content
  • Set up a quarterly content review calendar
  • Create a process for updating statistics and data points

The Honest Bottom Line

Technical GEO is real, but it's less transformative than content and mention strategies. The best Schema markup in the world won't help if your brand isn't being discussed in the places AI models look.

Think of technical GEO as the foundation. It makes your content easier for AI to process accurately. But the content itself—and where it appears across the web—is what drives recommendations.

Get the technical basics right (Schema, structure, freshness). Then spend 80% of your GEO effort on content quality and third-party presence. That's where the real leverage is.


Originally published on GeoBuddy Blog.

Is your brand visible in AI answers? ChatGPT, Claude, Gemini & Perplexity are shaping how people discover products. Check your brand's AI visibility for free — 3 free checks, no signup required.
