{
  "@context": "https://agentflare.org/schema",
  "type": "Article",
  "tier": "L2-full",
  "title": "llms.txt: The Standard for AI-Readable Sites",
  "description": "llms.txt is a proposed, non-binding convention for giving AI crawlers and agents a curated, human-written map of a site’s most important content, typically published at…",
  "canonical": "https://agentflare.org/research/llmstxt-the-standard-for-ai-readable-sites.html",
  "category": "research",
  "updated": "2026-06-15",
  "generated_at": "2026-06-15T01:19:16.018Z",
  "facts": [
    {
      "label": "Topic",
      "value": "discovery"
    },
    {
      "label": "Sources",
      "value": "10"
    },
    {
      "label": "Updated",
      "value": "2026-06-15"
    }
  ],
  "data": {
    "topic": "llms.txt standard for AI crawlers",
    "cluster": "discovery",
    "summary": "llms.txt is a proposed, non-binding convention for giving AI crawlers and agents a curated, human-written map of a site’s most important content, typically published at…"
  },
  "analysis_md": "`llms.txt` is a **proposed, non-binding convention** for giving AI crawlers and agents a curated, human-written map of a site’s most important content, typically published at `/llms.txt`. It is *not* a blocking standard like `robots.txt`, and current evidence suggests its main value is improving retrieval precision and context for agentic tools, not changing search ranking or guaranteeing crawler behavior.[2][4][6]\n\n## What `llms.txt` is\n\nThe idea was proposed by Jeremy Howard in 2024 as a Markdown-based file at the site root that summarizes what a site is about and points to the pages an AI system should prioritize.[1][2][6] In practice, implementations describe it as a concise, curated index of canonical resources, often with short annotations for each link, so models and agents can find the right docs faster than by parsing complex HTML.[1][2][4]\n\nA typical file includes:\n- a site title or H1\n- a short description of the site\n- grouped links to key pages\n- brief descriptions or notes for each link[6]\n\n## What it does and does not do\n\n`llms.txt` is best understood as a **signal file**, not an access-control mechanism.[4][6] It can help compliant AI agents discover the right pages, but it does not force adherence, restrict unauthorized access, or replace `robots.txt`, auth, or paywalls.[4][6]\n\nIt also does **not** appear to be used by major web search ranking systems, and claims of large hallucination reduction or traffic gains should be treated cautiously unless backed by site-specific measurements.[6] For developers, the practical test is whether AI agents that support the convention fetch it and use it to improve page selection and context assembly.[6][7]\n\n## How AI agents use it\n\nFor AI agents, `llms.txt` functions like a **front door**: instead of starting from a broad crawl, an agent can first read a curated set of pages that reflect the publisher’s preferred interpretation of the site.[4][6] That is especially useful for documentation-heavy sites, SaaS products, APIs, and enterprise knowledge bases where canonical pages matter more than exhaustive crawling.[6][7]\n\nSome implementations also pair `llms.txt` with `llms-full.txt`, a broader Markdown bundle intended for agents that want more complete ingest in one request.[6][7] This is most relevant when the downstream use case is agentic retrieval, code generation, or support workflows rather than classic search indexing.[6][7]\n\n## Where HTTP 402 and pay-per-crawl fit\n\n`llms.txt` is about **discovery and preference**; HTTP 402 and pay-per-crawl are about **economic access control**. In a pay-per-crawl model, a crawler may need to authenticate, negotiate payment, or otherwise satisfy server-side policy before content is served, whereas `llms.txt` can only point the agent toward the content and describe preferred usage.[2][4][6]\n\nSo the two concepts are complementary: `llms.txt` can advertise the best entry points and licensing/usage notes, while HTTP 402-style gating can enforce monetization or access terms at the transport layer. If you need hard control, billing, or entitlement checks, rely on server enforcement rather than the text file alone.\n\n## Key takeaways\n\n- `llms.txt` is a **curated, root-level Markdown convention** for helping AI agents find a site’s most important content.[1][2][6]\n- It is **advisory, not enforceable**; it does not block crawlers or replace `robots.txt`, auth, or paywalls.[4][6]\n- Its strongest use case is **agentic retrieval** for docs, APIs, and other structured sites where canonical pages matter.[6][7]\n- **HTTP 402/pay-per-crawl** addresses access and monetization, while `llms.txt` addresses discoverability and guidance.[2][6]",
  "sources": [
    {
      "url": "https://www.linkbuildinghq.com/blog/should-websites-implement-llms-txt-in-2026/"
    },
    {
      "url": "https://www.bluehost.com/blog/what-is-llms-txt/"
    },
    {
      "url": "https://webscraft.org/blog/llmstxt-povniy-gayd-dlya-vebrozrobnikiv-2026?lang=en"
    },
    {
      "url": "https://similar.ai/guides/llms-txt/"
    },
    {
      "url": "https://getmint.ai/resources/llms-txt"
    },
    {
      "url": "https://derivatex.agency/blog/llms-txt-guide/"
    },
    {
      "url": "https://limy.ai/blog/llms.txt-in-2026-the-full-guide"
    },
    {
      "title": "LLMs.txt: Control How AI Crawlers Use Your Content | Royal Plugins",
      "url": "https://royalplugins.com/blog/llms-txt-ai-crawler-control"
    },
    {
      "title": "What Is LLMs.txt? & Do You Need One?",
      "url": "https://neilpatel.com/blog/llms-txt-files-for-seo"
    },
    {
      "title": "Will llms.txt become standardised/ official? If not, could ... - Quora",
      "url": "https://www.quora.com/Will-llms-txt-become-standardised-official-If-not-could-something-similar-appear"
    }
  ],
  "related": [
    {
      "name": "Generative Engine Optimization (GEO): A Primer",
      "url": "https://agentflare.org/research/generative-engine-optimization-geo-a-primer.html"
    },
    {
      "name": "Pay-Per-Crawl and the New Bot Web",
      "url": "https://agentflare.org/research/pay-per-crawl-and-the-new-bot-web.html"
    },
    {
      "name": "HTTP 402 & x402: How AI Agents Pay for Content",
      "url": "https://agentflare.org/research/http-402-x402-how-ai-agents-pay-for-content.html"
    },
    {
      "name": "The AI Agent Economy in 2026",
      "url": "https://agentflare.org/research/the-ai-agent-economy-in-2026.html"
    },
    {
      "name": "Model Context Protocol (MCP) Explained",
      "url": "https://agentflare.org/research/model-context-protocol-mcp-explained.html"
    },
    {
      "name": "Stablecoins as Rails for Autonomous Agents",
      "url": "https://agentflare.org/research/stablecoins-as-rails-for-autonomous-agents.html"
    }
  ],
  "pricing": {
    "price_usd": 0.02,
    "method": "402",
    "endpoint": "https://cdn.aisa.one/api/v1/access/verify",
    "autopay_hint": "set crawlerAutoPrice=true with X-AISA-Crawler-Token",
    "onboarding": "https://cdn.aisa.one/cdn/guide.html"
  },
  "powered_by": "AISA — agent-native search, settlement & delivery (https://aisa.one)"
}