Standard
What is llms.txt?
llms.txt is a proposed plain-text file served at the root of a site (/llms.txt) that gives AI engines a curated, machine-readable index of the site's most citation-worthy pages. Think of it as the inverse of robots.txt: where robots.txt tells crawlers what to skip, llms.txt tells AI engines what to fetch first. PressGEO ships one with every site that uses the platform; you can see ours at pressgeo.com/llms.txt.
Key facts
- Path: served at the root, /llms.txt — never anywhere else.
- Format: plain Markdown with H1 site name, optional summary, then H2 sections grouping curated links.
- Purpose: tells AI engines which pages are most citable, in priority order.
- Companion file: /llms-full.txt holds the full content concatenated for one-fetch ingestion.
- Adoption: early but growing — supported opportunistically by Anthropic, Perplexity, and several indexing pipelines as of mid-2026.
Why does llms.txt exist?
Sitemaps were built for search-engine crawlers. They list every URL on a site, often thousands, with no signal about which pages an AI engine should ingest first. AI engines have finite fetch budgets and a strong preference for high-signal, citation-shaped pages. llms.txt closes that gap with an explicit "start here" list, in priority order, written in Markdown so an LLM can parse it directly.
What does an llms.txt file look like?
The minimal shape is a Markdown file with an H1, an optional one-paragraph summary, and grouped H2 sections of links:
    # PressGEO

    > PressGEO is an AI-citation press release platform.
    > It scores, distributes, and monitors releases for citations
    > across ChatGPT, Gemini, Perplexity, Claude, Copilot, and Grok.

    ## Definitions

    - [What is GEO?](https://pressgeo.com/what-is-geo): The canonical definition.
    - [GEO vs SEO](https://pressgeo.com/geo-vs-seo): Side-by-side comparison.

    ## Live data

    - [Proof](https://pressgeo.com/proof): Live citation rates by engine.
    - [Research](https://pressgeo.com/research): Weekly citation reports.
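Because the file is plain Markdown with a fixed shape, a consumer can read it with very little code. The sketch below is one possible parser, not a reference implementation: the `LlmsTxt` class, the `parse_llms_txt` function, and the link regex are illustrative names and assumptions, and it only handles the shape shown above (H1, blockquote summary, H2 sections of `- [Title](URL): description` links).

```python
import re
from dataclasses import dataclass, field

# Assumed link shape: "- [Title](URL): description" (description optional).
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)


@dataclass
class LlmsTxt:
    """Hypothetical container for a parsed llms.txt file."""
    name: str = ""
    summary: str = ""
    # Maps each H2 heading to its (title, url, description) tuples, in file order.
    sections: dict = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    doc = LlmsTxt()
    summary_lines = []
    current = None  # heading of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif line.startswith("# ") and not doc.name:
            doc.name = line[2:].strip()
        elif line.startswith(">") and current is None:
            # Blockquote lines before the first H2 form the summary.
            summary_lines.append(line[1:].strip())
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                doc.sections[current].append(
                    (match["title"], match["url"], match["desc"] or "")
                )
    doc.summary = " ".join(summary_lines)
    return doc
```

Since Python dictionaries preserve insertion order, iterating `doc.sections` yields the sections in the same priority order the file lists them.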
The companion /llms-full.txt file concatenates the full Markdown of every linked page so an engine can pull the content in one fetch instead of crawling each URL separately.
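One way to build the companion file is to walk the links in llms.txt in order and append each page's Markdown body. The following is a minimal sketch under stated assumptions: `build_llms_full` is a hypothetical helper, the `bodies` mapping (URL to Markdown body) has to come from your own CMS or source files, and the `Source:` line and `---` separator are illustrative choices rather than part of any specification.

```python
import re

# Finds every Markdown link "[Title](URL)" in the index text.
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def build_llms_full(index_text: str, bodies: dict) -> str:
    """Concatenate the Markdown body of every page linked from llms.txt.

    `bodies` maps URL -> Markdown body. How those bodies are obtained
    (CMS export, static-site sources, and so on) is outside this sketch.
    """
    parts = []
    seen = set()
    for title, url in LINK_RE.findall(index_text):
        if url in seen or url not in bodies:
            continue  # skip duplicate links and pages with no body available
        seen.add(url)
        parts.append(f"# {title}\n\nSource: {url}\n\n{bodies[url].strip()}\n")
    # The separator between pages is an assumption, chosen for readability.
    return "\n---\n\n".join(parts)
```

Pages appear in the output in the same order as their links in llms.txt, so the priority ordering of the index carries over to the full file.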
How is llms.txt different from robots.txt and sitemap.xml?
- robots.txt is an exclusion list — tells crawlers what NOT to fetch.
- sitemap.xml is an exhaustive URL list, machine-readable, designed for search engines to enumerate every page.
- llms.txt is a curated, prioritized inclusion list, written in Markdown, designed for AI engines that benefit from a "start here" hint rather than a full index.
All three coexist. Ship llms.txt in addition to your existing robots.txt and sitemap, not instead of them.
How do I ship one this week?
Write a Markdown file with your site name as the H1, a two-sentence summary, then H2 sections grouping your 10–20 most citation-worthy URLs (definitions, data pages, customer stories, FAQ pages). Save it as llms.txt at your web root. Verify it's served as text/plain by requesting yourdomain.com/llms.txt in a browser and checking the Content-Type response header in the developer tools. Optional second step: build a concatenated llms-full.txt by inlining the Markdown bodies of each linked page.
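If your page list already lives in structured form, the steps above can be scripted. This is a hedged sketch, not PressGEO's implementation: `render_llms_txt`, `write_to_web_root`, and `is_plain_text` are hypothetical helpers, and the web root directory is whatever your server actually serves static files from.

```python
from pathlib import Path


def render_llms_txt(name: str, summary: str, sections: dict) -> str:
    """Render an llms.txt index: H1, blockquote summary, then H2 link groups.

    `sections` maps an H2 heading to a list of (title, url, description)
    tuples, already sorted in the priority order you want engines to see.
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, description in links:
            lines.append(f"- [{title}]({url}): {description}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


def write_to_web_root(text: str, web_root: str) -> Path:
    """Save the rendered text as llms.txt directly under the web root."""
    target = Path(web_root) / "llms.txt"
    target.write_text(text, encoding="utf-8")
    return target


def is_plain_text(content_type: str) -> bool:
    """True if a Content-Type header value names text/plain (parameters ignored)."""
    return content_type.split(";")[0].strip().lower() == "text/plain"
```

A typical run would call `render_llms_txt(...)`, pass the result to `write_to_web_root(text, "public")` (the directory name is an assumption), and then feed the Content-Type header your server returns for /llms.txt into `is_plain_text` to confirm the serving step.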
PressGEO does this automatically for every site on the platform — view our own at /llms.txt and /llms-full.txt as reference shapes.
“llms.txt is the cheapest GEO tactic available right now. The file takes 20 minutes to write, costs nothing to serve, and gives early-adopter sites a visible head start in AI retrieval pipelines that are otherwise sitemap-blind.”
Frequently asked questions
- What is llms.txt?
- llms.txt is a proposed plain-text file served at the root of a site (/llms.txt) that gives AI engines a curated, machine-readable index of the site's most citation-worthy pages, with short descriptions and links to a longer /llms-full.txt that contains the actual content. It is to AI engines what robots.txt is to crawlers: a discovery hint.
- How is llms.txt different from robots.txt?
- robots.txt tells crawlers what NOT to fetch. llms.txt tells AI engines what to fetch FIRST. They serve opposite purposes: robots is an exclusion list; llms is an inclusion / prioritization list specifically aimed at AI training and retrieval pipelines.
- Do AI engines actually read llms.txt today?
- Adoption is early. As of mid-2026, several AI startups and indexing pipelines fetch llms.txt opportunistically, and a growing number of crawlers respect the prioritization hints. It is not yet a formal standard, but the cost of shipping is near zero and the upside is meaningful: early adopters surface in citation flows that ignore traditional sitemaps.
- Where should llms.txt live?
- Always at the site root: https://yourdomain.com/llms.txt. AI engines check the well-known root path; a file at any other path is invisible to them.
- What's the difference between llms.txt and llms-full.txt?
- llms.txt is a short index (titles, links, one-line descriptions). llms-full.txt is the full content — typically a concatenated Markdown dump of every page listed in llms.txt — so an AI engine can ingest the substantive content in one fetch.
Last updated: June 5, 2026 · By PressGEO Research