llms.txt explained: what it is, why it matters, and how to get one
26 June 2026 · 7 min read
If you've been following GEO (Generative Engine Optimisation), you've probably encountered llms.txt. It's a relatively new convention, not yet a formal standard but gaining rapid adoption, that gives AI engines a structured, readable index of your site's content. This guide explains what it is, how it works, and how to create one for your WordPress site.
What is llms.txt?
llms.txt is a plain text file placed at the root of your website (so it's accessible at yourdomain.com/llms.txt). It lists your key pages with titles and short descriptions, formatted in a way that language models can parse easily.
The format is simple Markdown:
# Your Site Name
> Short description of what your site is about and who it's for.
## Key Pages
- [Page Title](https://yourdomain.com/page-url): Brief description of what this page covers
- [Another Page](https://yourdomain.com/another-page): Brief description
That's the core of it. The idea is borrowed conceptually from robots.txt — a simple, convention-based file that signals something useful to automated systems — but applied specifically to language models and AI crawlers.
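To make "easy to parse" concrete, here is a minimal Python sketch of how a consumer might read the format shown above. The function name `parse_llms_txt` and the returned dictionary layout are assumptions made for illustration; no particular AI crawler is known to parse the file this way.

```python
import re

# Matches "- [Title](URL)" with an optional description after ":" or a dash.
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)\s*"
    r"(?:[:\-\u2013\u2014]\s*(?P<desc>.*))?$"
)


def parse_llms_txt(text):
    """Parse llms.txt text into a site name, a summary, and link sections."""
    result = {"name": None, "summary": [], "sections": {}}
    current = None  # heading of the section we are currently inside
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# ") and result["name"] is None:
            result["name"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            result["sections"][current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                result["sections"][current].append({
                    "title": match["title"],
                    "url": match["url"],
                    "desc": match["desc"] or "",
                })
        else:
            # Text before the first section is treated as the site summary.
            result["summary"].append(line.lstrip("> ").strip())
    result["summary"] = " ".join(result["summary"])
    return result
```

Calling `parse_llms_txt` on the example above would give the site name, the one-line description, and a "Key Pages" list of title, URL, and description entries.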
Why does llms.txt matter?
AI engines have to make decisions about which content is relevant and authoritative for a given query. One of the ways they build context about a site is by crawling and indexing its content. But crawling is expensive and imperfect — crawlers may miss pages, misunderstand site structure, or fail to identify the most important content.
llms.txt gives AI crawlers a direct answer to “what is on this site and what is most important?” Instead of inferring your content map from links and sitemaps, an AI crawler can read your llms.txt and get a clear, curated picture in seconds.
For sites with substantial content libraries, this is valuable: you get to direct AI attention to your most important, most citable pages. For sites with content that is hard to crawl (behind login, dynamically loaded, etc.), llms.txt provides an alternative discovery path.
What is llms-full.txt?
llms-full.txt is a companion file that goes beyond the index. Where llms.txt provides a page-level map, llms-full.txt provides the actual content of your pages in a clean, readable format — no HTML, no navigation, no ads. Just headings and prose.
This is particularly useful for:
- Sites where some content is behind authentication
- Sites where JavaScript rendering makes crawling difficult
- Sites that want to provide AI engines with a single, high-quality document to ingest
- Ensuring your exact content (not a garbled crawl interpretation) is what AI engines retrieve
Some AI systems are beginning to specifically look for llms-full.txt as a way to efficiently ingest complete site content without crawling every individual page.
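As a rough illustration of "no HTML, no navigation, just headings and prose", the sketch below assembles an llms-full.txt style document from rendered page HTML using only Python's standard library. There is no single agreed layout for llms-full.txt, so the `Source:` line, the heading levels, the set of skipped tags, and the names `TextExtractor` and `build_llms_full` are all assumptions.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect heading, paragraph, and list-item text; skip page chrome."""

    SKIP = {"script", "style", "nav", "footer", "aside"}
    HEADINGS = {"h1": "### ", "h2": "### ", "h3": "#### "}
    BLOCKS = {"p", "li", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.lines = []
        self._buf = []
        self._skip_depth = 0
        self._prefix = ""

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BLOCKS:
            # Start a fresh buffer for this block element.
            self._buf = []
            self._prefix = self.HEADINGS.get(tag, "- " if tag == "li" else "")

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in self.BLOCKS and self._skip_depth == 0:
            # Collapse runs of whitespace left over from the markup.
            text = " ".join("".join(self._buf).split())
            if text:
                self.lines.append(self._prefix + text)
            self._buf = []

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._buf.append(data)


def build_llms_full(site_name, pages):
    """pages: iterable of (title, url, html) tuples. Returns the file text."""
    parts = [f"# {site_name}", ""]
    for title, url, html in pages:
        extractor = TextExtractor()
        extractor.feed(html)
        extractor.close()
        parts += [f"## {title}", f"Source: {url}", "", *extractor.lines, ""]
    return "\n".join(parts)
```

A real site would want more careful extraction (tables, code blocks, nested lists), but the shape is the same: one section per page, with the chrome stripped out.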
Does llms.txt work? Is there evidence?
The honest answer is: there is correlational evidence but limited controlled study. The convention is relatively new and systematic research is sparse. What we can say:
- Several major sites and tools have adopted llms.txt, including Anthropic, which publishes an llms.txt for its developer documentation
- AI crawlers do fetch llms.txt when it exists, based on server log data from multiple sources
- The theoretical rationale is sound: giving LLMs a clean, structured content index should improve their ability to retrieve relevant content from your site
- The cost of implementation is low, making the risk/reward ratio favourable even without definitive proof
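If you want to check the server-log point against your own site, a small script can count requests for the two files. This is a sketch that assumes your access log uses the common "combined" format written by Apache and nginx; the function name `count_llms_fetches` is made up for this example, and so is the `ExampleBot/1.0` user agent mentioned below.

```python
import re
from collections import Counter

# Combined Log Format tail: "METHOD path PROTO" status size "referer" "agent"
LOG_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

TARGETS = ("/llms.txt", "/llms-full.txt")


def count_llms_fetches(log_lines):
    """Count requests for llms.txt and llms-full.txt per (path, user agent)."""
    counts = Counter()
    for line in log_lines:
        match = LOG_RE.search(line)
        if not match:
            continue
        path = match["path"].split("?")[0]  # ignore any query string
        if path in TARGETS:
            counts[(path, match["agent"])] += 1
    return counts
```

Feeding it the lines of an access log returns a counter keyed by file and user agent, for example `("/llms.txt", "ExampleBot/1.0")`. A fetch shows that something requested the file; it does not show how the content was used afterwards.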
The convention was proposed by Jeremy Howard (fast.ai) in 2024 and has been adopted rapidly across the developer and AI community. Whether it becomes a formal standard remains to be seen, but adoption so far points that way.
How to create an llms.txt for WordPress
Manual approach
Create a file called llms.txt in your WordPress site root (the same directory as wp-config.php) with the following structure:
# Your Site Name
> Brief description of your site.
## Key Pages
- [Home](https://yourdomain.com/): Overview of what you offer
- [About](https://yourdomain.com/about): Who you are and what you do
- [Most Important Post](https://yourdomain.com/key-post): What this covers
...
The limitation of this approach: it's a static file that needs manual updating every time you add important content. For large sites or frequent publishers, this is impractical.
Automated approach with ForGEO
ForGEO generates and maintains both llms.txt and llms-full.txt automatically. When you run an optimisation batch, ForGEO:
- Reads your posts and pages
- Generates concise, accurate descriptions for each
- Assembles them into a correctly formatted llms.txt
- Writes the file to your site root
- Generates llms-full.txt with clean content from each post
Both files are regenerated automatically on subsequent runs, keeping them in sync with your content library without any manual maintenance.
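For readers curious about what the "assemble" step involves in general, here is a generic Python sketch that renders an llms.txt from a list of pages. It is not ForGEO's code and makes no claim about how ForGEO works internally; the `Page` class and `render_llms_txt` function are hypothetical names, and the summary is written as a Markdown blockquote, which is how the llms.txt proposal describes it.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    url: str
    description: str


def render_llms_txt(site_name, summary, sections):
    """Render llms.txt text.

    sections maps a section heading (e.g. "Key Pages") to a list of Page.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for page in pages:
            # Keep each description on one line.
            desc = " ".join(page.description.split())
            entry = f"- [{page.title}]({page.url})"
            lines.append(f"{entry}: {desc}" if desc else entry)
        lines.append("")
    # End the file with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"
```

The returned string can then be written to the site root, for example with `pathlib.Path("llms.txt").write_text(text, encoding="utf-8")`. Where the page titles and descriptions come from (a database query, an export, a hand-kept list) is left open here.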
Should you include everything in llms.txt?
Not necessarily. llms.txt is a curated index, not a sitemap. The guidance from the llms.txt convention is to include:
- Your most important and authoritative pages
- Pages that directly answer queries in your niche
- Key resources, guides, and tool pages
You don't need to include every blog post, tag page, or archive. The goal is to help AI engines understand what is most worth their attention on your site.
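One simple way to apply this when starting from a full list of URLs is to drop archive-style pages by path before curating by hand. The sketch below assumes WordPress's default permalink bases (`/tag/`, `/category/`, `/author/`) and `/page/` for pagination; the segment list and the function names are assumptions, so adjust them to your own permalink settings.

```python
from urllib.parse import urlparse

# Path segments that usually mark archive-style pages rather than primary
# content. This list is an assumption; change it to match your permalinks.
EXCLUDED_SEGMENTS = ("/tag/", "/category/", "/author/", "/page/")


def is_candidate(url):
    """Return True if the URL does not look like a tag, archive, or paged view."""
    path = urlparse(url).path
    if not path.endswith("/"):
        path += "/"
    return not any(segment in path for segment in EXCLUDED_SEGMENTS)


def select_candidates(urls):
    """Keep only URLs worth considering for llms.txt, preserving order."""
    return [url for url in urls if is_candidate(url)]
```

This only removes the obvious noise. Deciding which of the remaining pages are your most important and authoritative is still an editorial call.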
llms.txt and robots.txt: are they related?
Conceptually yes, practically no. Both are plain text files in your site root that provide instructions or information to automated systems. But they serve different purposes:
- robots.txt controls which crawlers can access which parts of your site
- llms.txt provides a content map for LLMs that have already been allowed to crawl
You need both. A robots.txt that blocks AI crawlers makes your llms.txt irrelevant. And a robots.txt that allows AI crawlers without an llms.txt misses the opportunity to guide their indexing.
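To check that the two files are not working against each other, you can test your robots.txt rules against the llms.txt URL with Python's standard `urllib.robotparser`. The function name `agents_blocked_from_llms_txt` is made up for this sketch, and the user-agent tokens in the usage note are examples only; confirm the current tokens in each vendor's crawler documentation.

```python
from urllib.robotparser import RobotFileParser


def agents_blocked_from_llms_txt(robots_txt, site, agents):
    """Return the user agents that robots.txt forbids from fetching /llms.txt.

    robots_txt is the file's text, site is the origin (e.g.
    "https://yourdomain.com"), and agents is a list of user-agent tokens.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    target = site.rstrip("/") + "/llms.txt"
    return [agent for agent in agents if not parser.can_fetch(agent, target)]
```

For example, passing a robots.txt that disallows everything for `GPTBot` but allows all other agents, together with the list `["GPTBot", "ClaudeBot"]`, would report only `GPTBot` as blocked. Python's parser may differ in edge cases from how a given crawler interprets robots.txt, so treat the result as a sanity check.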