llms.txt vs robots.txt — what's the difference?
Both files live at the root of your domain. Both are plain text. Both talk to automated crawlers. But they serve completely different purposes — and confusing the two can leave your content either invisible or unprotected.
Paul Lovell
SEO Consultant
What robots.txt does
robots.txt has been around since 1994. It's a set of instructions for search engine crawlers, telling them which URLs they may crawl and which to stay away from. It governs crawling, not indexing: a disallowed URL can still show up in search results if other pages link to it.
A typical robots.txt might look like this:
User-agent: *
Disallow: /thank-you/
Disallow: /admin/
Allow: /
Sitemap: https://yoursite.com/sitemap.xml
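The key point: robots.txt is about access control for crawlers. It tells them what they're not allowed to touch, and well-behaved crawlers comply, but the file is a request rather than an enforcement mechanism. It doesn't describe what your content is; it just sets boundaries.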
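To see how a compliant crawler would interpret those rules, here is a minimal sketch using Python's standard-library `urllib.robotparser`. It parses the example file from a string rather than fetching it, and the helper name `is_allowed` is an illustrative choice, not part of any crawler's real code. Real crawlers may resolve conflicting rules differently (for example by longest match), so treat this as an approximation.

```python
from urllib.robotparser import RobotFileParser

# The example rules from this article, parsed from a string instead of fetched over HTTP.
ROBOTS_TXT = """\
User-agent: *
Disallow: /thank-you/
Disallow: /admin/
Allow: /
Sitemap: https://yoursite.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(url: str, user_agent: str = "*") -> bool:
    """Return True if the parsed rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


print(is_allowed("https://yoursite.com/pricing"))         # True
print(is_allowed("https://yoursite.com/admin/settings"))  # False
```

Note that the check only answers "may I fetch this?". Nothing in the file stops a crawler that chooses to ignore it.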
What llms.txt does
llms.txt is a much newer concept. Where robots.txt is restrictive — telling crawlers what to avoid — llms.txt is inviting. It's a curated guide to your most important content, written specifically for AI language models.
Rather than controlling access, it provides context. An AI tool reading your llms.txt understands what your site is about, who it's for, and which pages are most relevant to reference when answering questions. The file is written in Markdown. A simple example:
# Acme Corp
> B2B software for operations teams — product docs, blog, and resources.
## Pages
- [Product Overview](https://acme.com/product)
- [Pricing](https://acme.com/pricing)
## Blog Posts
- [Getting started with Acme](https://acme.com/blog/getting-started)
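Because the file is plain Markdown with a predictable shape (one title, one summary line, then sections of links), a tool can read it with very little code. The sketch below is one hypothetical way to do that; `parse_llms_txt` and its return structure are assumptions made for illustration, not how any particular AI tool works. The sample text is adapted from the Acme example.

```python
import re

# Sample adapted from the Acme example in this article.
LLMS_TXT = """\
# Acme Corp
> B2B software for operations teams: product docs, blog, and resources.
## Pages
- [Product Overview](https://acme.com/product)
- [Pricing](https://acme.com/pricing)
## Blog Posts
- [Getting started with Acme](https://acme.com/blog/getting-started)
"""

# Matches a Markdown list item of the form "- [Title](https://example.com/path)".
LINK_RE = re.compile(r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)")


def parse_llms_txt(text: str) -> dict:
    """Split llms.txt into a title, a summary, and named sections of (title, url) links."""
    result = {"title": None, "summary": None, "sections": {}}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            # A second-level heading opens a new section.
            current = line[3:].strip()
            result["sections"][current] = []
        elif line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif line.startswith(">") and result["summary"] is None:
            result["summary"] = line[1:].strip()
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                result["sections"][current].append((match["title"], match["url"]))
    return result


parsed = parse_llms_txt(LLMS_TXT)
print(parsed["title"])               # Acme Corp
print(list(parsed["sections"]))      # ['Pages', 'Blog Posts']
```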
Key differences side by side
| | robots.txt | llms.txt |
|---|---|---|
| Purpose | Access control | Content discovery |
| Audience | Search engine crawlers | AI language models |
| Approach | Restrictive (what to avoid) | Inviting (what to read) |
| Around since | 1994 | 2024 |
| Industry standard | ✅ Widely supported | Growing adoption |
| Required? | Best practice | Optional but recommended |
| Content | Rules and directives | Links and descriptions |
Do they interact with each other?
Not directly. They're separate files, read by different systems, and neither refers to the other. But there's an important consideration: if you disallow a page in robots.txt, compliant crawlers won't fetch it. If you then list that same page in llms.txt, AI tools that honor robots.txt might try to access it and find it blocked, or reference a URL whose content they were never able to read.
As a rule: only include pages in llms.txt that are publicly accessible and that you'd want search engines and AI tools to find. If a page is in your robots.txt Disallow list, leave it out of llms.txt too.
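This rule is easy to check mechanically. The sketch below, again using the standard-library `urllib.robotparser`, lists any llms.txt links that the robots.txt rules disallow. The function name `blocked_llms_links`, the sample file contents, and the acme.com URLs are all hypothetical.

```python
import re
from urllib.robotparser import RobotFileParser


def blocked_llms_links(robots_txt: str, llms_txt: str, user_agent: str = "*") -> list[str]:
    """Return URLs listed in llms_txt that robots_txt disallows for user_agent."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    # Pull the target of every Markdown link: [text](https://...)
    urls = re.findall(r"\]\((https?://[^)\s]+)\)", llms_txt)
    return [url for url in urls if not rules.can_fetch(user_agent, url)]


# Hypothetical inputs: the llms.txt below mistakenly lists a disallowed page.
robots = """\
User-agent: *
Disallow: /thank-you/
Disallow: /admin/
Allow: /
"""
llms = """\
# Acme Corp
## Pages
- [Pricing](https://acme.com/pricing)
- [Thanks](https://acme.com/thank-you/)
"""

print(blocked_llms_links(robots, llms))  # ['https://acme.com/thank-you/']
```

An empty list means the two files are consistent. Anything returned is a link to remove from llms.txt or a rule to reconsider in robots.txt.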
Do you need both?
Yes — they serve different purposes and neither replaces the other.
- robots.txt keeps crawlers away from pages you don't want crawled (thank-you pages, admin areas, duplicate content)
- llms.txt promotes the content you do want AI tools to find and reference
- sitemap.xml (the third file in this family) gives search engines a complete URL inventory for crawling
Think of them as complementary: robots.txt sets the boundaries, sitemap.xml maps the territory, and llms.txt curates the highlights for AI.
What about HubSpot specifically?
HubSpot manages robots.txt automatically — you can customise it under Settings → Website → Domains & URLs → robots.txt, but HubSpot handles the fundamentals for you.
llms.txt is different. HubSpot has no native support for it, so you need to either create and host one manually, or use an app. We cover both options in detail here: How to add llms.txt to your HubSpot website →
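If you go the manual route, the file itself is simple to produce. As a rough sketch, assuming you already have a list of page titles and URLs from somewhere (an export, a spreadsheet, or typed by hand), the hypothetical helper below renders them as llms.txt text. It does not call any HubSpot API, and hosting the resulting file is a separate step not shown here.

```python
def render_llms_txt(title: str, summary: str, sections: dict[str, list[tuple[str, str]]]) -> str:
    """Render a title, summary, and sections of (name, url) links as llms.txt Markdown."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        lines.extend(f"- [{name}]({url})" for name, url in links)
        lines.append("")
    # Trim trailing blank lines and end with exactly one newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical page list; replace with your own public pages.
output = render_llms_txt(
    "Acme Corp",
    "B2B software for operations teams",
    {
        "Pages": [
            ("Product Overview", "https://acme.com/product"),
            ("Pricing", "https://acme.com/pricing"),
        ],
        "Blog Posts": [
            ("Getting started with Acme", "https://acme.com/blog/getting-started"),
        ],
    },
)
print(output)
```

A hand-maintained file like this goes stale whenever pages are added or removed, so regenerate it whenever your content changes.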