llms.txt · 21 May 2026 · 4 min read

llms.txt vs robots.txt — what's the difference?

Both files live at the root of your domain. Both are plain text. Both talk to automated crawlers. But they serve completely different purposes — and confusing the two can leave your content either invisible or unprotected.

Paul Lovell

SEO Consultant

What robots.txt does

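robots.txt has been around since 1994. It's a set of instructions for web crawlers, most familiarly search engine bots, telling them which paths they may crawl and which to stay away from. Strictly speaking it governs crawling rather than indexing: a disallowed URL can still appear in search results if other pages link to it.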

A typical robots.txt might look like this:

User-agent: *
Disallow: /thank-you/
Disallow: /admin/
Allow: /
Sitemap: https://yoursite.com/sitemap.xml

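The key point: robots.txt is about crawl access. It tells crawlers which paths are off limits. It doesn't describe what your content is; it just sets boundaries. Those boundaries are advisory: well-behaved crawlers honour them, but the file is not a security mechanism and does not stop anyone from requesting a URL directly.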
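To see how a compliant crawler might interpret the rules above, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The crawler name `ExampleBot` is made up for illustration, and real crawlers can differ in how they resolve overlapping Allow and Disallow rules, so treat the result as indicative rather than definitive.

```python
from urllib.robotparser import RobotFileParser

# The example rules from above, as they would appear in the file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /thank-you/
Disallow: /admin/
Allow: /
Sitemap: https://yoursite.com/sitemap.xml
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text directly, without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is a hypothetical crawler name; it falls under the "*" group.
for url in ("https://yoursite.com/pricing", "https://yoursite.com/admin/login"):
    verdict = "allowed" if parser.can_fetch("ExampleBot", url) else "disallowed"
    print(f"{url}: {verdict}")
```

Under these rules the pricing page is reported as allowed and the admin URL as disallowed. Nothing here enforces that outcome; it only shows what a crawler that follows the file would decide.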

What llms.txt does

llms.txt is a much newer concept. Where robots.txt is restrictive — telling crawlers what to avoid — llms.txt is inviting. It's a curated guide to your most important content, written specifically for AI language models.

Rather than controlling access, it provides context. An AI tool that reads your llms.txt gets a summary of what your site is about, who it's for, and which pages are most relevant to reference when answering questions.

A minimal llms.txt might look like this:

# Acme Corp

> B2B software for operations teams — product docs, blog, and resources.

## Pages

- [Product Overview](https://acme.com/product)

- [Pricing](https://acme.com/pricing)

## Blog Posts

- [Getting started with Acme](https://acme.com/blog/getting-started)
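Because llms.txt is plain Markdown, a consuming tool can read it with very little code. The sketch below is one possible way to pull out the title, summary, and linked pages. It assumes the simple shape of the sample above (one `#` title, one `>` summary, `##` sections with link bullets); the `LlmsTxt` class and `parse_llms_txt` function are illustrative names, not part of any official tooling, and the sample text is modelled on the example rather than copied verbatim.

```python
import re
from dataclasses import dataclass, field

# Sample modelled on the Acme Corp example above.
LLMS_TXT = """\
# Acme Corp

> B2B software for operations teams: product docs, blog, and resources.

## Pages

- [Product Overview](https://acme.com/product)
- [Pricing](https://acme.com/pricing)

## Blog Posts

- [Getting started with Acme](https://acme.com/blog/getting-started)
"""

# Matches a bullet of the form "- [Label](https://example.com/path)".
LINK_RE = re.compile(r"^-\s*\[(?P<label>[^\]]+)\]\((?P<url>[^)\s]+)\)")


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    # Section heading -> list of (label, url) pairs, in file order.
    sections: dict[str, list[tuple[str, str]]] = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    """Parse the simple llms.txt shape: title, summary, sections of links."""
    doc = LlmsTxt()
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith(">") and not doc.summary:
            doc.summary = line[1:].strip()
        elif current is not None and (match := LINK_RE.match(line)):
            doc.sections[current].append((match["label"], match["url"]))
    return doc


doc = parse_llms_txt(LLMS_TXT)
print(doc.title)
for heading, links in doc.sections.items():
    print(heading, [url for _, url in links])
```

Lines that don't fit the expected shape are skipped, which keeps the sketch tolerant of extra prose in a real file.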

Key differences side by side

| | robots.txt | llms.txt |
|---|---|---|
| Purpose | Access control | Content discovery |
| Audience | Search engine crawlers | AI language models |
| Approach | Restrictive (what to avoid) | Inviting (what to read) |
| Around since | 1994 | 2024 |
| Industry standard | ✅ Widely supported | Growing adoption |
| Required? | Best practice | Optional but recommended |
| Content | Rules and directives | Links and descriptions |

Do they interact with each other?

Not directly: they're separate files with separate jobs, and a tool can read one without the other. But there's an important consideration: if you disallow a page in robots.txt, compliant crawlers won't fetch it, so search engines can't read its content. If you then list that same page in llms.txt, AI tools might try to access it and find it blocked, or reference a URL whose content they were never able to read.

As a rule: only include pages in llms.txt that are publicly accessible and that you'd want search engines and AI tools to find. If a page is in your robots.txt Disallow list, leave it out of llms.txt too.
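The rule above is easy to check automatically. The following sketch, again using `urllib.robotparser`, lists any llms.txt links that robots.txt disallows. The function name `blocked_llms_links` and the sample inputs are hypothetical, and the check assumes every link lives on the same host as the robots.txt being tested, since robots.txt rules apply per host.

```python
import re
from urllib.robotparser import RobotFileParser

# Captures the URL part of Markdown links such as [Pricing](https://acme.com/pricing).
MD_LINK_RE = re.compile(r"\[[^\]]+\]\((https?://[^)\s]+)\)")


def blocked_llms_links(robots_text: str, llms_text: str, user_agent: str = "*") -> list[str]:
    """Return URLs listed in llms.txt that robots.txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    urls = MD_LINK_RE.findall(llms_text)
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical inputs: the thank-you page is disallowed but still listed.
robots_text = """\
User-agent: *
Disallow: /thank-you/
Disallow: /admin/
"""

llms_text = """\
# Acme Corp

## Pages

- [Pricing](https://acme.com/pricing)
- [Thank you](https://acme.com/thank-you/)
"""

print(blocked_llms_links(robots_text, llms_text))
# prints ['https://acme.com/thank-you/']
```

Any URL the function returns is a candidate for removal from llms.txt, or a sign that the Disallow rule is broader than intended.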

Do you need both?

Yes — they serve different purposes and neither replaces the other.

- robots.txt keeps crawlers away from pages you don't want crawled (thank-you pages, admin areas, duplicate content)
- llms.txt promotes the content you do want AI tools to find and reference
- sitemap.xml (the third file in this family) gives search engines a complete URL inventory for crawling

Think of them as complementary: robots.txt sets the boundaries, sitemap.xml maps the territory, and llms.txt curates the highlights for AI.

What about HubSpot specifically?

HubSpot manages robots.txt automatically — you can customise it under Settings → Website → Domains & URLs → robots.txt, but HubSpot handles the fundamentals for you.

llms.txt is different. HubSpot has no native support for it, so you need to either create and host one manually, or use an app. We cover both options in detail here: How to add llms.txt to your HubSpot website →
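For the manual route, the file itself is simple enough to generate from a short list of pages. The sketch below is a generic illustration of that idea, not HubSpot's API and not how any particular app works: `render_llms_txt` is a hypothetical helper, and the page list is hard-coded where a real setup would pull it from your CMS.

```python
def render_llms_txt(
    title: str,
    summary: str,
    sections: dict[str, list[tuple[str, str]]],
) -> str:
    """Render a minimal llms.txt: title, summary, then one link list per section."""
    lines = [f"# {title}", "", f"> {summary}"]
    for heading, links in sections.items():
        lines += ["", f"## {heading}", ""]
        lines += [f"- [{label}]({url})" for label, url in links]
    return "\n".join(lines) + "\n"


# Hypothetical content, reusing the Acme Corp pages from the earlier sample.
llms_txt = render_llms_txt(
    title="Acme Corp",
    summary="B2B software for operations teams.",
    sections={
        "Pages": [
            ("Product Overview", "https://acme.com/product"),
            ("Pricing", "https://acme.com/pricing"),
        ],
        "Blog Posts": [
            ("Getting started with Acme", "https://acme.com/blog/getting-started"),
        ],
    },
)
print(llms_txt)
```

The resulting text still has to be uploaded and served at the root of your domain as /llms.txt, and regenerated whenever the listed pages change.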
