What is an llms.txt File?

3 min read

Learn about llms.txt files, a proposed standard for giving language models a concise, Markdown-formatted guide to the most useful content on your website.

What is an llms.txt File?

An llms.txt file is a plain-text Markdown file that a website publishes to help Large Language Models (LLMs) understand and use its content. It sits alongside other well-known root-level files such as robots.txt, but its purpose is different: rather than telling crawlers what they may access, it gives language models a short summary of the site and links to the pages most worth reading.

Why Use an llms.txt File?

Web pages are built for people. They carry navigation, scripts, and styling that a language model has to wade through, and a whole site rarely fits into a model's context window. An llms.txt file gives LLMs, AI agents, and the developers building on them a curated starting point:

  • A title and a short summary of what the site or project is about.
  • Links to the most important pages, grouped into named sections.
  • Optional notes that explain what each linked page contains.

With this guide in place, AI tools are more likely to answer questions about your project from the right pages instead of guessing from cluttered HTML.

How Does It Work?

An llms.txt file is placed at the root of your domain, typically accessible through a URL like https://example.com/llms.txt. Language models, AI agents, and developers can fetch this single file to get an overview of the site and to find links to more detailed content. Because the location and layout are predictable, tools can look for the file without any site-specific configuration.
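
Because the file lives at a predictable root-level path, a tool can work out where to look from any page URL on the site. The following is a minimal sketch of that idea in Python; the helper name llms_txt_url is hypothetical and is not part of any llms.txt tooling.

```python
from urllib.parse import urlsplit, urlunsplit


def llms_txt_url(page_url: str) -> str:
    """Return the root-level llms.txt URL for the site that hosts page_url.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {page_url!r}")
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))


print(llms_txt_url("https://example.com/docs/getting-started?ref=nav"))
# prints https://example.com/llms.txt
```

Fetching the resulting URL is then an ordinary HTTP GET. A site that has not published the file will normally answer with a 404, so a real client should treat the file as optional.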

Format of an llms.txt File

According to llmstxt.org, an llms.txt file uses Markdown format instead of traditional structured formats like XML or JSON. The reason is that Markdown is easy for both humans and language models to read.

The llms.txt Markdown file should include these specific sections, in this order:

  1. H1 Heading: This is the title and the only mandatory section.
  2. Blockquote: A brief description of the project, summarizing key points.
  3. Optional Detailed Sections: These can include paragraphs, lists, or other Markdown content (anything except headings) providing more details about the project.
  4. File Lists (optional): Defined by H2 headers, these sections contain lists of Markdown links. Each link includes:
    • [Link Title](https://link_url) format, optionally followed by a colon and additional details about the linked page.
  5. Optional Section: An H2 section titled "Optional". The links listed under it are secondary information that can be skipped when a shorter context is needed.

Here's a basic example:

# Web crawling and data extraction | Webcrawlerapi

> This is a collection of pages from webcrawlerapi.com, formatted for language models.

## Available Pages

- [Web crawling and data extraction | Webcrawlerapi](https://webcrawlerapi.com)
- [Scrapers Marketplace | WebcrawlerAPI](https://webcrawlerapi.com/scrapers)
- [WebCrawling | WebcrawlerAPI docs](https://webcrawlerapi.com/docs/getting-started)
- [Blog | Webcrawlerapi](https://webcrawlerapi.com/blog)
- [Website to Markdown Free Tool | WebcrawlerAPI](https://webcrawlerapi.com/tools/website-to-md)
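
Since the layout is fixed, a file like the one above can be read with a few lines of code. The sketch below is a simplified Python parser written for this article, not the reference implementation from llmstxt.org. The names LlmsTxt and parse_llms_txt are hypothetical, and the note after the second link in the sample is invented to show the optional "colon plus details" syntax.

```python
import re
from dataclasses import dataclass, field

# One list item: "- [Title](url)" optionally followed by ": notes".
LINK_RE = re.compile(
    r"^[-*]\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?:\s*:\s*(?P<notes>.*))?$"
)


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    details: list[str] = field(default_factory=list)
    sections: dict[str, list[dict[str, str]]] = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    """Parse the H1 title, blockquote summary, free text, and H2 link lists."""
    doc = LlmsTxt()
    current = None  # name of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif current is None and line.startswith(">"):
            doc.summary = (doc.summary + " " + line[1:].strip()).strip()
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                doc.sections[current].append({
                    "title": match["title"],
                    "url": match["url"],
                    "notes": match["notes"] or "",
                })
        else:
            doc.details.append(line)  # free-form text before the first H2
    if not doc.title:
        raise ValueError("an llms.txt file needs an H1 title")
    return doc


SAMPLE = """\
# Web crawling and data extraction | Webcrawlerapi

> This is a collection of pages from webcrawlerapi.com, formatted for language models.

## Available Pages

- [Web crawling and data extraction | Webcrawlerapi](https://webcrawlerapi.com)
- [Blog | Webcrawlerapi](https://webcrawlerapi.com/blog): Articles and tutorials
"""

parsed = parse_llms_txt(SAMPLE)
print(parsed.title)
for link in parsed.sections["Available Pages"]:
    print(link["url"], link["notes"])
```

A consumer that wants a shorter context could simply drop the section named "Optional" from the parsed sections before passing the links on to a model.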

Origin of the llms.txt Format

The llms.txt format was proposed in September 2024 by Jeremy Howard of Answer.AI, and the specification is published at llmstxt.org. It takes inspiration from established root-level files such as robots.txt and sitemap.xml, but it addresses a different problem: context windows are too small to hold most websites, and converting complex HTML pages into clean text for a model is difficult and imprecise. The format is a community proposal rather than a formal standard, and sites adopt it voluntarily.

Easily Generate Your Own llms.txt File

You don't have to create an llms.txt file by hand. Online tools can generate one for you; for example, webcrawlerapi.com offers a free llms.txt generator. The tool prepares a properly formatted text file that you can download and publish at the root of your site.
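
If you would rather script it yourself, the format is simple enough to produce from a list of pages. The following Python sketch shows one way to do that. It does not describe how the webcrawlerapi.com tool works internally; the function name render_llms_txt and the example pages are assumptions made for illustration.

```python
def render_llms_txt(
    title: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Build llms.txt text from a title, a summary, and named link lists.

    Each link is a (title, url, notes) tuple; notes may be an empty string.
    """
    lines = [f"# {title}", ""]
    if summary:
        lines += [f"> {summary}", ""]
    for name, links in sections.items():
        lines += [f"## {name}", ""]
        for link_title, url, notes in links:
            entry = f"- [{link_title}]({url})"
            if notes:
                entry += f": {notes}"
            lines.append(entry)
        lines.append("")
    # End the file with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site data, for illustration only.
text = render_llms_txt(
    "Example",
    "A demo site.",
    {
        "Docs": [
            ("Home", "https://example.com", ""),
            ("Guide", "https://example.com/guide", "Start here"),
        ]
    },
)
print(text)
```

Writing the returned string to a file named llms.txt and serving it from the site root is all that remains.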

Who Should Use llms.txt?

Anyone who publishes content that people might explore through AI tools should consider adding an llms.txt file. It helps language models find your most important pages and describe your project accurately. Developers, companies, researchers, and even hobbyists can benefit, whether the site hosts documentation, a product, or a blog.

Conclusion

An llms.txt file is a simple way to give language models a clear, curated view of your website or application. A title, a short summary, and a few lists of links are enough to point AI tools at the content that matters. With tools available online, setting up your own llms.txt file is quick and easy.