Robots.txt Generator — Free Online Tool

Generate a robots.txt file for your website in seconds. Add user-agent rules, allow or disallow paths, set your sitemap URL, and configure crawl delay. Copy the output and upload to your site root. Free, instant, no signup needed.

How Robots.txt Generator Works

Fill in the form above to control search engine crawling, block unwanted bots, and set your sitemap URL. The tool builds the file instantly in your browser, and the copy-ready output can be downloaded, copied, or shared.

How to Create a Robots.txt File

Select a preset or build custom rules by adding user-agent directives. For each bot (user-agent), specify which paths to allow or disallow. Add your sitemap URL so search engines can find your XML sitemap. Click "Generate" to get the complete robots.txt file, then copy it and upload it to your website's root directory so it is served at example.com/robots.txt. Crawlers only look for the file at that root-level URL, so a copy in a subdirectory has no effect.
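The steps above can be sketched in a few lines of Python. This is a minimal illustration of how such a file is assembled, not the tool's actual implementation; the function name build_robots_txt, the dictionary format for rule groups, and the example.com URL are assumptions made for the example.

```python
def build_robots_txt(groups, sitemap_url=None):
    """Render robots.txt text from a list of rule groups.

    Each group is a dict in a hypothetical format chosen for this sketch:
    {"user_agent": "*", "allow": [...], "disallow": [...], "crawl_delay": 10}
    """
    lines = []
    for group in groups:
        lines.append(f"User-agent: {group['user_agent']}")
        for path in group.get("allow", []):
            lines.append(f"Allow: {path}")
        for path in group.get("disallow", []):
            lines.append(f"Disallow: {path}")
        if group.get("crawl_delay") is not None:
            lines.append(f"Crawl-delay: {group['crawl_delay']}")
        lines.append("")  # a blank line separates groups
    if sitemap_url:
        lines.append(f"Sitemap: {sitemap_url}")
    # Normalize to exactly one trailing newline.
    return "\n".join(lines).rstrip("\n") + "\n"


robots_txt = build_robots_txt(
    [{"user_agent": "*", "disallow": ["/admin/", "/staging/"]}],
    sitemap_url="https://example.com/sitemap.xml",
)
print(robots_txt)
```

The returned string is what you would save as robots.txt and upload to the site root.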

Why Robots.txt Matters for SEO

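The robots.txt file tells search engine crawlers which URLs they may and may not crawl on your site. A properly configured robots.txt keeps crawlers away from duplicate content, admin pages, staging environments, and private areas, and it points them to your sitemap for efficient discovery of your pages. Blocking a URL from crawling does not guarantee it stays out of search results: a blocked URL can still be indexed if other pages link to it, so use a noindex meta tag or HTTP header for pages that must never appear. Without a robots.txt file, bots may crawl any page they can reach.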

How to Block AI Crawlers in 2026

AI companies (OpenAI, Google, Anthropic, Meta, Apple) use web crawlers to scrape content for training data. This generator includes a "Block AI Crawlers" preset that blocks GPTBot, Google-Extended, ClaudeBot, FacebookBot, Applebot-Extended, CCBot, and other AI-specific user agents. This does not affect regular search indexing — Google's main crawler (Googlebot) remains allowed. Note that respecting robots.txt is voluntary — not all AI crawlers honor it.
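As a rough illustration of what such a preset could produce, the sketch below builds one rule group for the user agents named above and checks it with Python's standard-library urllib.robotparser. The helper name block_ai_crawlers and the example.com URL are assumptions for the example, and the real preset's list may be longer.

```python
from urllib.robotparser import RobotFileParser

# User agents named in the text above; the preset's full list may differ.
AI_CRAWLERS = [
    "GPTBot", "Google-Extended", "ClaudeBot",
    "FacebookBot", "Applebot-Extended", "CCBot",
]


def block_ai_crawlers(agents):
    """Return one robots.txt group that disallows every listed agent."""
    lines = [f"User-agent: {agent}" for agent in agents]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


preset = block_ai_crawlers(AI_CRAWLERS)

# Check the result with the standard-library parser.
parser = RobotFileParser()
parser.parse(preset.splitlines())

gptbot_allowed = parser.can_fetch("GPTBot", "https://example.com/blog/")
googlebot_allowed = parser.can_fetch("Googlebot", "https://example.com/blog/")
print(preset)
print("GPTBot allowed:", gptbot_allowed)
print("Googlebot allowed:", googlebot_allowed)
```

Because no group names Googlebot or the wildcard agent, regular search crawling is left untouched by this group.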

Common Robots.txt Mistakes to Avoid

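Do not block your CSS and JavaScript files, because Google needs them to render your pages properly. Do not block your images folder if you want image search traffic. Always include a trailing slash for directories (e.g., /admin/ not /admin): rules match by prefix, so Disallow: /admin also blocks unrelated paths such as /administrator. Use Disallow: / carefully, since it blocks the entire site for that user-agent. Test rule changes with a parser or validator before deploying, and check the live file in the robots.txt report in Google Search Console, which replaced the older robots.txt Tester.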
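The trailing-slash point is easy to verify locally. The sketch below uses Python's standard-library urllib.robotparser; the helper name is_allowed and the sample paths are made up for the illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, path, user_agent="Googlebot"):
    """Parse robots.txt text and report whether user_agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


without_slash = "User-agent: *\nDisallow: /admin\n"
with_slash = "User-agent: *\nDisallow: /admin/\n"

# "/admin" is a prefix, so it also catches paths that merely start the same way.
print(is_allowed(without_slash, "/administrator-guide.html"))
# "/admin/" only matches URLs inside the directory.
print(is_allowed(with_slash, "/administrator-guide.html"))
print(is_allowed(with_slash, "/admin/login"))
```

Note that Python's parser applies the first matching rule, while Google uses the most specific match; for single-rule files like these the result is the same.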

Robots.txt for AI Crawlers (GPTBot, ClaudeBot, PerplexityBot) in 2026

In 2026, AI training crawlers have multiplied, and each one is controlled by its own user-agent token. The main ones you should know are GPTBot (OpenAI training), OAI-SearchBot (ChatGPT search citations), ClaudeBot and anthropic-ai (Anthropic), PerplexityBot and Perplexity-User (Perplexity), Google-Extended (Gemini training, separate from Googlebot), and Applebot-Extended (Apple Intelligence). Block training but allow search citation by writing two separate blocks: User-agent: GPTBot followed by Disallow: /, then User-agent: OAI-SearchBot followed by Allow: /. If you want zero AI usage, block all of them. Robots.txt is only one layer: our schema markup generator handles the structured-data side, and the meta tag generator covers noai / noimageai meta hints that some crawlers respect in addition to robots.txt. Pair it with a strong OG image preview check so anything that does get cited still looks correct in the AI answer card.
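The two-block pattern for blocking training while allowing search citation can be written out and checked as follows. This is a sketch using Python's standard-library urllib.robotparser; the example.com URL is a placeholder, and real crawlers may interpret rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Two separate groups: block the training crawler, allow the search crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

page = "https://example.com/articles/robots-guide"
access = {
    agent: parser.can_fetch(agent, page)
    for agent in ("GPTBot", "OAI-SearchBot", "Googlebot")
}
for agent, allowed in access.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

To block every AI crawler instead, the same check can be repeated after adding a Disallow: / group for each remaining user agent.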

Common robots.txt Mistakes That Block Googlebot

According to Google Search Central, robots.txt is for crawl control only; it is not a reliable way to keep a URL out of the index. It is also unforgiving: a single typo can silently block crawling of an entire site. The four mistakes we see most often in 2026: (1) shipping a staging file like User-agent: * + Disallow: / to production, which causes an instant traffic collapse; (2) blocking /wp-content/ or /assets/, which hides the CSS and JS Googlebot needs to render the page; (3) using robots.txt to "noindex" a page (Google stopped supporting that directive in robots.txt in September 2019; use a noindex meta tag or X-Robots-Tag HTTP header, often paired with our .htaccess generator); (4) case-sensitivity slips like Disallow: /Admin/ when the real path is /admin/, since paths in robots.txt are case-sensitive. Always check the file in Search Console's robots.txt report after upload, and keep your developer hygiene tight with a clean .gitignore so secrets never reach the public root in the first place. If your robots.txt also references a legal page, generate one quickly with our privacy policy generator.
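The mistakes listed above lend themselves to a simple pre-deploy check. The sketch below is a hypothetical linter written for illustration, not a feature of this tool or of Search Console; the function name lint_robots_txt, the warning wording, and the known_paths parameter are all assumptions.

```python
# Paths the text above names as commonly blocked by mistake.
RENDER_PATHS = ("/wp-content/", "/assets/")


def lint_robots_txt(text, known_paths=()):
    """Return warnings for common robots.txt mistakes.

    known_paths is an optional list of real site paths, used only for
    the case-sensitivity check.
    """
    warnings = []
    agents = []           # user agents of the current group
    in_agent_run = False  # True while reading consecutive User-agent lines
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not in_agent_run:
                agents = []  # a new group starts here
            agents.append(value)
            in_agent_run = True
            continue
        in_agent_run = False
        if field == "noindex":
            warnings.append(f"unsupported directive: Noindex: {value}")
        elif field == "disallow":
            if value == "/" and "*" in agents:
                warnings.append("Disallow: / blocks the whole site for all crawlers")
            if value in RENDER_PATHS:
                warnings.append(f"{value} may contain CSS/JS needed for rendering")
            for path in known_paths:
                if value != path and value.lower() == path.lower():
                    warnings.append(f"case mismatch: {value} vs real path {path}")
    return warnings


staging_file = """\
User-agent: *
Disallow: /
Noindex: /old-page/
Disallow: /Admin/
"""
for warning in lint_robots_txt(staging_file, known_paths=["/admin/"]):
    print(warning)
```

A check like this could run in a deployment script so that a staging file never reaches production unnoticed.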