Robots.txt Generator & Editor

Create a professional robots.txt file to control how search engines crawl your website. Block unwanted bots, protect sensitive directories, and optimize crawl budget.

Understanding Robots.txt (Complete Guide)

Robots.txt is a plain-text file placed in your website's root directory that tells search engine crawlers which pages or sections of your site they may or may not crawl. It is usually the first file a crawler requests when visiting your site, before any other content.

Think of robots.txt as a "gatekeeper" for search engines. When Googlebot arrives at your site, it immediately looks for `https://example.com/robots.txt`. This file instructs the bot where it can and cannot go. Well-configured robots.txt files save crawl budget, protect sensitive content, and improve SEO performance.

Why Robots.txt Matters for SEO:
  • Crawl Budget Optimization: Search engines allocate limited crawl time to your site. Blocking low-value pages (search results, archives, admin) leaves more of that time for your important pages.
  • Content Protection: Keep crawlers out of staging environments, admin panels, login pages, and duplicate content. Note: robots.txt blocks crawling, not indexing; a blocked URL can still be indexed if other sites link to it.
  • Faster Indexing: Directing crawlers to your most important content helps new pages get discovered and indexed sooner.
  • Block Spam Bots: Restrict resource-draining bots that waste server resources without providing SEO value.
  • Sitemap Discovery: Specify your sitemap location so crawlers find all your important pages even if they are not linked internally.
  • Server Load Reduction: Blocking unnecessary crawling reduces server load, improving site performance for real users.
  • Prevent Duplicate Content: Block URL parameters, printer-friendly versions, and other duplicate content sources that dilute SEO value.

Important Limitations:

Robots.txt is a request, not an enforcement mechanism. Well-behaved crawlers respect it, but malicious bots ignore it. For true content protection, use password protection. To keep a page out of search results, use a noindex meta tag on a page that crawlers are still allowed to fetch. Also, if other sites link to blocked pages, Google may still index their URLs without crawling the content.

Robots.txt Syntax & Directives Guide

User-agent

Specifies which search engine crawler the rules apply to. Use "*" for all crawlers, or specific names like "Googlebot", "Bingbot", "YandexBot".

User-agent: *

Disallow

Blocks crawlers from accessing specific URLs or directories. Use "/directory/" to block entire folders, "/page.html" for specific pages.

Disallow: /admin/

Allow

Overrides Disallow rules for specific paths. Useful for allowing subdirectories within blocked directories.

Allow: /admin/public/
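
How an Allow rule carves an exception out of a blocked directory can be sketched in Python. This is a minimal illustration, not any crawler's actual implementation: it assumes the longest-match behavior that Google documents (the most specific matching path wins, and Allow wins a tie) and plain prefix matching with no wildcards. The function name `is_allowed` and the sample paths are hypothetical.

```python
def is_allowed(rules, path):
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (directive, prefix) pairs, e.g. ("disallow", "/admin/").
    Assumption: the longest matching prefix wins; on a tie, "allow" wins.
    """
    best_len = -1
    allowed = True  # no matching rule means the path is crawlable
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # empty or non-matching rules are ignored
        is_allow = directive.lower() == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


rules = [("disallow", "/admin/"), ("allow", "/admin/public/")]
print(is_allowed(rules, "/admin/settings"))          # False
print(is_allowed(rules, "/admin/public/help.html"))  # True
print(is_allowed(rules, "/blog/post"))               # True
```

Some parsers apply the first matching rule instead of the longest one, so listing the more specific Allow line before the broader Disallow line is the safer ordering.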

Sitemap

Specifies the location of your XML sitemap. Helps crawlers discover all your important pages.

Sitemap: https://example.com/sitemap.xml

Crawl-delay

Sets the delay (in seconds) between successive requests from the same crawler, which reduces server load. Bing and Yandex honor it; Google ignores it.

Crawl-delay: 5

Wildcard (*)

Matches any sequence of characters. Use "/*?*" to block all URLs with parameters, "/2023/*" to block specific year archives.

Disallow: /*?*

Example: Block Search Results Pages

User-agent: *
Disallow: /search/
Disallow: /*?s=
Disallow: /*?q=

This prevents search engines from crawling internal search result pages, which are typically low-value and can generate a practically unlimited number of URLs.
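
To see which URLs the three Disallow lines above actually catch, the wildcard matching can be approximated in Python. This is a rough sketch under stated assumptions: `*` matches any run of characters, a trailing `$` anchors the end of the URL, and everything else is a literal prefix. The helper names `pattern_to_regex` and `is_blocked` and the test paths are hypothetical, and Allow rules are ignored for brevity.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: `*` matches any run of characters and a trailing `$`
    anchors the end of the URL; everything else is a literal prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then rejoin the pieces with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(disallow_patterns, url_path):
    """True if any Disallow pattern matches from the start of `url_path`."""
    return any(pattern_to_regex(p).match(url_path) for p in disallow_patterns)


patterns = ["/search/", "/*?s=", "/*?q="]
print(is_blocked(patterns, "/search/shoes"))       # True
print(is_blocked(patterns, "/?s=shoes"))           # True
print(is_blocked(patterns, "/shop?q=red"))         # True
print(is_blocked(patterns, "/shop?page=2&q=red"))  # False
print(is_blocked(patterns, "/about"))              # False
```

The fourth result shows a limit of these patterns: `/*?q=` only matches when `q` is the first parameter. Under the same assumptions, a pattern such as `/*&q=` would be needed to catch it in later positions.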

12 Costly Robots.txt Mistakes

Mistake #1: Accidentally Blocking CSS/JS Files

Blocking CSS, JavaScript, or image files can prevent Google from rendering your pages correctly, which may hurt mobile usability assessments and rankings.

Mistake #2: Using Robots.txt for Sensitive Data

Robots.txt is public: anyone can view it. Never use it to hide sensitive information (passwords, personal data, payment pages). Use authentication instead.

Mistake #3: No Sitemap Reference

Without a sitemap declaration, crawlers may miss important pages that are not linked internally. Always add a Sitemap directive.

Mistake #4: Blocking All Crawlers on Production

`Disallow: /` under `User-agent: *` on a live site stops all compliant crawlers and will eventually drop your pages from search results. Only use it during development or maintenance.

Mistake #5: Incorrect Syntax or Formatting

Missing colons, stray characters, or incorrect capitalization in paths (URL paths are case-sensitive) can break robots.txt rules. Use our generator to avoid syntax errors.

Mistake #6: Robots.txt in Wrong Location

The file must be at `https://example.com/robots.txt` (the root directory). Placing it elsewhere makes it ineffective.
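
As a quick illustration of the location rule, the following Python sketch derives the robots.txt URL that governs any given page. It assumes the file is looked up per scheme and host (including any port), so each subdomain needs its own file. The function name `robots_url` and the sample URLs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt URL that governs `page_url`."""
    parts = urlsplit(page_url)
    # Keep scheme and host (with port); replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://example.com/blog/post?id=7"))  # https://example.com/robots.txt
print(robots_url("https://shop.example.com/cart"))       # https://shop.example.com/robots.txt
```

A file uploaded to `/blog/robots.txt` would never be requested, because crawlers only ask for the root-level path.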

Mistake #7: Blocking Valuable Resources

Accidentally blocking product pages, blog posts, or category pages destroys SEO. Review your disallow list carefully.

Mistake #8: Not Handling URL Parameters

Parameter URLs (?sort=asc, ?page=2) can create a near-infinite crawl space. Block them unless they lead to unique content.

Mistake #9: Inconsistent User-Agent Rules

Different crawlers may need different rules. Googlebot renders JavaScript; some other crawlers do not. Configure rules per user-agent as needed.

Mistake #10: Not Testing After Changes

Always test robots.txt changes, for example with the robots.txt report in Google Search Console (the successor to the older robots.txt Tester). One typo can block your entire site.
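
A lightweight local check can complement the Search Console test. The sketch below uses Python's standard-library `urllib.robotparser` to confirm that key URLs stay crawlable and private ones stay blocked. The URL lists, the `check` helper, and the sample file are assumptions for illustration. Note that this parser applies the first matching rule and does not interpret `*` wildcards in paths, so treat it as a rough guard rather than a replica of Googlebot.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /search/
"""

# Hypothetical URLs that matter for this site.
MUST_BE_CRAWLABLE = ["https://example.com/", "https://example.com/products/widget"]
MUST_BE_BLOCKED = ["https://example.com/admin/login"]


def check(robots_txt, agent="Googlebot"):
    """Return a list of human-readable problems; empty means all checks pass."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    problems = []
    for url in MUST_BE_CRAWLABLE:
        if not parser.can_fetch(agent, url):
            problems.append(f"blocked but should be crawlable: {url}")
    for url in MUST_BE_BLOCKED:
        if parser.can_fetch(agent, url):
            problems.append(f"crawlable but should be blocked: {url}")
    return problems


print(check(ROBOTS_TXT))                      # []
print(check("User-agent: *\nDisallow: /\n"))  # reports both must-crawl URLs
```

Running a check like this before each deployment catches the "one typo blocks everything" case early.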

Mistake #11: Setting Crawl-Delay Too Low

`Crawl-delay: 0.1` may still overwhelm small servers. Start with 5-10 seconds on shared hosting and adjust based on server logs.

Mistake #12: Forgetting Multiple User-Agents

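Rules under "*" apply to every crawler that has no group of its own. Once you add a group for a specific bot, that bot follows only its own group and ignores the "*" rules, so repeat any shared rules in each bot-specific group.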
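
The group-selection behavior can be demonstrated with Python's standard-library `urllib.robotparser`. This is a small sketch with a made-up robots.txt and URL: the `*` group blocks `/private/`, while the Googlebot group only blocks `/drafts/`.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /drafts/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/private/report.html"
print(parser.can_fetch("Bingbot", url))    # False: falls back to the * group
print(parser.can_fetch("Googlebot", url))  # True: its own group has no /private/ rule
```

To keep `/private/` blocked for Googlebot as well, the `Disallow: /private/` line would have to be repeated inside the Googlebot group.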

Platform-Specific Robots.txt Best Practices

WordPress

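Block /wp-admin/ but allow /wp-admin/admin-ajax.php, which front-end features rely on. Block /?s= (search) and, if you do not need them crawled, /feed/ and /trackback/. Be careful with /wp-includes/ and plugin directories: blocking them wholesale can hide the CSS and JavaScript that Google needs to render your pages (see Mistake #1).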

Shopify

Shopify automatically generates robots.txt. You can customize it only by editing the robots.txt.liquid theme template. Block /collections/*/products/ and /pages/*/comments.

Magento

Block /checkout/, /catalogsearch/, /customer/, /wishlist/, and parameter URLs. Block version-specific directories.

Custom/PHP Sites

Block /admin/, /includes/, /logs/, /temp/, /backup/, and /config/. Block script files (.php, .inc) unless necessary.

Frequently Asked Questions About Robots.txt

Does robots.txt prevent pages from being indexed?
No, robots.txt only prevents crawling, not indexing. If other sites link to a blocked page, Google may still index it without seeing the content, showing only the URL. To keep a page out of the index, use the noindex meta tag or X-Robots-Tag HTTP header, and leave the page crawlable so the crawler can see that directive. Use robots.txt to block crawling of low-value pages (search results, filters), noindex for pages that must stay out of search results, and authentication for sensitive content.
How do I block Google from crawling my entire site?
Add `Disallow: /` under `User-agent: Googlebot`. This tells Google not to crawl any pages. Use only for staging sites or during maintenance. For live sites, this removes your site from search results entirely. Always test in Google Search Console first.
Is a sitemap directive required in robots.txt?
Not required but highly recommended. Adding `Sitemap: https://example.com/sitemap.xml` helps crawlers discover all your important pages, especially those with few internal links. It's supported by Google, Bing, Yandex, and most major search engines.
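
For illustration, here is how a crawler-side script might read Sitemap lines, using Python's standard-library `urllib.robotparser` (the `site_maps()` method requires Python 3.8 or later). The second sitemap URL is a made-up example.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/news-sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Sitemap lines are independent of any User-agent group.
print(parser.site_maps())
# ['https://example.com/sitemap.xml', 'https://example.com/news-sitemap.xml']
```

Because Sitemap lines do not belong to a user-agent group, they can appear anywhere in the file, and more than one is allowed.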
What crawl-delay should I use?
For shared hosting, use 5-10 seconds. For dedicated servers, 1-2 seconds. For high-traffic sites, monitor server logs and adjust. Google ignores crawl-delay and sets its own crawl rate based on how your server responds. Crawl-delay works for Bing, Yandex, and some other crawlers.
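
As a sketch of how a crawler that honors the directive might read it, the following uses Python's standard-library `urllib.robotparser`. The file contents and delay values are illustrative assumptions, not recommendations.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Bingbot
Crawl-delay: 5

User-agent: *
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.crawl_delay("Bingbot"))    # 5
print(parser.crawl_delay("YandexBot"))  # 10 (falls back to the * group)
```

This particular parser only accepts whole-number values, so a fractional setting such as `Crawl-delay: 0.1` would be ignored by it. Other crawlers may handle fractions differently.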
Can I block specific URL parameters?
Yes! Use `Disallow: /*?*` to block all parameter URLs, or `Disallow: /*?sort=` to block specific parameters. This prevents infinite URL generation from filters, sorts, and tracking parameters that waste crawl budget.
How do I test if my robots.txt is working?
Use the robots.txt report in Google Search Console, which shows the robots.txt files Google found for your site, when they were last fetched, and any parsing warnings or errors. Bing Webmaster Tools also provides a robots.txt tester. Test after every change before deploying to production.
What's the difference between Googlebot and Googlebot-Image?
Googlebot crawls web pages. Googlebot-Image specifically crawls images. If you want images indexed but not pages, allow Googlebot-Image while blocking Googlebot. Useful for image-heavy sites with thin content.
Do I need a separate robots.txt for mobile?
No, the same robots.txt serves both desktop and mobile crawlers. Google's desktop and smartphone crawlers both follow the `Googlebot` user-agent token, so rules written for Googlebot (or for `*`) apply to both, and the two cannot be targeted separately in robots.txt.
Does this tool work on mobile devices?
Yes! The robots.txt generator is fully responsive and works on phones, tablets, and desktops. All configuration options are accessible, and the robots.txt file updates in real-time. Perfect for on-the-go SEO management.
Is this robots.txt generator really free?
Yes, completely free! No sign-up, no credit card, no hidden fees. No limits on how many robots.txt files you generate. We keep it free through non-intrusive advertising that respects your privacy. Your data never leaves your browser: we don't store or log anything. Use it for development, staging, or production sites.

Generate Your Robots.txt File Now

Free robots.txt generator for SEO professionals and webmasters. Control crawlers, save crawl budget, improve SEO.