Robots.txt Examples for Small Websites
Real robots.txt examples for WordPress, Shopify, custom sites, and more.
🤖 Quick tip: Use our Robots.txt Generator to create your file instantly.
What is robots.txt?
A robots.txt file tells search engine crawlers which pages or sections of your site to crawl or avoid. It's placed in your website's root directory (e.g., https://tyzo.in/robots.txt).
Example 1: Allow all crawlers (most common)
User-agent: *
Allow: /
Sitemap: https://yourwebsite.com/sitemap.xml
Use when: You want Google to index your entire public website.
Example 2: Block all crawlers (development/staging)
User-agent: *
Disallow: /
Use when: Your site is in development and not ready for search engines.
Example 3: WordPress website
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /wp-login.php
Disallow: /xmlrpc.php
Allow: /wp-content/uploads/
Sitemap: https://yourwebsite.com/sitemap.xml
Use when: You have a WordPress site and want to hide admin areas. Caution: disallowing /wp-content/plugins/ and /wp-content/themes/ can block CSS and JavaScript files that Google needs to render your pages; many sites disallow only /wp-admin/ and add Allow: /wp-admin/admin-ajax.php.
Example 4: Ecommerce website (Shopify)
User-agent: *
Disallow: /admin/
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /search/
Allow: /products/
Allow: /collections/
Sitemap: https://yourwebsite.com/sitemap.xml
Use when: You want to prevent indexing of cart, checkout, and account pages. (Shopify generates a similar robots.txt automatically; customizations are made through the robots.txt.liquid template.)
Example 5: Block specific directories
User-agent: *
Disallow: /private/
Disallow: /admin/
Disallow: /temp/
Allow: /public/
Sitemap: https://yourwebsite.com/sitemap.xml
Use when: You have specific folders you don't want indexed.
Example 6: Block specific file types
User-agent: *
Disallow: /*.pdf$
Disallow: /*.zip$
Disallow: /*.mp4$
Sitemap: https://yourwebsite.com/sitemap.xml
Use when: You don't want search engines indexing your PDFs, ZIPs, or video files. (The * and $ wildcards are supported by major crawlers like Googlebot and Bingbot, but they're an extension to the original robots.txt standard, so not every bot honors them.)
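The wildcard semantics above can be sketched with a small regex translation. This is an illustration of how Google-style matching works, not an official implementation; the example paths are made up:

```python
import re

def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    # * matches any run of characters; a trailing $ anchors the end of the URL.
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = "".join(".*" if ch == "*" else re.escape(ch) for ch in body)
    return re.compile("^" + regex + ("$" if anchored else ""))

rule = robots_pattern_to_regex("/*.pdf$")
print(bool(rule.match("/files/report.pdf")))    # True
print(bool(rule.match("/files/report.pdf?x")))  # False ($ stops the match)
print(bool(rule.match("/files/report.html")))   # False
```

Without the trailing $, the pattern would also match URLs that merely contain ".pdf" somewhere in the path or query string.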
Example 7: Block specific crawlers (Googlebot, Bingbot)
User-agent: Googlebot
Disallow: /private/

User-agent: Bingbot
Disallow: /temp/

User-agent: *
Allow: /
Use when: You want different rules for different search engines.
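A crawler uses the most specific group that names it and falls back to the * group otherwise. You can see this group-selection behavior with Python's standard-library urllib.robotparser (the rules and paths below are illustrative):

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /temp/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Googlebot matches its own group, so only /private/ applies to it.
print(rp.can_fetch("Googlebot", "https://example.com/private/page"))  # False
print(rp.can_fetch("Googlebot", "https://example.com/temp/page"))     # True
# Any other bot falls back to the * group.
print(rp.can_fetch("SomeBot", "https://example.com/temp/page"))       # False
```

Note the Googlebot group completely replaces the * group for Googlebot; rules from the two groups are not merged.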
Example 8: Allow one directory, block everything else
User-agent: *
Disallow: /
Allow: /public/
Sitemap: https://yourwebsite.com/sitemap.xml
Use when: Only one folder should be indexed (e.g., a blog on a private site).
Example 9: Delay crawling (for large sites)
User-agent: *
Crawl-delay: 10
Allow: /
Sitemap: https://yourwebsite.com/sitemap.xml
Use when: Your server can't handle aggressive crawling. (Note: Googlebot ignores crawl-delay.)
Example 10: Custom CMS or static site
User-agent: *
Disallow: /admin/
Disallow: /includes/
Disallow: /backup/
Allow: /
Sitemap: https://yourwebsite.com/sitemap.xml
Use when: You have a custom-built site with admin or backup folders.
How to test your robots.txt
- ✅ Open Google Search Console → Settings → robots.txt report to confirm Google can fetch and parse your file
- ✅ Check your live file at https://yourwebsite.com/robots.txt
- ✅ Use the URL Inspection tool to see whether a specific page is blocked by robots.txt
- ✅ Validate that important pages are NOT disallowed
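You can also sanity-check your rules locally with Python's standard-library urllib.robotparser before deploying. Note that it only understands simple prefix rules, not the * and $ wildcards; the file contents and page list below are placeholders for your own:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical list of pages that must stay crawlable.
important = ["/", "/products/widget", "/blog/hello-world"]

robots = """\
User-agent: *
Disallow: /admin/
Disallow: /temp/
"""

rp = RobotFileParser()
rp.parse(robots.splitlines())

blocked = [p for p in important if not rp.can_fetch("*", "https://example.com" + p)]
print(blocked)  # [] means none of your important pages are blocked
```

Running a check like this in CI can catch an accidental Disallow: / before it ever reaches production.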
Common mistakes to avoid
- ❌ Accidentally blocking your entire site — Disallow: / left on a live site
- ❌ Missing trailing slashes — Disallow: /admin blocks every URL that starts with /admin (including /administrator), while Disallow: /admin/ blocks only that directory
- ❌ Using robots.txt to hide sensitive data — the file is public. Use password protection instead.
- ❌ Blocking CSS or JS files — Google needs these to render your page properly
- ❌ Not adding your sitemap — Helps Google find all your pages
Frequently asked questions
Does robots.txt remove a page from Google's index?
No. It only prevents crawling. Pages that are already indexed may stay in results; use a noindex meta tag to remove them from the index.
Do I need a robots.txt file?
Not necessarily. If you don't have one, Google will crawl everything it can reach. You only need one if you want to block specific sections.
What's the difference between Disallow and noindex?
Disallow = don't crawl the page. noindex = don't show it in search results. Note that Google must be able to crawl a page to see its noindex tag, so don't disallow a page you want deindexed. For truly sensitive pages, use password protection.
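Unlike Disallow, a noindex directive lives in the page itself rather than in robots.txt, for example:

```html
<!-- In the <head> of the page you want kept out of search results -->
<meta name="robots" content="noindex">
```

The same directive can also be sent as an X-Robots-Tag HTTP response header, which is useful for non-HTML files like PDFs.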
How long do robots.txt changes take to take effect?
Google can take anywhere from a few hours to several days to recrawl your file and apply the new rules.
Ready to generate your robots.txt?
Try Robots.txt Generator →