What is Robots.txt?
Robots.txt is a plain text file that sits at the root of your website (e.g., yoursite.com/robots.txt) and tells search engine crawlers which pages or sections of your site they may or may not crawl.
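To make that concrete, here is a minimal sketch of a robots.txt file; the blocked path and the sitemap URL are placeholders, not recommendations:

```
# Minimal illustrative robots.txt (placeholder values)
User-agent: *        # rules below apply to all crawlers
Disallow: /admin/    # keep crawlers out of this directory

Sitemap: https://yoursite.com/sitemap.xml
```

Each `User-agent` line opens a group of rules, and the `Disallow` lines in that group tell matching crawlers which paths to skip.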
Key things robots.txt can do (pulled together in the sketch after this list):
- Block sensitive directories: Keep admin panels, private files, and staging areas out of search results
- Prevent duplicate content: Block parameter URLs, print pages, or filtered views
- Manage crawl budget: Guide crawlers to your most important content
- Block bad bots: Deny access to scrapers and resource-heavy crawlers
- Reference your sitemap: Help search engines discover all your pages
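A single file can cover all five points. In this sketch, every path and the bot name are hypothetical and should be replaced with values from your own site:

```
# Illustrative robots.txt — all paths and bot names are examples
User-agent: *
Disallow: /admin/          # sensitive directory
Disallow: /staging/        # staging area
Disallow: /*?sort=         # parameter URLs that duplicate content
Disallow: /print/          # print-friendly duplicates

# Deny a hypothetical scraper outright
User-agent: BadScraperBot
Disallow: /

# Help crawlers discover every page
Sitemap: https://yoursite.com/sitemap.xml
```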
Important: Robots.txt is a guideline, not a security measure. Well-behaved bots follow it, but malicious bots may ignore it, and because the file is publicly readable, it can even advertise the very paths you want hidden. Never rely on robots.txt to protect sensitive data.
Crawler Control
Direct search engine bots to your important pages while blocking irrelevant content
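One common pattern is blocking a low-value area while carving out an exception with `Allow`; both paths here are hypothetical:

```
# Assumed paths — adjust to your own site structure
User-agent: *
Disallow: /media/        # skip the bulky media directory...
Allow: /media/press/     # ...except the press kit, which should rank
```

`Allow` is honored by the major search engines, though very old or minimal crawlers may ignore it.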
Crawl Budget Optimization
Help search engines spend their crawl budget on pages that matter
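Faceted navigation and calendar pages are classic crawl-budget sinks. A sketch, assuming hypothetical URL patterns:

```
# Keep crawlers out of near-infinite URL spaces (example patterns)
User-agent: *
Disallow: /calendar/     # endless date-based pages
Disallow: /*?filter=     # faceted-navigation permutations
```

The `*` wildcard is supported by Google and Bing (and standardized in RFC 9309), but not every crawler understands it.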
Bot Blocking
Block unwanted bots like scrapers and aggressive SEO tool crawlers
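For example, you can turn away specific crawlers by user agent. The names below are common SEO tool crawlers, but treat them as illustrative and confirm the exact user-agent tokens in your server logs:

```
# Example user agents — verify the exact tokens before relying on them
User-agent: AhrefsBot
Disallow: /

User-agent: SemrushBot
Disallow: /
```

This only works for bots that choose to obey robots.txt; determined scrapers require server-side measures such as IP rules or rate limiting.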
Sitemap Reference
Point crawlers directly to your sitemap for complete page discovery
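The `Sitemap` directive takes a full URL, can appear anywhere in the file, and may be repeated; the URLs here are placeholders:

```
# Placeholder URLs — point these at your real sitemap files
Sitemap: https://yoursite.com/sitemap.xml
Sitemap: https://yoursite.com/sitemap-news.xml
```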