Robots.txt Checker

Enter a website URL to view its robots.txt file and check whether the URL is blocked from crawling.

What is robots.txt?

The robots.txt file is a simple text file placed at the root of a website that tells web crawlers—like those used by Google and Bing—which parts of the site they can or cannot visit. It follows the Robots Exclusion Protocol, now standardized as RFC 9309, which gives site owners control over how compliant bots interact with their content.

How It Works

When a compliant bot visits your website, the first thing it requests is the /robots.txt file. Based on the rules defined inside it, the bot decides which URLs to crawl and which to avoid. These rules are written using two primary directives:

  • User-agent: Specifies the target crawler (e.g., * for all bots, Googlebot for Google).
  • Disallow / Allow: Direct bots to avoid or access certain paths.

For example, the following rules apply to all bots, blocking the /private/ folder while explicitly permitting /public/:

User-agent: *
Disallow: /private/
Allow: /public/
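
To see how a crawler applies these rules, here is a minimal sketch using Python's standard-library urllib.robotparser; the example.com URLs are placeholders:

from urllib.robotparser import RobotFileParser

# The same rules shown above, parsed from a string.
rules = """\
User-agent: *
Disallow: /private/
Allow: /public/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# A compliant crawler asks before fetching each URL.
print(parser.can_fetch("*", "https://example.com/private/data.html"))  # False
print(parser.can_fetch("*", "https://example.com/public/post.html"))   # True
print(parser.can_fetch("*", "https://example.com/about.html"))         # True: unmatched paths default to allowed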

Why It Matters for SEO

The robots.txt file plays a crucial role in how search engines interact with your site. A well-configured file helps crawlers focus on your most important pages and keeps them away from duplicate, irrelevant, or private sections, which also conserves your crawl budget. Note that robots.txt controls crawling, not indexing: a blocked URL can still appear in search results if other pages link to it, so use a noindex directive for pages that must stay out of the index entirely.

How to Create and Use a robots.txt File

You can create a robots.txt file using any plain text editor. Ensure it’s encoded in UTF-8 and placed in the root of your domain—for example: https://www.yourdomain.com/robots.txt.

Inside the file, specify which bots you're targeting and which parts of your site they can access. Here's a basic example:

User-agent: *
Disallow: /admin/
Allow: /blog/
Sitemap: https://www.yourdomain.com/sitemap.xml

After creating the file, upload it to the root of your web server. The optional Sitemap line points crawlers at your XML sitemap and can appear anywhere in the file. You can then test and validate the file using the robots.txt report in Google Search Console or other SEO analyzers.
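
You can also spot-check a deployed file programmatically. Here is a minimal sketch using Python's standard-library urllib.robotparser, assuming https://www.yourdomain.com is a real, reachable site:

from urllib.robotparser import RobotFileParser

# Point the parser at the live file and download it.
parser = RobotFileParser()
parser.set_url("https://www.yourdomain.com/robots.txt")
parser.read()

# Check a few representative URLs against the live rules.
for url in (
    "https://www.yourdomain.com/admin/settings",
    "https://www.yourdomain.com/blog/first-post",
):
    verdict = "allowed" if parser.can_fetch("Googlebot", url) else "blocked"
    print(verdict, url)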

Best Practices

  • Be precise with paths—use slashes correctly to distinguish between files and folders (see the example after this list).
  • Don’t block critical assets like CSS or JavaScript files. Search engines need them to render your site correctly.
  • Review your file regularly to keep it aligned with your site structure and SEO goals.
  • Remember: robots.txt is publicly accessible. Avoid using it to hide sensitive data.
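
On the first point, a trailing slash changes what a rule matches, because rules are prefix matches against the URL path. A hypothetical illustration (in a real file you would keep only the variant you intend):

User-agent: *
# With a trailing slash, this matches only the folder and everything under it:
Disallow: /private/
# Without one, the rule matches any path that starts with "/private",
# including /private.html and /private-offers:
Disallow: /private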

Using robots.txt with WordPress

For WordPress sites, robots.txt can be especially useful to manage crawler behavior and performance. A common setup is to block the /wp-admin/ area while still allowing AJAX requests for functionality.

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Upload this file to your root directory and verify its behavior using a robots.txt testing tool. Keep in mind that any major theme, plugin, or content change might require updates to this file.
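
For a quick local sanity check alongside those tools, here is a sketch using Python's standard-library urllib.robotparser; the example.com URLs are placeholders. One quirk worth knowing: this parser honors the first matching rule in file order, while Google applies the most specific (longest) match. The Allow line is therefore placed first in the test string below; for Googlebot itself, either order gives the same result.

from urllib.robotparser import RobotFileParser

# Allow comes before Disallow because urllib.robotparser applies the
# first rule whose path matches; Google instead picks the most specific.
rules = """\
User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

print(parser.can_fetch("*", "https://example.com/wp-admin/admin-ajax.php"))  # True
print(parser.can_fetch("*", "https://example.com/wp-admin/options.php"))     # False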