Robots.txt
Robots.txt is a plain text file used by websites to guide web crawlers regarding which parts of a site may be crawled or indexed. It is part of the Robots Exclusion Protocol and is not a security mechanism; it relies on voluntary compliance by crawlers and should not be relied on to protect sensitive data.
The file is placed at the root of a domain, and most crawlers fetch it before indexing.
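Below is a minimal sketch of how a crawler might consult the file before fetching a page, using Python's standard-library urllib.robotparser. The domain, paths, and user-agent string are placeholders, not taken from any real site.

    # Fetch robots.txt from the domain root and ask whether a URL may be crawled.
    from urllib.robotparser import RobotFileParser

    robots_url = "https://example.com/robots.txt"   # always served from the root
    parser = RobotFileParser()
    parser.set_url(robots_url)
    parser.read()                                   # download and parse the file

    user_agent = "ExampleBot"                       # hypothetical crawler name
    url = "https://example.com/private/report.html" # hypothetical target URL

    if parser.can_fetch(user_agent, url):
        print("Allowed to crawl:", url)
    else:
        print("Disallowed by robots.txt:", url)

    # If the file requests a Crawl-delay for this agent, expose it (Python 3.6+).
    delay = parser.crawl_delay(user_agent)
    if delay is not None:
        print("Requested delay between requests:", delay, "seconds")

A polite crawler would call can_fetch for every URL it plans to request and sleep for the reported delay between requests; nothing in the protocol enforces either behaviour.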
The core directives are User-agent, which names the crawler a group of rules applies to, and Disallow, which lists path prefixes that should not be crawled; Allow is a widely supported extension for carving exceptions out of a Disallow rule. Other commonly used directives include Crawl-delay, which requests a minimum delay between successive requests, and Sitemap, which points crawlers to an XML sitemap, though neither is part of the original standard and support varies by crawler. An illustrative file combining these directives is shown below.
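The following example robots.txt uses hypothetical paths and values purely to illustrate the directives above.

    # Rules for all crawlers
    User-agent: *
    Disallow: /private/
    Allow: /private/public-report.html
    Crawl-delay: 10

    # Stricter rules for one named crawler
    User-agent: ExampleBot
    Disallow: /

    Sitemap: https://example.com/sitemap.xml

Rule groups are keyed by User-agent, so a crawler applies the most specific group that matches its name and falls back to the wildcard group otherwise.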
Limitations and best practices: robots.txt only expresses preferences and does not prevent access; sensitive data should be protected with authentication or other access controls rather than hidden behind a Disallow rule. Because the file is publicly readable, listing sensitive paths in it can even advertise their existence to anyone who looks.