Glossary
robots.txt
Text file at site root that gives crawl directives to user-agents.
In depth
Allow/disallow paths per user-agent. Cannot enforce noindex (blocking still allows indexing of the URL itself). The 2024 spec also covers AI crawlers (GPTBot, ClaudeBot, PerplexityBot).