Package crawlercommons.robots
The robots package contains the robots.txt rule inference, parsing, and related utilities provided by Crawler Commons.
Class Summary

  BaseRobotRules
      Result from parsing a single robots.txt file – a set of allow/disallow rules to check whether a given URL is allowed, and optionally a Crawl-delay and Sitemap URLs.
  BaseRobotsParser
      Robots.txt parser definition.
  SimpleRobotRules
      Result from parsing a single robots.txt file – a set of allow/disallow rules to check whether a given URL is allowed, and optionally a Crawl-delay and Sitemap URLs.
  SimpleRobotRules.RobotRule
      Single rule that maps from a path prefix to an allow flag.
  SimpleRobotRulesParser
      Robots.txt parser following RFC 9309, supporting the Sitemap and Crawl-delay extensions.
Enum Summary

  SimpleRobotRules.RobotRulesMode
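For orientation, a minimal usage sketch of these classes is given below. It assumes the robots.txt content has already been fetched, and uses the parseContent overload that takes a single robot-name string; the URL, robot name, and robots.txt contents are illustrative. Note that Crawler Commons reports the crawl delay in milliseconds.

  import java.nio.charset.StandardCharsets;
  import java.util.List;

  import crawlercommons.robots.BaseRobotRules;
  import crawlercommons.robots.SimpleRobotRulesParser;

  public class RobotsExample {
      public static void main(String[] args) {
          // Raw bytes of a robots.txt file, fetched elsewhere (contents here are illustrative)
          byte[] content = ("User-agent: *\n"
                          + "Disallow: /private/\n"
                          + "Crawl-delay: 5\n"
                          + "Sitemap: https://www.example.com/sitemap.xml\n")
                          .getBytes(StandardCharsets.UTF_8);

          SimpleRobotRulesParser parser = new SimpleRobotRulesParser();
          BaseRobotRules rules = parser.parseContent(
                  "https://www.example.com/robots.txt", // URL the file was fetched from
                  content,
                  "text/plain",                         // content type reported by the server
                  "mycrawler");                         // robot name matched against User-agent lines

          // Query the parsed rules
          boolean allowed = rules.isAllowed("https://www.example.com/private/page.html");
          long crawlDelay = rules.getCrawlDelay();      // in milliseconds, if a Crawl-delay was set
          List<String> sitemaps = rules.getSitemaps();

          System.out.println("allowed: " + allowed
                  + ", delay: " + crawlDelay
                  + ", sitemaps: " + sitemaps);
      }
  }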