Package crawlercommons.robots
The robots package contains all of the robots.txt rule inference, parsing, and related utilities provided by Crawler Commons.
Class Summary

BaseRobotRules
    Result from parsing a single robots.txt file - which means we get a set of rules, and a crawl-delay.
BaseRobotsParser
SimpleRobotRules
    Result from parsing a single robots.txt file - which means we get a set of rules, an optional crawl-delay, and an optional sitemap URL.
SimpleRobotRules.RobotRule
    Single rule that maps from a path prefix to an allow flag.
SimpleRobotRulesParser
    This implementation of BaseRobotsParser retrieves a set of rules for an agent with the given name from the robots.txt file of a given domain.

Enum Summary

SimpleRobotRules.RobotRulesMode
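As a sketch of how these classes fit together, the snippet below parses an in-memory robots.txt with SimpleRobotRulesParser and queries the resulting BaseRobotRules. It assumes the crawler-commons artifact is on the classpath; note that the agent-name parameter of parseContent is a String in older releases and a Collection of names in newer ones, so check the version you depend on.

```java
import java.nio.charset.StandardCharsets;

import crawlercommons.robots.BaseRobotRules;
import crawlercommons.robots.SimpleRobotRulesParser;

public class RobotsExample {
    public static void main(String[] args) {
        // A small robots.txt body; in practice this would be fetched over HTTP.
        byte[] content = ("User-agent: mybot\n"
                + "Disallow: /private/\n"
                + "Crawl-delay: 5\n").getBytes(StandardCharsets.UTF_8);

        SimpleRobotRulesParser parser = new SimpleRobotRulesParser();

        // Arguments: robots.txt URL, raw bytes, content type, and the
        // agent name to match against User-agent lines.
        BaseRobotRules rules = parser.parseContent(
                "https://example.com/robots.txt",
                content,
                "text/plain",
                "mybot");

        // Path-prefix rules decide whether a URL may be fetched.
        System.out.println(rules.isAllowed("https://example.com/private/x"));
        System.out.println(rules.isAllowed("https://example.com/public/y"));

        // Crawl-delay from the file, exposed by BaseRobotRules.
        System.out.println(rules.getCrawlDelay());
    }
}
```

Any sitemap URLs declared in the file are available via rules.getSitemaps().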