Package | Description |
---|---|
crawlercommons.robots | The robots package contains all of the robots.txt rule inference, parsing and utilities contained within Crawler Commons. |
Modifier and Type | Class and Description |
---|---|
class | SimpleRobotRules: the result of parsing a single robots.txt file, i.e. a set of rules plus a crawl-delay (see the usage sketch below the table). |
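A minimal sketch, not part of the Crawler-Commons distribution, of how a parsed rules object is typically queried. It assumes a BaseRobotRules instance (for example a SimpleRobotRules produced by one of the parse methods listed below); the helper name `report`, the page URL and the printing are illustrative only.

```java
import crawlercommons.robots.BaseRobotRules;

public class RobotRulesCheck {

    // Illustrative helper (not part of Crawler-Commons): query a parsed
    // rules object for a candidate page URL.
    static void report(BaseRobotRules rules, String pageUrl) {
        if (rules.isAllowed(pageUrl)) {
            long delay = rules.getCrawlDelay();
            if (delay == BaseRobotRules.UNSET_CRAWL_DELAY) {
                System.out.println(pageUrl + " is allowed, no crawl-delay set");
            } else {
                System.out.println(pageUrl + " is allowed, crawl-delay = " + delay);
            }
        } else {
            System.out.println(pageUrl + " is disallowed by robots.txt");
        }
    }
}
```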
Modifier and Type | Method and Description |
---|---|
abstract BaseRobotRules | BaseRobotsParser.failedFetch(int httpStatusCode): the fetch of robots.txt failed, so return rules appropriate to the given HTTP status code. |
BaseRobotRules | SimpleRobotRulesParser.failedFetch(int httpStatusCode) |
static BaseRobotRules | RobotUtils.getRobotRules(BaseHttpFetcher fetcher, BaseRobotsParser parser, URL robotsUrl): externally visible, static method for use in tools and for testing. |
abstract BaseRobotRules | BaseRobotsParser.parseContent(String url, byte[] content, String contentType, String robotNames): parse the robots.txt file in content, and return rules appropriate for processing paths by userAgent (see the fetch-and-parse sketch below the table). |
BaseRobotRules | SimpleRobotRulesParser.parseContent(String url, byte[] content, String contentType, String robotNames) |
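A hedged sketch of how parseContent and failedFetch fit together, using a plain java.net.HttpURLConnection fetch rather than a Crawler-Commons fetcher. The helper name `fetchAndParse`, the connection handling, the buffering loop and the "text/plain" fallback are assumptions for illustration; only the SimpleRobotRulesParser calls come from the table above.

```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

import crawlercommons.robots.BaseRobotRules;
import crawlercommons.robots.SimpleRobotRulesParser;

public class RobotsFetchExample {

    // Illustrative helper (not part of Crawler-Commons): fetch robots.txt with
    // plain java.net and hand the result to SimpleRobotRulesParser.
    static BaseRobotRules fetchAndParse(String robotsTxtUrl, String robotName) throws Exception {
        SimpleRobotRulesParser parser = new SimpleRobotRulesParser();

        HttpURLConnection conn = (HttpURLConnection) new URL(robotsTxtUrl).openConnection();
        conn.setRequestProperty("User-Agent", robotName);
        int status = conn.getResponseCode();

        if (status != HttpURLConnection.HTTP_OK) {
            // No usable robots.txt content: let the parser map the HTTP status
            // code to an appropriate default rule set.
            return parser.failedFetch(status);
        }

        // Read the response body into a byte[] for parseContent().
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (InputStream in = conn.getInputStream()) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        }

        String contentType = conn.getContentType() != null ? conn.getContentType() : "text/plain";

        // robotNames is matched against the User-agent lines in the file.
        return parser.parseContent(robotsTxtUrl, buffer.toByteArray(), contentType, robotName);
    }
}
```

As the table indicates, RobotUtils.getRobotRules(fetcher, parser, robotsUrl) wraps a comparable fetch-then-parse sequence around a BaseHttpFetcher, which is the more convenient entry point for tools and tests.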
Copyright © 2009–2016 Crawler-Commons. All rights reserved.