Package crawlercommons.robots
Class BaseRobotRules

java.lang.Object
    crawlercommons.robots.BaseRobotRules

All Implemented Interfaces:
    Serializable
Direct Known Subclasses:
    SimpleRobotRules

public abstract class BaseRobotRules extends Object implements Serializable

Result from parsing a single robots.txt file - which means we get a set of rules, and a crawl-delay.

See Also:
    Serialized Form
Field Summary

    static long UNSET_CRAWL_DELAY

Constructor Summary

    BaseRobotRules()
Method Summary

    void addSitemap(String sitemap)
        Add sitemap URL to rules if not a duplicate.
    boolean equals(Object obj)
    long getCrawlDelay()
    List<String> getSitemaps()
        Get URLs of sitemap links found in robots.txt.
    int hashCode()
    abstract boolean isAllowAll()
    abstract boolean isAllowed(String url)
    abstract boolean isAllowNone()
    boolean isDeferVisits()
    void setCrawlDelay(long crawlDelay)
    void setDeferVisits(boolean deferVisits)
    String toString()
        Returns a string with the crawl delay as well as a list of sitemaps if they exist (and aren't more than 10).
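Taken together, the methods above describe a small rules object. The following is a self-contained sketch in plain Java of how such an object might behave. It is not the crawler-commons implementation: the class name MiniRobotRules and its simplified prefix-only matching are invented for illustration, and the real SimpleRobotRules also handles Allow rules, wildcards, and rule precedence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in mirroring the BaseRobotRules API surface.
class MiniRobotRules {
    // Sentinel meaning "no Crawl-delay was set" (the concrete value of the
    // real BaseRobotRules.UNSET_CRAWL_DELAY is an assumption here).
    static final long UNSET_CRAWL_DELAY = Long.MIN_VALUE;

    private final List<String> disallowPrefixes = new ArrayList<>();
    private long crawlDelay = UNSET_CRAWL_DELAY;
    private boolean allowAll;   // e.g. empty or missing robots.txt
    private boolean allowNone;  // e.g. "Disallow: /" for our agent

    boolean isAllowAll()  { return allowAll; }
    boolean isAllowNone() { return allowNone; }

    // Simplified path-prefix check only.
    boolean isAllowed(String path) {
        if (allowNone) return false;
        if (allowAll)  return true;
        for (String prefix : disallowPrefixes) {
            if (path.startsWith(prefix)) return false;
        }
        return true;
    }

    void disallow(String prefix)     { disallowPrefixes.add(prefix); }
    long getCrawlDelay()             { return crawlDelay; }
    void setCrawlDelay(long delay)   { crawlDelay = delay; }

    public static void main(String[] args) {
        MiniRobotRules rules = new MiniRobotRules();
        rules.disallow("/private/");
        System.out.println(rules.isAllowed("/index.html"));    // true
        System.out.println(rules.isAllowed("/private/a.txt")); // false
    }
}
```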
Field Detail

UNSET_CRAWL_DELAY

public static final long UNSET_CRAWL_DELAY

See Also:
    Constant Field Values
Method Detail

isAllowed

public abstract boolean isAllowed(String url)

isAllowAll

public abstract boolean isAllowAll()

isAllowNone

public abstract boolean isAllowNone()

getCrawlDelay

public long getCrawlDelay()

setCrawlDelay
public void setCrawlDelay(long crawlDelay)
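A common caller-side pattern is to treat the UNSET_CRAWL_DELAY sentinel as "fall back to the crawler's own default delay". This is a sketch only: the helper name, the default value, and the assumption that the delay is stored in milliseconds (and that the sentinel is Long.MIN_VALUE) are not taken from the source.

```java
class CrawlDelayPolicy {
    // Mirrors BaseRobotRules.UNSET_CRAWL_DELAY (assumed Long.MIN_VALUE here).
    static final long UNSET_CRAWL_DELAY = Long.MIN_VALUE;

    // Use the crawler's default when robots.txt set no Crawl-delay.
    static long effectiveDelayMillis(long crawlDelay, long defaultMillis) {
        return crawlDelay == UNSET_CRAWL_DELAY ? defaultMillis : crawlDelay;
    }

    public static void main(String[] args) {
        System.out.println(effectiveDelayMillis(UNSET_CRAWL_DELAY, 1000L)); // 1000
        System.out.println(effectiveDelayMillis(5000L, 1000L));             // 5000
    }
}
```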
isDeferVisits

public boolean isDeferVisits()

setDeferVisits

public void setDeferVisits(boolean deferVisits)

addSitemap

public void addSitemap(String sitemap)
Add sitemap URL to rules if not a duplicate
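The documented duplicate check can be sketched as follows. The helper class is hypothetical, and the exact comparison the real method uses (e.g. whether it is case-sensitive) is an assumption here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper mirroring addSitemap's documented behavior:
// add the URL only if it is not already in the list.
class SitemapCollector {
    private final List<String> sitemaps = new ArrayList<>();

    void addSitemap(String sitemap) {
        if (!sitemaps.contains(sitemap)) {
            sitemaps.add(sitemap);
        }
    }

    List<String> getSitemaps() { return sitemaps; }

    public static void main(String[] args) {
        SitemapCollector c = new SitemapCollector();
        c.addSitemap("https://example.com/sitemap.xml");
        c.addSitemap("https://example.com/sitemap.xml"); // duplicate, ignored
        System.out.println(c.getSitemaps().size()); // 1
    }
}
```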