Crawler-commons 0.7 API

Packages
crawlercommons  
crawlercommons.domains
Classes in the domains package relate to the definition of top-level domains (TLDs), various domain registrars, and the effective handling of such domains (example below).
crawlercommons.fetcher
The main fetching package within Crawler Commons: it defines the base fetching and encoding classes, enums that capture the reasons behind typical fetching behaviour, and the base exceptions which may be thrown.
crawlercommons.fetcher.file
This package contains the SimpleFileFetcher class, which extends BaseFetcher.
crawlercommons.fetcher.http
This package concerns fetching files over the HTTP protocol: extending BaseHttpFetcher (which itself extends BaseFetcher), SimpleHttpFetcher provides the Crawler Commons HTTP fetching implementation (example below).
crawlercommons.filters
The filters package contains code and resources for URL filtering (example below).
crawlercommons.filters.basic  
crawlercommons.robots
The robots package contains all of the robots.txt rule inference, parsing, and utility code within Crawler Commons (example below).
crawlercommons.sitemaps
The sitemaps package provides the classes for sitemap parsing, URL definition, and processing (example below).
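
Usage examples

A minimal sketch of paid-level-domain extraction with the domains package, assuming PaidLevelDomain exposes a static getPLD(String hostname) helper; the hostname below is a hypothetical example:

    import crawlercommons.domains.PaidLevelDomain;

    public class DomainExample {
        public static void main(String[] args) {
            // Reduce a full hostname to its paid-level (registered) domain,
            // honouring multi-part public suffixes such as .co.uk.
            String pld = PaidLevelDomain.getPLD("news.example.co.uk");
            System.out.println(pld); // expected: example.co.uk
        }
    }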
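
A sketch of a single-page fetch with SimpleHttpFetcher, assuming the UserAgent(name, email, webAddress) constructor and the convenience get(String url) method; the agent details and URLs are hypothetical placeholders:

    import crawlercommons.fetcher.FetchedResult;
    import crawlercommons.fetcher.http.SimpleHttpFetcher;
    import crawlercommons.fetcher.http.UserAgent;

    public class FetchExample {
        public static void main(String[] args) throws Exception {
            // Identify the crawler politely; sites use this to contact the operator.
            UserAgent ua = new UserAgent("mycrawler", "crawler@example.com",
                    "http://example.com/crawler");
            SimpleHttpFetcher fetcher = new SimpleHttpFetcher(ua);
            FetchedResult result = fetcher.get("http://example.com/");
            System.out.println(result.getContentType() + ", "
                    + result.getContent().length + " bytes");
        }
    }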
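
A sketch of URL normalization with BasicURLNormalizer from the filters.basic package; filter(String) is assumed to return the normalized URL, or null if the URL should be discarded:

    import crawlercommons.filters.basic.BasicURLNormalizer;

    public class NormalizeExample {
        public static void main(String[] args) {
            BasicURLNormalizer normalizer = new BasicURLNormalizer();
            // Lower-cases the scheme and host, drops the default port,
            // and resolves relative path segments.
            String normalized = normalizer.filter("HTTP://Example.COM:80/a/../b.html");
            System.out.println(normalized); // e.g. http://example.com/b.html
        }
    }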
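
A sketch of robots.txt parsing with SimpleRobotRulesParser, using its parseContent(url, content, contentType, robotNames) method; the file is fetched here with plain java.net, and the URLs and agent name are hypothetical:

    import java.io.InputStream;
    import java.net.URL;

    import crawlercommons.robots.BaseRobotRules;
    import crawlercommons.robots.SimpleRobotRulesParser;

    public class RobotsExample {
        public static void main(String[] args) throws Exception {
            URL robotsUrl = new URL("http://example.com/robots.txt");
            byte[] content;
            try (InputStream in = robotsUrl.openStream()) {
                content = in.readAllBytes(); // Java 9+
            }
            SimpleRobotRulesParser parser = new SimpleRobotRulesParser();
            BaseRobotRules rules = parser.parseContent(robotsUrl.toString(),
                    content, "text/plain", "mycrawler");
            // Check a candidate URL against the parsed rules.
            System.out.println(rules.isAllowed("http://example.com/private/page.html"));
        }
    }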
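
A sketch of sitemap parsing with SiteMapParser, assuming the parseSiteMap(byte[] content, URL url) overload; the inline XML is a hypothetical minimal sitemap:

    import java.net.URL;

    import crawlercommons.sitemaps.AbstractSiteMap;
    import crawlercommons.sitemaps.SiteMap;
    import crawlercommons.sitemaps.SiteMapParser;
    import crawlercommons.sitemaps.SiteMapURL;

    public class SitemapExample {
        public static void main(String[] args) throws Exception {
            byte[] content = ("<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                    + "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">"
                    + "<url><loc>http://example.com/page.html</loc></url>"
                    + "</urlset>").getBytes("UTF-8");
            SiteMapParser parser = new SiteMapParser();
            AbstractSiteMap sm = parser.parseSiteMap(content,
                    new URL("http://example.com/sitemap.xml"));
            // A sitemap index would come back as a SiteMapIndex instead.
            if (sm instanceof SiteMap) {
                for (SiteMapURL u : ((SiteMap) sm).getSiteMapUrls()) {
                    System.out.println(u.getUrl());
                }
            }
        }
    }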
Copyright © 2009–2016 Crawler-Commons. All rights reserved.