Package | Description |
---|---|
crawlercommons.sitemaps | Sitemaps package provides all classes relevant to focused sitemap parsing, URL definition and processing. |
crawlercommons.sitemaps.sax | |
Modifier and Type | Class and Description |
---|---|
class | SiteMap |
class | SiteMapIndex |
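
Because the table above lists two concrete subclasses, code that receives an AbstractSiteMap usually has to check which one it holds. The following is a minimal sketch of that dispatch. The `*StandIn` classes, their fields, and the `describe` helper are hypothetical placeholders so the example compiles on its own; they are not the library's classes, and only the subclass relationship is taken from the table.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

class SitemapDispatchSketch {

    // Hypothetical stand-in for AbstractSiteMap (assumption: it carries a location).
    static abstract class AbstractSiteMapStandIn {
        final URI location;
        AbstractSiteMapStandIn(URI location) { this.location = location; }
    }

    // Hypothetical stand-in for SiteMap: a plain list of page URLs.
    static final class SiteMapStandIn extends AbstractSiteMapStandIn {
        final List<String> urls = new ArrayList<>();
        SiteMapStandIn(URI location) { super(location); }
    }

    // Hypothetical stand-in for SiteMapIndex: a list of nested sitemaps.
    static final class SiteMapIndexStandIn extends AbstractSiteMapStandIn {
        final List<AbstractSiteMapStandIn> children = new ArrayList<>();
        SiteMapIndexStandIn(URI location) { super(location); }
    }

    // Branches on the runtime type of the parsed result.
    static String describe(AbstractSiteMapStandIn parsed) {
        if (parsed instanceof SiteMapIndexStandIn) {
            SiteMapIndexStandIn index = (SiteMapIndexStandIn) parsed;
            return "index with " + index.children.size() + " sitemaps";
        }
        if (parsed instanceof SiteMapStandIn) {
            SiteMapStandIn map = (SiteMapStandIn) parsed;
            return "sitemap with " + map.urls.size() + " URLs";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        SiteMapStandIn map = new SiteMapStandIn(URI.create("https://example.com/sitemap.xml"));
        map.urls.add("https://example.com/a");

        SiteMapIndexStandIn index = new SiteMapIndexStandIn(URI.create("https://example.com/index.xml"));
        index.children.add(map);

        System.out.println(describe(map));
        System.out.println(describe(index));
    }
}
```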
Modifier and Type | Method and Description |
---|---|
AbstractSiteMap | SiteMapIndex.getSitemap(URL url): Returns the Sitemap that has the given URL. |
AbstractSiteMap | SiteMapIndex.nextUnprocessedSitemap() |
AbstractSiteMap | SiteMapParser.parseSiteMap(byte[] content, URL url): Parse a sitemap, given the content bytes and the URL. |
AbstractSiteMap | SiteMapParser.parseSiteMap(String contentType, byte[] content, AbstractSiteMap sitemap): Returns a processed copy of an unprocessed sitemap object. |
AbstractSiteMap | SiteMapParser.parseSiteMap(String contentType, byte[] content, URL url): Parse a sitemap, given the MIME type, the content bytes, and the URL. |
AbstractSiteMap | SiteMapParser.parseSiteMap(URL onlineSitemapUrl): Returns a SiteMap or SiteMapIndex given an online sitemap URL. Please note that this method is a static method which goes online, fetches the sitemap, and then parses it. It is a convenience method for a user who has a sitemap URL and wants a "keep it simple" way to parse it. |
protected AbstractSiteMap | SiteMapParser.processGzippedXML(URL url, byte[] response): Decompress the gzipped content and process the resulting XML Sitemap. |
protected AbstractSiteMap | SiteMapParser.processXml(URL sitemapUrl, byte[] xmlContent): Parse the given XML content. |
protected AbstractSiteMap | SiteMapParser.processXml(URL sitemapUrl, InputSource is): Parse the given XML content. |
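
The processGzippedXML entry above describes two steps: decompress the gzipped bytes, then process the resulting XML. The sketch below illustrates those two steps using only JDK classes (`GZIPInputStream` and a SAX `DefaultHandler`). It is an illustration under stated assumptions, not the library's implementation: the class name `GzippedSitemapSketch` and the helpers `gzip`, `gunzip`, and `extractLocs` are hypothetical, and the handler only collects `<loc>` text rather than building SiteMap objects.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class GzippedSitemapSketch {

    // Step 1: decompress gzipped bytes fully into memory.
    static byte[] gunzip(byte[] gzipped) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Step 2: parse the XML with SAX and collect the text of every <loc> element.
    static List<String> extractLocs(byte[] xml) {
        List<String> locs = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside a <loc> element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("loc".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("loc".equals(qName) && text != null) {
                    locs.add(text.toString().trim());
                    text = null;
                }
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            factory.newSAXParser().parse(new InputSource(new ByteArrayInputStream(xml)), handler);
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Could not parse sitemap XML", e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return locs;
    }

    // Test helper: gzip raw bytes so the sketch has input to decompress.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String xml = "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">"
                + "<url><loc>https://example.com/a</loc></url>"
                + "<url><loc>https://example.com/b</loc></url>"
                + "</urlset>";
        byte[] gzipped = gzip(xml.getBytes(StandardCharsets.UTF_8));
        for (String loc : extractLocs(gunzip(gzipped))) {
            System.out.println(loc);
        }
    }
}
```

Buffering the whole decompressed document in memory keeps the sketch short; a crawler handling untrusted input would likely want to cap the decompressed size.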
Modifier and Type | Method and Description |
---|---|
Collection<AbstractSiteMap> | SiteMapIndex.getSitemaps() |
Collection<AbstractSiteMap> | SiteMapIndex.getSitemaps(boolean deduplicate) |
Modifier and Type | Method and Description |
---|---|
void | SiteMapIndex.addSitemap(AbstractSiteMap sitemap): Add this Sitemap to the list of Sitemaps. |
AbstractSiteMap | SiteMapParser.parseSiteMap(String contentType, byte[] content, AbstractSiteMap sitemap): Returns a processed copy of an unprocessed sitemap object. |
void | SiteMapParser.walkSiteMap(AbstractSiteMap sitemap, java.util.function.Consumer<SiteMapURL> action): Traverse a sitemap, recursively fetching and traversing the content of any enclosed sitemap index, and performing the specified action for each sitemap URL until all URLs have been processed or the action throws an exception. |
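
The walkSiteMap entry above describes a recursive traversal that applies a `Consumer` to every sitemap URL. The sketch below shows that traversal pattern on in-memory data only. The types `Node`, `UrlSet`, and `Index` and the `walk` method are hypothetical stand-ins, URLs are plain strings instead of SiteMapURL, and nothing is fetched over the network, so this should be read as an illustration of the control flow rather than the library's behaviour.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class SitemapWalkSketch {

    // Hypothetical stand-in for AbstractSiteMap.
    static abstract class Node { }

    // Hypothetical stand-in for SiteMap: holds page URLs as strings.
    static final class UrlSet extends Node {
        final List<String> urls = new ArrayList<>();
    }

    // Hypothetical stand-in for SiteMapIndex: holds nested sitemaps.
    static final class Index extends Node {
        final List<Node> children = new ArrayList<>();
    }

    // Depth-first traversal: descend into indexes, apply the action to each URL.
    // An unchecked exception thrown by the action propagates and stops the walk.
    static void walk(Node node, Consumer<String> action) {
        if (node instanceof Index) {
            for (Node child : ((Index) node).children) {
                walk(child, action);
            }
        } else if (node instanceof UrlSet) {
            for (String url : ((UrlSet) node).urls) {
                action.accept(url);
            }
        }
    }

    public static void main(String[] args) {
        UrlSet first = new UrlSet();
        first.urls.add("https://example.com/a");
        UrlSet second = new UrlSet();
        second.urls.add("https://example.com/b");

        Index root = new Index();
        root.children.add(first);
        root.children.add(second);

        walk(root, System.out::println);
    }
}
```

Passing the action as a `Consumer` lets the caller decide what to do with each URL (collect, filter, enqueue for crawling) without the traversal code knowing about it.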
Modifier and Type | Method and Description |
---|---|
AbstractSiteMap | DelegatorHandler.getSiteMap() |
Copyright © 2009–2021 Crawler-Commons. All rights reserved.