- GALLERY_LOC - Static variable in class crawlercommons.sitemaps.extension.VideoAttributes
-
- GALLERY_TITLE - Static variable in class crawlercommons.sitemaps.extension.VideoAttributes
-
- GENRES - Static variable in class crawlercommons.sitemaps.extension.NewsAttributes
-
- GEO_LOCATION - Static variable in class crawlercommons.sitemaps.extension.ImageAttributes
-
- get(String) - Method in class crawlercommons.domains.SuffixTrie
-
Get value associated with suffix string in trie.
- getAllowedCountries() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getAllowedPlatforms() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getAndResetCharacterBuffer() - Method in class crawlercommons.sitemaps.sax.DelegatorHandler
-
- getAssignedDomain(String) - Static method in class crawlercommons.domains.EffectiveTldFinder
-
This method uses the effective TLD to determine which component of a FQDN
is the NIC-assigned domain name (aka "Paid Level Domain").
- getAssignedDomain(String, boolean) - Static method in class crawlercommons.domains.EffectiveTldFinder
-
This method uses the effective TLD to determine which component of a FQDN
is the NIC-assigned domain name (aka "Paid Level Domain").
- getAssignedDomain(String, boolean, boolean) - Static method in class crawlercommons.domains.EffectiveTldFinder
-
This method uses the effective TLD to determine which component of a FQDN
is the NIC-assigned domain name.
- getAttributes() - Method in class crawlercommons.sitemaps.sax.extension.ExtensionHandler
-
- getAttributes() - Method in class crawlercommons.sitemaps.sax.extension.ImageHandler
-
- getAttributes() - Method in class crawlercommons.sitemaps.sax.extension.MobileHandler
-
- getAttributes() - Method in class crawlercommons.sitemaps.sax.extension.NewsHandler
-
- getAttributes() - Method in class crawlercommons.sitemaps.sax.extension.VideoHandler
-
- getAttributes() - Method in class crawlercommons.sitemaps.SiteMapURL
-
Get attributes of sitemap extensions (news, images, videos, etc.)
- getAttributesForExtension(Extension) - Method in class crawlercommons.sitemaps.SiteMapURL
-
Get attributes of a specific sitemap extension
- getBaseUrl() - Method in class crawlercommons.sitemaps.SiteMap
-
- getCaption() - Method in class crawlercommons.sitemaps.extension.ImageAttributes
-
- getCategory() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getChangeFrequency() - Method in class crawlercommons.sitemaps.SiteMapURL
-
Return the URL's change frequency
- getChild(char) - Method in class crawlercommons.domains.SuffixTrie.Node
-
- getContentLoc() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getCrawlDelay() - Method in class crawlercommons.robots.BaseRobotRules
-
- getCurrency() - Method in class crawlercommons.sitemaps.extension.VideoAttributes.VideoPrice
-
- getDateValue(String) - Static method in class crawlercommons.sitemaps.sax.extension.ExtensionHandler
-
- getDescription() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getDomain() - Method in class crawlercommons.domains.EffectiveTldFinder.EffectiveTLD
-
- getDuration() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getEffectiveTLD(String) - Static method in class crawlercommons.domains.EffectiveTldFinder
-
Get EffectiveTLD for host name using the singleton instance of
EffectiveTldFinder.
- getEffectiveTLD(String, boolean) - Static method in class crawlercommons.domains.EffectiveTldFinder
-
Get EffectiveTLD for host name using the singleton instance of
EffectiveTldFinder.
- getEffectiveTLDs() - Static method in class crawlercommons.domains.EffectiveTldFinder
-
- getException() - Method in class crawlercommons.sitemaps.sax.DelegatorHandler
-
- getExpirationDate() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getExpirationDateTime() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getFamilyFriendly() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getFloatValue(String) - Static method in class crawlercommons.sitemaps.sax.extension.ExtensionHandler
-
- getGalleryLoc() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getGalleryTitle() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getGenres() - Method in class crawlercommons.sitemaps.extension.NewsAttributes
-
- getGeoLocation() - Method in class crawlercommons.sitemaps.extension.ImageAttributes
-
- getHref() - Method in class crawlercommons.sitemaps.extension.LinkAttributes
-
- getInstance() - Static method in class crawlercommons.domains.EffectiveTldFinder
-
Get singleton instance of EffectiveTldFinder with default configuration.
- getIntegerValue(String) - Static method in class crawlercommons.sitemaps.sax.extension.ExtensionHandler
-
- getKeywords() - Method in class crawlercommons.sitemaps.extension.NewsAttributes
-
- getLanguage() - Method in class crawlercommons.sitemaps.extension.NewsAttributes
-
- getLastModified() - Method in class crawlercommons.sitemaps.AbstractSiteMap
-
- getLastModified() - Method in class crawlercommons.sitemaps.SiteMapURL
-
Return when this URL was last modified.
- getLicense() - Method in class crawlercommons.sitemaps.extension.ImageAttributes
-
- getLive() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getLoc() - Method in class crawlercommons.sitemaps.extension.ImageAttributes
-
- getLongestSuffix(String) - Method in class crawlercommons.domains.SuffixTrie
-
Match the longest suffix of a string contained in trie.
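As a rough illustration of longest-suffix matching, here is a minimal sketch in plain Java. It is not the crawler-commons `SuffixTrie` implementation: the class `SuffixTrieSketch`, its method names, and the `String` return type are assumptions made for this example. The idea shown is that suffixes are inserted back to front, so a lookup walks the input from its last character and remembers the deepest terminal node it passes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the library's SuffixTrie.
final class SuffixTrieSketch<V> {

    private static final class Node<V> {
        final Map<Character, Node<V>> children = new HashMap<>();
        boolean terminal; // true if a stored suffix ends at this node
        V value;
    }

    private final Node<V> root = new Node<>();

    // Insert the suffix character by character, starting from its end.
    void put(String suffix, V value) {
        Node<V> node = root;
        for (int i = suffix.length() - 1; i >= 0; i--) {
            node = node.children.computeIfAbsent(suffix.charAt(i), c -> new Node<>());
        }
        node.terminal = true;
        node.value = value;
    }

    // Return the longest stored suffix of s, or null if none matches.
    String longestSuffix(String s) {
        Node<V> node = root;
        String best = null;
        for (int i = s.length() - 1; i >= 0; i--) {
            node = node.children.get(s.charAt(i));
            if (node == null) {
                break; // no stored suffix continues past this character
            }
            if (node.terminal) {
                best = s.substring(i); // deeper terminal nodes mean longer suffixes
            }
        }
        return best;
    }
}
```

With `.uk` and `.co.uk` stored, a lookup for `www.example.co.uk` in this sketch yields `.co.uk`, because the walk continues past the shorter match.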
- getMaxCrawlDelay() - Method in class crawlercommons.robots.SimpleRobotRulesParser
-
Get configured max crawl delay.
- getMaxWarnings() - Method in class crawlercommons.robots.SimpleRobotRulesParser
-
Get max number of logged warnings per robots.txt
- getName() - Method in class crawlercommons.sitemaps.extension.NewsAttributes
-
- getNameVariants() - Method in class crawlercommons.domains.EffectiveTldFinder.EffectiveTLD
-
Generate name variants caused by Internationalized Domain Names:
every IDN part of an eTLD can be replaced by its punycoded ASCII
variant.
- getNumWarnings() - Method in class crawlercommons.robots.SimpleRobotRulesParser
-
- getParams() - Method in class crawlercommons.sitemaps.extension.LinkAttributes
-
- getPlayerLoc() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getPLD(String) - Static method in class crawlercommons.domains.PaidLevelDomain
-
Extract the PLD (paid-level domain) from the hostname.
- getPLD(URL) - Static method in class crawlercommons.domains.PaidLevelDomain
-
Extract the PLD (paid-level domain) from the URL.
- getPrefix() - Method in class crawlercommons.robots.SimpleRobotRules.RobotRule
-
- getPrice() - Method in class crawlercommons.sitemaps.extension.VideoAttributes.VideoPrice
-
- getPrices() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getPriority() - Method in class crawlercommons.sitemaps.SiteMapURL
-
Return this URL's priority (a value between [0.0 - 1.0]).
- getPublicationDate() - Method in class crawlercommons.sitemaps.extension.NewsAttributes
-
- getPublicationDate() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getPublicationDateTime() - Method in class crawlercommons.sitemaps.extension.NewsAttributes
-
- getPublicationDateTime() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getRating() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getRequiresSubscription() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getResolution() - Method in class crawlercommons.sitemaps.extension.VideoAttributes.VideoPrice
-
- getRestrictedCountries() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getRestrictedPlatforms() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getRobotRules() - Method in class crawlercommons.robots.SimpleRobotRules
-
- getSiteMap() - Method in class crawlercommons.sitemaps.sax.DelegatorHandler
-
- getSitemap(URL) - Method in class crawlercommons.sitemaps.SiteMapIndex
-
Returns the Sitemap that has the given URL.
- getSitemaps() - Method in class crawlercommons.robots.BaseRobotRules
-
Get URLs of sitemap links found in robots.txt
- getSitemaps() - Method in class crawlercommons.sitemaps.SiteMapIndex
-
- getSitemaps(boolean) - Method in class crawlercommons.sitemaps.SiteMapIndex
-
- getSiteMapUrls() - Method in class crawlercommons.sitemaps.SiteMap
-
- getStockTickers() - Method in class crawlercommons.sitemaps.extension.NewsAttributes
-
- getSuffixes(String) - Method in class crawlercommons.domains.SuffixTrie
-
Match all suffixes of a string contained in trie.
- getTags() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getThumbnailLoc() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getTitle() - Method in class crawlercommons.sitemaps.extension.ImageAttributes
-
- getTitle() - Method in class crawlercommons.sitemaps.extension.NewsAttributes
-
- getTitle() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getType() - Method in class crawlercommons.sitemaps.AbstractSiteMap
-
- getType() - Method in class crawlercommons.sitemaps.extension.VideoAttributes.VideoPrice
-
- getUploader() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getUploaderInfo() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getUrl() - Method in class crawlercommons.sitemaps.AbstractSiteMap
-
- getUrl() - Method in class crawlercommons.sitemaps.sax.DelegatorHandler
-
- getUrl() - Method in class crawlercommons.sitemaps.SiteMapURL
-
Return the URL.
- getURLValue(String) - Static method in class crawlercommons.sitemaps.sax.extension.ExtensionHandler
-
- getVersion() - Static method in class crawlercommons.CrawlerCommons
-
- getViewCount() - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- getYesNoBooleanValue(String, String) - Static method in class crawlercommons.sitemaps.sax.extension.ExtensionHandler
-
- PaidLevelDomain - Class in crawlercommons.domains
-
Routines to extract the PLD (paid-level domain, as per the IRLbot paper) from
a hostname or URL.
- PaidLevelDomain() - Constructor for class crawlercommons.domains.PaidLevelDomain
-
- parseContent(String, byte[], String, String) - Method in class crawlercommons.robots.BaseRobotsParser
-
Parse the robots.txt file in content, and return rules appropriate
for processing paths by userAgent.
- parseContent(String, byte[], String, String) - Method in class crawlercommons.robots.SimpleRobotRulesParser
-
- parseQueryParameters(String, int, Set<String>) - Static method in class crawlercommons.filters.basic.BasicURLNormalizer
-
Receives the URL query string and parses it into a list of name-value pairs.
- parseRSSTimestamp(String) - Static method in class crawlercommons.sitemaps.AbstractSiteMap
-
Parse pubDate of RSS feeds.
- parseSiteMap(URL) - Method in class crawlercommons.sitemaps.SiteMapParser
-
Returns a SiteMap or SiteMapIndex given an online sitemap URL.
Note that this method goes online, fetches the sitemap, and then
parses it.
It is a convenience method for a user who has a sitemap URL and
wants a "keep it simple" way to parse it.
- parseSiteMap(String, byte[], AbstractSiteMap) - Method in class crawlercommons.sitemaps.SiteMapParser
-
Returns a processed copy of an unprocessed sitemap object, i.e.
- parseSiteMap(byte[], URL) - Method in class crawlercommons.sitemaps.SiteMapParser
-
Parse a sitemap, given the content bytes and the URL.
- parseSiteMap(String, byte[], URL) - Method in class crawlercommons.sitemaps.SiteMapParser
-
Parse a sitemap, given the MIME type, the content bytes, and the URL.
- PLAYER_LOC - Static variable in class crawlercommons.sitemaps.extension.VideoAttributes
-
- PRICES - Static variable in class crawlercommons.sitemaps.extension.VideoAttributes
-
- processGzippedXML(URL, byte[]) - Method in class crawlercommons.sitemaps.SiteMapParser
-
Decompress the gzipped content and process the resulting XML Sitemap.
- processText(URL, byte[]) - Method in class crawlercommons.sitemaps.SiteMapParser
-
Process a text-based Sitemap.
- processText(URL, InputStream) - Method in class crawlercommons.sitemaps.SiteMapParser
-
Process a text-based Sitemap.
- processXml(URL, byte[]) - Method in class crawlercommons.sitemaps.SiteMapParser
-
Parse the given XML content.
- processXml(URL, InputSource) - Method in class crawlercommons.sitemaps.SiteMapParser
-
Parse the given XML content.
- PUBLICATION_DATE - Static variable in class crawlercommons.sitemaps.extension.NewsAttributes
-
- PUBLICATION_DATE - Static variable in class crawlercommons.sitemaps.extension.VideoAttributes
-
- put(String, V) - Method in class crawlercommons.domains.SuffixTrie
-
Insert a string and an associated value into the trie.
- setAcceptedNamespaces(Set<String>) - Method in class crawlercommons.sitemaps.sax.DelegatorHandler
-
- setAllowedCountries(String[]) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setAllowedPlatforms(String[]) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setCaption(String) - Method in class crawlercommons.sitemaps.extension.ImageAttributes
-
- setCategory(String) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setChangeFrequency(SiteMapURL.ChangeFrequency) - Method in class crawlercommons.sitemaps.SiteMapURL
-
Set the URL's change frequency
- setChangeFrequency(String) - Method in class crawlercommons.sitemaps.SiteMapURL
-
Set the URL's change frequency. In case of a bad ChangeFrequency, the
current frequency in this instance will be set to NULL.
- setContentLoc(URL) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setCrawlDelay(long) - Method in class crawlercommons.robots.BaseRobotRules
-
- setDeferVisits(boolean) - Method in class crawlercommons.robots.BaseRobotRules
-
- setDescription(String) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setDuration(Integer) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setException(UnknownFormatException) - Method in class crawlercommons.sitemaps.sax.DelegatorHandler
-
- setExpirationDate(ZonedDateTime) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setExtensionNamespaces(Map<String, Extension>) - Method in class crawlercommons.sitemaps.sax.DelegatorHandler
-
- setFamilyFriendly(Boolean) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setGalleryLoc(URL) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setGalleryTitle(String) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setGenres(NewsAttributes.NewsGenre[]) - Method in class crawlercommons.sitemaps.extension.NewsAttributes
-
- setGeoLocation(String) - Method in class crawlercommons.sitemaps.extension.ImageAttributes
-
- setHref(URL) - Method in class crawlercommons.sitemaps.extension.LinkAttributes
-
- setKeywords(String[]) - Method in class crawlercommons.sitemaps.extension.NewsAttributes
-
- setLanguage(String) - Method in class crawlercommons.sitemaps.extension.NewsAttributes
-
- setLastModified(Date) - Method in class crawlercommons.sitemaps.AbstractSiteMap
-
- setLastModified(ZonedDateTime) - Method in class crawlercommons.sitemaps.AbstractSiteMap
-
- setLastModified(String) - Method in class crawlercommons.sitemaps.AbstractSiteMap
-
- setLastModified(String) - Method in class crawlercommons.sitemaps.SiteMapURL
-
Set when this URL was last modified.
- setLastModified(Date) - Method in class crawlercommons.sitemaps.SiteMapURL
-
Set when this URL was last modified.
- setLastModified(ZonedDateTime) - Method in class crawlercommons.sitemaps.SiteMapURL
-
Set when this URL was last modified.
- setLicense(URL) - Method in class crawlercommons.sitemaps.extension.ImageAttributes
-
- setLive(Boolean) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setLoc(URL) - Method in class crawlercommons.sitemaps.extension.ImageAttributes
-
- setMaxCrawlDelay(long) - Method in class crawlercommons.robots.SimpleRobotRulesParser
-
Set the max value in milliseconds accepted for the Crawl-Delay directive.
- setMaxWarnings(int) - Method in class crawlercommons.robots.SimpleRobotRulesParser
-
Set the max number of warnings about parse errors logged per robots.txt
- setName(String) - Method in class crawlercommons.sitemaps.extension.NewsAttributes
-
- setParams(Map<String, String>) - Method in class crawlercommons.sitemaps.extension.LinkAttributes
-
- setPlayerLoc(URL) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setPrice(Float) - Method in class crawlercommons.sitemaps.extension.VideoAttributes.VideoPrice
-
- setPrices(VideoAttributes.VideoPrice[]) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setPriority(double) - Method in class crawlercommons.sitemaps.SiteMapURL
-
Set the URL's priority to a value between [0.0 - 1.0] (Default Priority
is used if the given priority is out of range).
- setPriority(String) - Method in class crawlercommons.sitemaps.SiteMapURL
-
Set the URL's priority to a value between [0.0 - 1.0] (Default Priority
is used if the given priority is missing or out of range).
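The fallback behaviour described for `setPriority` can be sketched as follows. This is an illustration only, not the `SiteMapURL` code: `PrioritySketch` and `parsePriority` are hypothetical names, and the default of 0.5 is assumed here because that is the default priority given in the sitemap protocol.

```java
// Hypothetical helper; not the crawler-commons implementation.
final class PrioritySketch {

    // Assumption: 0.5, the default priority defined by the sitemap protocol.
    static final double DEFAULT_PRIORITY = 0.5;

    // Parse a priority string, falling back to the default when the value
    // is missing, not a number, or outside the range [0.0, 1.0].
    static double parsePriority(String value) {
        if (value == null || value.trim().isEmpty()) {
            return DEFAULT_PRIORITY;
        }
        try {
            double p = Double.parseDouble(value.trim());
            // NaN fails both comparisons, so it also maps to the default.
            return (p >= 0.0 && p <= 1.0) ? p : DEFAULT_PRIORITY;
        } catch (NumberFormatException e) {
            return DEFAULT_PRIORITY;
        }
    }
}
```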
- setProcessed(boolean) - Method in class crawlercommons.sitemaps.AbstractSiteMap
-
- setPublicationDate(ZonedDateTime) - Method in class crawlercommons.sitemaps.extension.NewsAttributes
-
- setPublicationDate(ZonedDateTime) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setRating(Float) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setRequiresSubscription(Boolean) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setRestrictedCountries(String[]) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setRestrictedPlatforms(String[]) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setStockTickers(String[]) - Method in class crawlercommons.sitemaps.extension.NewsAttributes
-
- setStrictNamespace(boolean) - Method in class crawlercommons.sitemaps.sax.DelegatorHandler
-
- setStrictNamespace(boolean) - Method in class crawlercommons.sitemaps.SiteMapParser
-
- setTags(String[]) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setThumbnailLoc(URL) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setTitle(String) - Method in class crawlercommons.sitemaps.extension.ImageAttributes
-
- setTitle(String) - Method in class crawlercommons.sitemaps.extension.NewsAttributes
-
- setTitle(String) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setType(AbstractSiteMap.SitemapType) - Method in class crawlercommons.sitemaps.AbstractSiteMap
-
- setUploader(String) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setUploaderInfo(URL) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- setUrl(URL) - Method in class crawlercommons.sitemaps.SiteMapURL
-
Set the URL.
- setUrl(String) - Method in class crawlercommons.sitemaps.SiteMapURL
-
Set the URL.
- setURLFilter(Function<String, String>) - Method in class crawlercommons.sitemaps.sax.DelegatorHandler
-
- setURLFilter(Function<String, String>) - Method in class crawlercommons.sitemaps.SiteMapParser
-
Set URL filter function to normalize URLs found in sitemaps or filter
URLs away if the function returns null.
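To illustrate the contract of such a filter function (return a possibly rewritten URL, or null to drop it), here is a self-contained sketch that does not use the library. `UrlFilterSketch`, `applyFilter`, and `stripFragmentHttpOnly` are hypothetical names chosen for this example; only the `Function<String, String>` shape and the null-means-drop convention come from the entry above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of a normalize-or-drop URL filter.
final class UrlFilterSketch {

    // Apply the filter to every URL; a null result removes the URL.
    static List<String> applyFilter(List<String> urls, Function<String, String> filter) {
        List<String> kept = new ArrayList<>();
        for (String url : urls) {
            String result = filter.apply(url);
            if (result != null) {
                kept.add(result);
            }
        }
        return kept;
    }

    // Example filter (an assumption for illustration): keep only http(s)
    // URLs and strip any fragment part.
    static String stripFragmentHttpOnly(String url) {
        if (!url.startsWith("http://") && !url.startsWith("https://")) {
            return null;
        }
        int hash = url.indexOf('#');
        return hash >= 0 ? url.substring(0, hash) : url;
    }
}
```

A function with the same shape as `UrlFilterSketch::stripFragmentHttpOnly` is the kind of argument `setURLFilter(Function<String, String>)` accepts.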
- setURLFilter(URLFilter) - Method in class crawlercommons.sitemaps.SiteMapParser
-
- setValid(boolean) - Method in class crawlercommons.sitemaps.SiteMapURL
-
Valid means that the URL follows the official guideline that a sitemap
URL must be under the base URL.
- setViewCount(Integer) - Method in class crawlercommons.sitemaps.extension.VideoAttributes
-
- SimpleRobotRules - Class in crawlercommons.robots
-
Result from parsing a single robots.txt file: a set of rules, an
optional crawl-delay, and an optional sitemap URL.
- SimpleRobotRules() - Constructor for class crawlercommons.robots.SimpleRobotRules
-
- SimpleRobotRules(SimpleRobotRules.RobotRulesMode) - Constructor for class crawlercommons.robots.SimpleRobotRules
-
- SimpleRobotRules.RobotRule - Class in crawlercommons.robots
-
Single rule that maps from a path prefix to an allow flag.
- SimpleRobotRules.RobotRulesMode - Enum in crawlercommons.robots
-
- SimpleRobotRulesParser - Class in crawlercommons.robots
-
This implementation of BaseRobotsParser retrieves a set of rules for an
agent with the given name from the robots.txt file of a given domain.
- SimpleRobotRulesParser() - Constructor for class crawlercommons.robots.SimpleRobotRulesParser
-
- SimpleRobotRulesParser(long, int) - Constructor for class crawlercommons.robots.SimpleRobotRulesParser
-
- SITEMAP - Static variable in class crawlercommons.sitemaps.Namespace
-
- SiteMap - Class in crawlercommons.sitemaps
-
- SiteMap() - Constructor for class crawlercommons.sitemaps.SiteMap
-
- SiteMap(URL) - Constructor for class crawlercommons.sitemaps.SiteMap
-
- SiteMap(String) - Constructor for class crawlercommons.sitemaps.SiteMap
-
- SiteMap(URL, Date) - Constructor for class crawlercommons.sitemaps.SiteMap
-
- SiteMap(String, String) - Constructor for class crawlercommons.sitemaps.SiteMap
-
- SITEMAP_EXTENSION_NAMESPACES - Static variable in class crawlercommons.sitemaps.Namespace
-
- SITEMAP_LEGACY - Static variable in class crawlercommons.sitemaps.Namespace
-
Legacy schema URIs from prior sitemap protocol versions and frequent
variants.
- SITEMAP_SUPPORTED_NAMESPACES - Static variable in class crawlercommons.sitemaps.Namespace
-
- SiteMapIndex - Class in crawlercommons.sitemaps
-
- SiteMapIndex() - Constructor for class crawlercommons.sitemaps.SiteMapIndex
-
- SiteMapIndex(URL) - Constructor for class crawlercommons.sitemaps.SiteMapIndex
-
- SiteMapParser - Class in crawlercommons.sitemaps
-
- SiteMapParser() - Constructor for class crawlercommons.sitemaps.SiteMapParser
-
- SiteMapParser(boolean) - Constructor for class crawlercommons.sitemaps.SiteMapParser
-
SiteMapParser with configurable location validation, not allowing
partially parsed content.
- SiteMapParser(boolean, boolean) - Constructor for class crawlercommons.sitemaps.SiteMapParser
-
- SiteMapTester - Class in crawlercommons.sitemaps
-
Sitemap tool for recursively fetching all URLs from a sitemap (and all of
its children).
- SiteMapTester() - Constructor for class crawlercommons.sitemaps.SiteMapTester
-
- SiteMapURL - Class in crawlercommons.sitemaps
-
The SiteMapURL class represents a URL found in a Sitemap.
- SiteMapURL(String, boolean) - Constructor for class crawlercommons.sitemaps.SiteMapURL
-
- SiteMapURL(URL, boolean) - Constructor for class crawlercommons.sitemaps.SiteMapURL
-
- SiteMapURL(String, String, String, String, boolean) - Constructor for class crawlercommons.sitemaps.SiteMapURL
-
- SiteMapURL(URL, Date, SiteMapURL.ChangeFrequency, double, boolean) - Constructor for class crawlercommons.sitemaps.SiteMapURL
-
- SiteMapURL(URL, ZonedDateTime, SiteMapURL.ChangeFrequency, double, boolean) - Constructor for class crawlercommons.sitemaps.SiteMapURL
-
- SiteMapURL.ChangeFrequency - Enum in crawlercommons.sitemaps
-
Allowed change frequencies
- SkipLeadingWhiteSpaceInputStream - Class in crawlercommons.sitemaps
-
Wraps a stream and skips over leading whitespace (at beginning of file) in
the wrapped stream.
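One way to skip leading whitespace in a stream is sketched below with the JDK's `PushbackInputStream`. This is not the `SkipLeadingWhiteSpaceInputStream` source: the class and method names are hypothetical, the sketch consumes the whitespace eagerly when the wrapper is created, and it treats only ASCII space, tab, CR, and LF as whitespace.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not the library class.
final class LeadingWhitespaceSketch {

    // Read and discard leading ASCII whitespace, then push the first
    // non-whitespace byte back so the caller sees it again.
    static InputStream skipLeadingWhitespace(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 1);
        int b;
        while ((b = pushback.read()) != -1) {
            boolean asciiWhitespace = b == ' ' || b == '\t' || b == '\n' || b == '\r';
            if (!asciiWhitespace) {
                pushback.unread(b);
                break;
            }
        }
        return pushback;
    }

    // Convenience wrapper for trying the sketch on a string.
    static String readAfterSkipping(String text) {
        try (InputStream in = skipLeadingWhitespace(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int b;
            while ((b = in.read()) != -1) {
                out.write(b);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Stripping the leading bytes matters for XML sitemaps because an XML declaration is only valid at the very start of the document.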
- SkipLeadingWhiteSpaceInputStream(InputStream) - Constructor for class crawlercommons.sitemaps.SkipLeadingWhiteSpaceInputStream
-
- sortRules() - Method in class crawlercommons.robots.SimpleRobotRules
-
In order to match up with Google's convention, we want to match rules
from longest to shortest.
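The longest-to-shortest ordering can be illustrated with a small stand-alone sketch. It is not the `SimpleRobotRules` code: `RuleSortSketch`, its nested `Rule` class, and the simple `startsWith` matching (no wildcards) are assumptions for this example. Once rules are ordered by descending prefix length, the first rule whose prefix matches a path is the most specific one.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; not the library's rule handling.
final class RuleSortSketch {

    static final class Rule {
        final String prefix;
        final boolean allow;

        Rule(String prefix, boolean allow) {
            this.prefix = prefix;
            this.allow = allow;
        }
    }

    // Return a copy of the rules ordered from longest to shortest prefix.
    static List<Rule> sortLongestFirst(List<Rule> rules) {
        List<Rule> sorted = new ArrayList<>(rules);
        sorted.sort(Comparator.comparingInt((Rule r) -> r.prefix.length()).reversed());
        return sorted;
    }

    // First matching rule wins; with longest-first order that is the most
    // specific rule. Paths matching no rule are allowed in this sketch.
    static boolean isAllowed(List<Rule> sortedRules, String path) {
        for (Rule rule : sortedRules) {
            if (path.startsWith(rule.prefix)) {
                return rule.allow;
            }
        }
        return true;
    }
}
```

For example, with `Disallow: /a` and `Allow: /a/b`, the path `/a/b/c` is allowed because the longer prefix is checked first, while `/a/x` is disallowed.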
- startElement(String, String, String, Attributes) - Method in class crawlercommons.sitemaps.sax.DelegatorHandler
-
- startElement(String, String, String, Attributes) - Method in class crawlercommons.sitemaps.sax.extension.ImageHandler
-
- startElement(String, String, String, Attributes) - Method in class crawlercommons.sitemaps.sax.extension.LinksHandler
-
- startElement(String, String, String, Attributes) - Method in class crawlercommons.sitemaps.sax.extension.MobileHandler
-
- startElement(String, String, String, Attributes) - Method in class crawlercommons.sitemaps.sax.extension.NewsHandler
-
- startElement(String, String, String, Attributes) - Method in class crawlercommons.sitemaps.sax.extension.VideoHandler
-
- STOCK_TICKERS - Static variable in class crawlercommons.sitemaps.extension.NewsAttributes
-
- strict - Variable in class crawlercommons.sitemaps.SiteMapParser
-
True (by default) meaning that invalid URLs should be rejected, as the
official docs allow the siteMapURLs to be only under the base url:
https://www.sitemaps.org/protocol.html#location
- strictNamespace - Variable in class crawlercommons.sitemaps.SiteMapParser
-
Indicates whether the parser should work with the namespace from the
specifications or any namespace.
- Strings - Class in crawlercommons.utils
-
Util functions for manipulating strings.
- Strings() - Constructor for class crawlercommons.utils.Strings
-
- stripAllBlank(CharSequence) - Static method in class crawlercommons.sitemaps.sax.DelegatorHandler
-
Trim all whitespace including Unicode whitespace
- SuffixTrie<V> - Class in crawlercommons.domains
-
- SuffixTrie() - Constructor for class crawlercommons.domains.SuffixTrie
-
- SuffixTrie.LookupResult<V> - Class in crawlercommons.domains
-
Wrapper for results when a string is checked for suffixes contained in
the suffix trie.
- SuffixTrie.Node<V> - Class in crawlercommons.domains
-
- valueOf(String) - Static method in enum crawlercommons.filters.basic.BasicURLNormalizer.IdnNormalization
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum crawlercommons.robots.SimpleRobotRules.RobotRulesMode
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum crawlercommons.sitemaps.AbstractSiteMap.SitemapType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum crawlercommons.sitemaps.extension.Extension
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum crawlercommons.sitemaps.extension.NewsAttributes.NewsGenre
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum crawlercommons.sitemaps.extension.VideoAttributes.VideoPriceResolution
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum crawlercommons.sitemaps.extension.VideoAttributes.VideoPriceType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum crawlercommons.sitemaps.SiteMapURL.ChangeFrequency
-
Returns the enum constant of this type with the specified name.
- values() - Static method in enum crawlercommons.filters.basic.BasicURLNormalizer.IdnNormalization
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum crawlercommons.robots.SimpleRobotRules.RobotRulesMode
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum crawlercommons.sitemaps.AbstractSiteMap.SitemapType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum crawlercommons.sitemaps.extension.Extension
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum crawlercommons.sitemaps.extension.NewsAttributes.NewsGenre
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum crawlercommons.sitemaps.extension.VideoAttributes.VideoPriceResolution
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum crawlercommons.sitemaps.extension.VideoAttributes.VideoPriceType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum crawlercommons.sitemaps.SiteMapURL.ChangeFrequency
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- VIDEO - Static variable in class crawlercommons.sitemaps.Namespace
-
- VideoAttributes - Class in crawlercommons.sitemaps.extension
-
Data model for Google extension to the sitemap protocol regarding video
indexing, as per http://www.google.com/schemas/sitemap-video/1.1
- VideoAttributes() - Constructor for class crawlercommons.sitemaps.extension.VideoAttributes
-
- VideoAttributes(URL, String, String, URL, URL) - Constructor for class crawlercommons.sitemaps.extension.VideoAttributes
-
- VideoAttributes.VideoPrice - Class in crawlercommons.sitemaps.extension
-
- VideoAttributes.VideoPriceResolution - Enum in crawlercommons.sitemaps.extension
-
- VideoAttributes.VideoPriceType - Enum in crawlercommons.sitemaps.extension
-
- VideoHandler - Class in crawlercommons.sitemaps.sax.extension
-
Handle SAX events in the Google Video sitemap extension namespace.
- VideoHandler() - Constructor for class crawlercommons.sitemaps.sax.extension.VideoHandler
-
- VideoPrice(String, Float) - Constructor for class crawlercommons.sitemaps.extension.VideoAttributes.VideoPrice
-
- VideoPrice(String, Float, VideoAttributes.VideoPriceType) - Constructor for class crawlercommons.sitemaps.extension.VideoAttributes.VideoPrice
-
- VideoPrice(String, Float, VideoAttributes.VideoPriceType, VideoAttributes.VideoPriceResolution) - Constructor for class crawlercommons.sitemaps.extension.VideoAttributes.VideoPrice
-
- VIEW_COUNT - Static variable in class crawlercommons.sitemaps.extension.VideoAttributes
-