Package crawlercommons.domains
Classes contained within the domains package relate to the definition of
"paid-level" domains, that is, Internet domain names one level below a
public suffix (also called an "effective top-level domain") defined in the
public suffix list.
Class Summary

Class / Description

EffectiveTldFinder
    To determine the actual domain name of a host name or URL requires knowledge of the various domain registrars and their assignment policies.

EffectiveTldFinder.EffectiveTLD
    EffectiveTLD objects hold one line of the public suffix list: the suffix (com, co.uk, etc.), for IDN suffixes both the ASCII and IDN variant (xn--p1ai and рф), and the properties required to parse host/domain names given in the public suffix list: whether it's a wildcard suffix (*.kawasaki.jp), or an exception to a wildcard rule (!

PaidLevelDomain
    Routines to extract the PLD (paid-level domain, as per the IRLbot paper) from a hostname or URL.

SuffixTrie<V>

SuffixTrie.LookupResult<V>
    Wrapper for results when a string is checked for suffixes contained in the suffix trie.

SuffixTrie.Node<V>
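
The idea of a paid-level domain, including the wildcard suffixes mentioned for EffectiveTLD, can be illustrated with a minimal, self-contained sketch. It is not the implementation used by the classes in this package: the class name PldSketch, the method paidLevelDomain, and the two tiny rule sets are hypothetical and chosen only for illustration. Exception rules and the handling of unknown suffixes are deliberately left out.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

// Illustrative sketch only; not part of crawlercommons.domains.
final class PldSketch {
    // Hypothetical, hand-picked rules; the real public suffix list is far larger.
    private static final Set<String> EXACT = Set.of("com", "uk", "co.uk", "jp");
    // Parents of wildcard rules: "kawasaki.jp" stands for the rule "*.kawasaki.jp".
    private static final Set<String> WILDCARD_PARENTS = Set.of("kawasaki.jp");

    // Returns the name one level below the longest matching public suffix,
    // or empty if the host is itself a suffix or matches no rule in the sketch.
    static Optional<String> paidLevelDomain(String host) {
        String[] labels = host.toLowerCase(Locale.ROOT).split("\\.");
        // Try the longest candidate suffix first, so the first match is the longest one.
        for (int i = 0; i < labels.length; i++) {
            boolean exact = EXACT.contains(join(labels, i));
            boolean wildcard = i + 1 < labels.length
                    && WILDCARD_PARENTS.contains(join(labels, i + 1));
            if (exact || wildcard) {
                // i == 0 means the host itself is a public suffix.
                return i == 0 ? Optional.empty() : Optional.of(join(labels, i - 1));
            }
        }
        return Optional.empty();
    }

    // Joins labels[from..] back into a dotted name.
    private static String join(String[] labels, int from) {
        return String.join(".", Arrays.copyOfRange(labels, from, labels.length));
    }

    public static void main(String[] args) {
        System.out.println(paidLevelDomain("www.example.co.uk"));           // Optional[example.co.uk]
        System.out.println(paidLevelDomain("www.example.foo.kawasaki.jp")); // Optional[example.foo.kawasaki.jp]
        System.out.println(paidLevelDomain("co.uk"));                       // Optional.empty
    }
}
```

In the second call the wildcard rule makes foo.kawasaki.jp the public suffix, so the paid-level domain is one label longer than it would be under the plain jp rule.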