Recently I came across an interesting project by Sean Thomas Burke called Sitemapper. It is a small library that parses sitemap XML files and returns all the URLs they list. This is useful when crawling websites, because the sitemap (usually) holds an up-to-date list of a site's URLs. In most cases that list is enough when designing a crawler, so you don't need to crawl the website manually to build the list of URLs yourself.

Sitemap parser: Sitemapper

Sitemapper is a well-maintained, well-documented, open-source library offering the following features:

- Follows redirects
- Supports gzip sitemaps...
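
To make the idea of "getting all included URLs" concrete, here is a minimal sketch of what a sitemap parser does under the hood. It is not Sitemapper's API: it is a standalone Python illustration using only the standard library, and the function name `extract_urls` and the `EXAMPLE` document are made up for this post. It assumes the sitemap bytes have already been downloaded, so it does not fetch anything or follow redirects. It only handles optional gzip compression and reads the `<loc>` entries.

```python
from __future__ import annotations

import gzip
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_urls(raw: bytes) -> list[str]:
    """Return every <loc> value found in a sitemap (hypothetical helper)."""
    # Gzip streams start with the magic bytes 0x1f 0x8b; decompress if present.
    if raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    root = ET.fromstring(raw)
    # <loc> appears under <url> in a urlset and under <sitemap> in an index.
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


# Made-up sitemap used only for demonstration.
EXAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

if __name__ == "__main__":
    print(extract_urls(EXAMPLE))
    print(extract_urls(gzip.compress(EXAMPLE)))
```

For a sitemap index file, the returned values are the locations of child sitemaps rather than pages, so a real crawler would have to fetch and parse each of those in turn.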