Parsing web sitemaps using JavaScript

Recently I came across an interesting project by Sean Thomas Burke called Sitemapper. It is a small library that parses sitemap XML files and returns all the URLs they contain. Such functionality is necessary when crawling websites, as the sitemap (usually) holds an up-to-date list of all website URLs. In most cases this list is enough when designing a crawler, so you don’t need to crawl the website manually to build a list of URLs.

Sitemap parser: Sitemapper

Sitemapper is a well-maintained, well-documented open-source library offering the following features: follows redirects, supports gzip sitemaps...
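To make the idea concrete, here is a minimal sketch of what a sitemap parser does at its core. It is written in Python with the standard library only and is not Sitemapper's API; the names `extract_urls` and `SAMPLE` are made up for illustration. It assumes a plain `urlset` sitemap using the standard sitemaps.org namespace, and it handles gzip input by checking for the gzip magic bytes.

```python
import gzip
import xml.etree.ElementTree as ET
from typing import List

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(data: bytes) -> List[str]:
    """Return every <loc> value found in a (possibly gzipped) sitemap."""
    # Gzip streams always start with the two bytes 0x1f 0x8b.
    if data[:2] == b"\x1f\x8b":
        data = gzip.decompress(data)
    root = ET.fromstring(data)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sample sitemap used only for this illustration.
SAMPLE = b"""<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in extract_urls(SAMPLE):
        print(url)
```

A real crawler would also need to fetch the file over HTTP and follow sitemap index files, which this sketch leaves out.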

HTTP status codes

If you work with analytics, you have to know what each HTTP status code means. Every HTTP response travelling across the web carries a status code. HTTP status codes are standard response codes returned by web servers on the Internet. The codes help identify the cause of the problem when a web page or other resource does not load properly. Every status code consists of 3 digits and belongs to one of 5 main groups. The first digit of the status code indicates the general type of response (the main group of the status...
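The grouping by first digit can be sketched in a few lines of Python. The function name `status_class` and the group labels are chosen for this illustration; the five groups themselves follow the standard 1xx to 5xx split.

```python
# The five main groups, keyed by the first digit of the status code.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def status_class(code: int) -> str:
    """Return the main group a 3-digit HTTP status code belongs to."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    # Integer division by 100 leaves only the first digit.
    return STATUS_CLASSES[code // 100]


if __name__ == "__main__":
    for code in (200, 301, 404, 503):
        print(code, status_class(code))
```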

Excel function for MD5 hashing without VBA

When capturing PII (personally identifiable information) in GA or Adobe Analytics, you need to make sure that the captured values are encrypted/hashed to respect the rules of these platforms. Otherwise you might have your account deactivated without any prior notice! A very common hashing algorithm is MD5. It produces a 128-bit hash value and it’s a one-way hashing algorithm, meaning that you cannot convert the hashed value back to the original one. (Keep in mind that an MD5 hash only protects the original value when the input is unique and hard to guess; common inputs can be recovered with reverse lookup attacks, e.g. using https://md5.gromweb.com/ ) To be able...
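For comparison with whatever a spreadsheet formula returns, here is a small Python sketch that produces the same kind of value using the standard `hashlib` module. The helper name `md5_hex` and the trim-and-lowercase step are assumptions made for this example, not something these platforms prescribe; the point is only that the same input must always be normalized the same way before hashing.

```python
import hashlib


def md5_hex(value: str) -> str:
    """Return the MD5 hash of a value as 32 hexadecimal characters."""
    # Assumed normalization: trim whitespace and lowercase, so that
    # " Hello " and "hello" end up with the same hash.
    normalized = value.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(md5_hex("hello"))  # 5d41402abc4b2a76b9719d911017c592
```

The 128-bit hash is printed as 32 hex characters because each character encodes 4 bits.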

Cufon - Fonts for the people

Cufon text does not appear in Internet Explorer

If you encounter any problems with your site's cufonized fonts, try updating your rendering engine to the latest version. On October 24 they released version 1.09i, which is the same as 1.09 but IE9-compatible. Keep in mind that you do not need to convert your fonts again; just replace your old cufon-yui.js with the new one and you're good to go.

twitter bug

Twitterizer: Solving the 401 bug when calling "GetAccessToken" or "GetRequestToken"

I recently encountered a strange bug using the Twitterizer API. I couldn’t call “GetAccessToken” without getting a 401 error. After searching for it, I found out that other people had encountered the same bug when calling the “GetRequestToken” method (see more info here). The correct way to get an access token using the Twitterizer API is using the following source code sequence: The source code above may crash on the last line. The bug is probably caused by wrong timestamps: Twitter is sensitive to server time inaccuracies. The rash of issues lately is apparently caused by the clocks on Twitter’s servers being...
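One way to check whether timestamps are the culprit is to compare the local clock with the `Date` header of any HTTP response from the server. The sketch below is plain Python and has nothing to do with the Twitterizer API; the function name `clock_skew_seconds` is made up for this illustration, and the header value in the usage part is an arbitrary sample.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def clock_skew_seconds(date_header: str, local_now: datetime) -> float:
    """Return how many seconds the local clock is ahead of the server.

    date_header is the value of an HTTP "Date" response header;
    local_now must be a timezone-aware datetime.
    """
    server_time = parsedate_to_datetime(date_header)
    return (local_now - server_time).total_seconds()


if __name__ == "__main__":
    # Sample header value; in practice read it from a real response.
    header = "Wed, 21 Oct 2015 07:28:00 GMT"
    skew = clock_skew_seconds(header, datetime.now(timezone.utc))
    print(f"local clock differs from server by {skew:.0f} seconds")
```

A large positive or negative result suggests that the timestamp sent with the signed request does not match what the server expects.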