Data mining is fun!

I spent most of the weekend doing a mass conversion of a couple of forums I help maintain, transferring all the posts and threads from HTML files (created by Infopop’s UBB) into a database.

It’s fascinating to observe the odd correlations between users and posts. Just define a few queries and the most interesting data points emerge, revealing trends both in topics and in time-related patterns.
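
To give a rough idea of the kind of query I mean, here is a minimal sketch using Python's built-in sqlite3 module. The table layout (users, posts), the column names, the helper posts_by_user_and_hour and the sample rows are all made up for illustration; the real schema and database differ. It counts posts per user per hour of the day, which is one simple way to surface a time-related pattern.

```python
import sqlite3

# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        topic TEXT,
        posted_at TEXT  -- 'YYYY-MM-DD HH:MM:SS'
    );
""")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])
conn.executemany(
    "INSERT INTO posts (user_id, topic, posted_at) VALUES (?, ?, ?)",
    [
        (1, "java", "2003-01-04 22:15:00"),
        (1, "java", "2003-01-05 23:40:00"),
        (2, "xml", "2003-01-06 09:05:00"),
        (1, "xml", "2003-01-07 22:50:00"),
    ],
)


def posts_by_user_and_hour(conn):
    """Return (user name, hour of day, post count) rows, busiest first."""
    return conn.execute("""
        SELECT u.name,
               CAST(strftime('%H', p.posted_at) AS INTEGER) AS hour,
               COUNT(*) AS n
        FROM posts p
        JOIN users u ON u.id = p.user_id
        GROUP BY u.name, hour
        ORDER BY n DESC, u.name, hour
    """).fetchall()


for name, hour, n in posts_by_user_and_hour(conn):
    print(f"{name} posted {n} time(s) around {hour}:00")
```

Swapping the hour expression for the topic column gives the per-topic version of the same breakdown.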

Once all the posts are transferred (I’m about 30% of the way there), a friend of mine will write a Lucene wrapper around them to enable some kind of datamart functionality.

