Archive for April, 2007

Scraping AOL WebMail for contacts

Monday, April 30th, 2007

This is the third post in a short series discussing how I built an API to grab contact list information from Yahoo!, AOL, GMail and Hotmail. In our first post we reviewed the high-level approach to scraping sites. In our second post we went over how to scrape Yahoo! ...

Scraping Yahoo! for contacts using JScrape

Sunday, April 29th, 2007

This post builds on my previous post, in which we discuss how to scrape webmail sites for contacts. Of the major sites, Yahoo! is by far the easiest to scrape. After you've sniffed the URLs used for the login, you just need to replace the username and ...

Scraping WebMail sites for contacts using JScrape

Sunday, April 29th, 2007

Many new websites, especially those that depend on social networks, are now offering ways to import contacts from various WebMail sites. I'm not going to go into the ethics of asking a user for their user name and password to a webmail site and scraping the site, but I will ...

Improving performance of Taste using DBCP

Wednesday, April 25th, 2007

For the past few weeks I've been playing with Taste, a Java-based framework for collaborative filtering (basically the recommendation feature found on sites like Amazon and Netflix). Hopefully in the near future this tool will be incorporated into our site, MyFriendSuggests.com, to improve our suggestion algorithms. What I found was ...

Tableless Website development

Sunday, April 15th, 2007

This is probably old news to everyone, but just in case there are other people like me out there, I figured I would make this post. When I started building my site, I began hand-coding a layout of HTML tables in my JSP code. That was definitely a mistake. Using HTML ...

MLB Tracker Update

Wednesday, April 11th, 2007

We've just released the latest version of MLB Tracker, version 0.85. Along with some minor bug fixes, the biggest change is that MLB Tracker now works on both BIS- and BES-connected BlackBerrys. If you own a BlackBerry and enjoy baseball, be sure to give MLB Tracker a try!

Setting up eclipse to run web-app under root context

Tuesday, April 10th, 2007

After fighting with Eclipse a little, I was able to get my Eclipse 3.2 running with WST to deploy my web application to the root context with Tomcat. The trick was that I had to manually edit the file <Workspace>\<project>\.settings\org.eclipse.wst.common.component. I had to remove the value in the context root line so ...

MLB Tracker 2007 Released

Friday, April 6th, 2007

We've re-released MLB Tracker for 2007. It has basically the same features (and bugs) as the 2006 edition, but we've updated it for the new URLs used by the site we scrape data from. MLB Tracker is an application for your BlackBerry that allows you to "virtually watch" a game by ...

Blackberry Development - Lessons Learned Part 1

Thursday, April 5th, 2007

As I prepped the release of our 2007 edition of MLB Tracker, I found out that the software wasn't working correctly on newer BlackBerry devices (the 8100 and 8800 use the 4.2 OS). Unfortunately, debugging the application wasn't easy at all. What I ended up needing to do was install the 4.2 ...

Introducing JScrape - Java based HTML Scraping API

Sunday, April 1st, 2007

A few pieces of software I've worked on have required me to scrape data from existing websites. In general, the code to do this is ugly. The way I had been doing it was using the standard Java connectivity classes to grab the data from the site and then parsing ...