Cleaning Data in SQLite

This past weekend was GenCon.  I updated my database to be able to hold BoardGameGeek Hotness rankings and wrote a script to pull the rankings every hour.  I was hoping to track what was hot during the convention.  I realized after the convention had started that the Hotness rankings only update once every day (sometime around 3 and 4 AM EST).  I was grabbing this data every hour of every day, but it was only changing once.  I wrote a new script to pull data from the GeekBuzz leaderboard (which I’ll write about soon!).

However, now I have a lot of database entries that are unneeded.  I thought I would take the time to clean this up (even though I’ll no longer use the data and could easily just delete the tables).  For the BGG Hotness, I have the tables: hotgame, hotperson, and hotcompany.  I have 7,350 rows in each of those tables, since I collected data on 50 rankings every hour for just over 6 days.  However, since the BGG hotness rankings only update daily, I really only need 300 rows (50 rankings * 6 days = 300 rows).

I know think the rankings update between 3 and 4, so I want to only keep the entries from 4:00 AM.  I use the following SELECT statement to make sure I’m in the ballpark with where the data is that I want to keep:

The output of this is 300, which is what I expected.  Next, I use the following SELECT statement to make sure I can capture the data I want to delete.

The output of this is 7,050, which is what I had hoped and expected.  I feel pretty good about deleting those 7,000 rows, so I go ahead and use the command:

The output of the SELECT statement was 300, as we would expect.  Finally, as one last sanity check, I use the following command to make sure only the 4 AM entries are now in the database:

Sure enough, the output of that is:

I went ahead and followed the same methodology for the other two tables to clean up the data.  This will make querying a lot easier in the future, should I ever want to look back on the Hotness during GenCon.

Have questions or suggestions?  Please feel free to comment below or contact me.

Leave a Reply

Your email address will not be published. Required fields are marked *