Connect to Reddit from SQL Server

  1. You’ve installed SQL Server with Python
  2. You’ve then installed pip
  3. Then you used pip to install PRAW
  4. You’ve also installed Pandas using pip
  5. You’ve created your Reddit API credentials

Fantastic, we’re about ready to connect to Reddit from within SQL Server!

Let’s crank open SSMS. In a new query window we’ll build our query step by step and test it along the way.

To execute our script we’re going to use dynamic SQL to make it easier to read. Let’s get started setting down some foundations for our script.

The first line is the ‘shebang’ line, which just allows the system to find Python correctly; we won’t be changing this at all. We’re also loading PRAW and Pandas (which we installed previously) so we can use them later.

Feel free to run this script as it is. You won’t get any results at this point, but you shouldn’t see any error messages either.
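If that run does throw an error, a missing package is the likeliest culprit. As a rough sketch (the helper name missing_packages is made up for illustration and isn’t part of the original script), the Python below checks that both packages are visible to the interpreter without actually importing them:

```python
import importlib.util


def missing_packages(names=("praw", "pandas")):
    """Return the names from `names` that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means both imports in the foundation script should succeed.
print(missing_packages())
```

Anything listed in the output still needs installing with pip for the Python instance that SQL Server is using.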

We’re going to build our connection string to Reddit next.

We’re creating a variable (redditConnect) and declaring the values we need. Use your own connection details, the ones you got when creating your Reddit API credentials.
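As a minimal sketch of what that connection could look like, assuming read-only access where PRAW only needs a client ID, a client secret and a user agent (script-type apps can also pass a username and password). The values are placeholders, and the names REDDIT_SETTINGS, unfilled_settings and connect_to_reddit are invented for this example:

```python
# Placeholder values: swap in the details from your own Reddit API setup.
REDDIT_SETTINGS = {
    "client_id": "YOUR_CLIENT_ID",
    "client_secret": "YOUR_CLIENT_SECRET",
    "user_agent": "sqlserver-reddit-demo by u/your_username",
}


def unfilled_settings(settings):
    """Return the keys whose values are empty or still look like placeholders."""
    return sorted(
        key for key, value in settings.items()
        if not value or "YOUR_" in value.upper()
    )


def connect_to_reddit(settings):
    """Build a PRAW client from the settings dictionary."""
    import praw  # imported here so the check above works even without PRAW

    return praw.Reddit(**settings)


# Typical use inside the script, once the placeholders are replaced:
# redditConnect = connect_to_reddit(REDDIT_SETTINGS)
```

Calling unfilled_settings(REDDIT_SETTINGS) before connecting is a cheap way to catch a value you forgot to replace.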

We then need to decide which Subreddit we’re going to gather data from. I’m going to use AskReddit as it’s one of the largest text-only subreddits and will be perfect for what we’re trying to do here.

We also need to decide what order we want the data in. We can sort by things like new, top, controversial, etc. For this, let’s grab ‘new’:

subredditOrder = subreddit.new()
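To make the two choices (which subreddit, which order) easy to change, they can be wrapped up as in the sketch below. It assumes redditConnect is the PRAW client created earlier; fetch_listing and ALLOWED_ORDERS are hypothetical names, while new, hot, top, controversial and rising are listing methods PRAW provides on a subreddit:

```python
ALLOWED_ORDERS = ("new", "hot", "top", "controversial", "rising")


def fetch_listing(subreddit, order="new", limit=10):
    """Return a listing from `subreddit` in the requested sort order."""
    if order not in ALLOWED_ORDERS:
        raise ValueError(f"Unsupported order: {order!r}")
    # Look up the matching PRAW method (e.g. subreddit.new) and call it.
    return getattr(subreddit, order)(limit=limit)


# Typical use inside the script:
# subreddit = redditConnect.subreddit("AskReddit")
# subredditOrder = fetch_listing(subreddit, "new", limit=10)
```

Passing a limit keeps the test runs small; leave it out of your own call if you’d rather take PRAW’s default.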

We can then test that our connection is working by printing the first row of our data.
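One way to do that, sketched here with a made-up helper called describe_first, is to pull just the first submission from the listing and print its title and creation time. title and created_utc are standard attributes on a PRAW submission; the output format is simply my choice:

```python
from datetime import datetime, timezone


def describe_first(listing):
    """Return a one-line summary of the first submission, or None if empty."""
    first = next(iter(listing), None)
    if first is None:
        return None
    # created_utc is a Unix timestamp, so convert it to a readable UTC time.
    created = datetime.fromtimestamp(first.created_utc, tz=timezone.utc)
    return f"{created:%Y-%m-%d %H:%M} UTC | {first.title}"


# Typical use inside the script:
# print(describe_first(subredditOrder))
```

If this prints None, the connection worked but the listing came back empty.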

Our whole block of code is going to look something like this

Go ahead and run this script (with your own info in there) and you should see a printout of the latest submission to your chosen subreddit.

If this doesn’t work for you, you may have to allow this connection through your firewall. In my case I disabled Windows Firewall, but your method will depend on which firewall you use.
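Before switching a firewall off entirely, it can help to confirm that the network really is the problem. The sketch below is only a diagnostic I’d suggest, not part of the original script: it tries to open a plain TCP connection on port 443, which is what PRAW needs since it talks to Reddit over HTTPS. The helper name can_reach is made up.

```python
import socket


def can_reach(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and failed DNS lookups.
        return False


# Typical use inside the script:
# for host in ("www.reddit.com", "oauth.reddit.com"):
#     print(host, "reachable" if can_reach(host) else "blocked or unreachable")
```

Run it the same way you run the rest of the script, so the check is subject to the same firewall rules as the Reddit connection itself.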

Congratulations! The next step is to actually do something with this data.


5 thoughts to “Connect to Reddit from SQL Server”

    1. I love stuff like this. I needed sample data for my dev database and thought of Reddit with its text-only subreddits. Getting the data into your database is the next post.

      I’m really tempted to build a PowerBI dashboard for Reddit browsing, pictures would be the issue there I think.

  1. I usually tap into various forms of open government data for various projects and tests. There’s tera-terabytes of interesting data to play with.
