Sports Betting With the Boys - NBA Prop Bets - Part 1

About a month ago, a good friend of mine asked me if I would want to open a DraftKings account. We’d both get a $100 free bet, and I would also receive a $100 free casino promo. My friend knows that I am a risk-averse individual, but he also knows I can’t pass up a good deal, so he added that I could just do these two promos and cash out. I opened the account, but was quickly captivated by the complexity of the DraftKings application. At any given moment, you can open the app and, with just a few clicks, place a bet on sports or gamble in their online casino. The DraftKings app is well developed and intuitive, and the algorithms that determine the predicted game outcomes (my friends like to call this Vegas) land so close to actual outcomes that it’s scary.

After playing blackjack with my free casino credits, I had $230 to my name. I had the opportunity to withdraw my funds, but I was curious about the bets my friends were making. I was added to a GroupMe group chat, which to my surprise, had a majority of my college friends in it. I didn’t have much to say in the discussions there, but I did notice that the discussions and debates were all speculative and anecdotal about which players and teams were going to win or lose based on past performance (usually the last event). I, for one, am not the biggest sports fan or the most knowledgeable about sports in general; however, I do know a lot about probability and statistics. My friends’ methods for determining what would be winning bets did not sit well with me, and I often found myself shaking my head in disapproval even though they couldn’t see me doing it. Shortly after joining their GroupMe chat and hoping to add some value to the group, I set out to create a tool for the boys that would help guide their bets and discussions using data.

Being new to sports betting myself, I had to quickly learn the lingo. Prop Bets? Teasers? Parlays? Overs? Unders? Spreads? ALTERNATIVE LINES? What did it all mean?

While there were many new concepts I had to learn, for the purposes of this post, it’ll be important to understand what a prop bet is and the concept of over/under. A prop bet in sports betting, short for proposition bet, is a wager that is not directly tied to the final score or final outcome. For example, in basketball, some popular prop bets are total player points, total 3-pointers made by a player, who will have a double-double, first team to score, and the focus of this post - Points, Rebounds & Assists.

A Points, Rebounds & Assists bet assesses the total of a particular player’s performance for a single game. A line is set, and you can bet over or under on the combined number of points, rebounds and assists that player will get. For example, let’s take a game that LeBron James is playing in. His total points, rebounds and assists might be set at 48.5. That means his combined points, rebounds and assists need to be at least 49 to go over or no more than 48 to go under. A stat line of 28 points, 9 assists and 11 rebounds would be under (48 total). As would a line of 35 points, 3 assists and 10 rebounds (48 total). It doesn’t matter where the bulk comes from, only the total of the three categories added together. Lastly, for this specific type of player prop, there is no minimum on any of the three values, so if a player’s line were set at 49 and they finished with 50 points, 0 rebounds, and 0 assists, that would still be considered a win if you bet the over. Props are a great way to add additional action to a game, and are often easier to beat than efficient markets like point spreads.
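To make the over/under arithmetic concrete, here is a minimal sketch of how a Points, Rebounds & Assists bet could be settled. The function name `settle_pra` is my own, and the "push" result for a total that lands exactly on a whole-number line is an assumption about how ties are handled, not something I have confirmed in the DraftKings rules.

```python
def settle_pra(points, rebounds, assists, line):
    """Return which side of a Points, Rebounds & Assists line a stat line lands on."""
    total = points + rebounds + assists
    if total > line:
        return "over"
    if total < line:
        return "under"
    # Assumption: a total exactly equal to the line is treated as a tie ("push").
    return "push"


# The two stat lines from the LeBron example, both 48 total against a 48.5 line.
print(settle_pra(28, 11, 9, 48.5))
print(settle_pra(35, 10, 3, 48.5))

# All of the production can come from a single category.
print(settle_pra(50, 0, 0, 49))
```

Only the sum matters, which is why the function adds the three values before comparing anything to the line.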

With your newly gained understanding of prop bets, and a brief background on what I am trying to solve for, let’s get into the solution.

Solution

This script scrapes DraftKings for the current day’s NBA Points, Rebounds & Assists data. Additionally, this script will scrape the last five game player averages for various stats from basketball.realgm.com. The final product of this script is a table that contains, by player, the odds, lines, and last five game player averages so that whoever is assessing the table can make more educated decisions about which players to bet on.

How I Did It

  1. Obtain DraftKings NBA Prop Bet Lines - Scrape DraftKings for the Total Points, Rebounds & Assists subcategory

  2. Obtain NBA Player Averages - Scrape Basketball.realgm.com for NBA player averages for the last five games

  3. Combine DraftKings and Player Averages Data - Combine these two tables on player name. Returns a table with the current day’s lines by player along with their five game averages for various stats.

You can view the full script here

Obtaining DraftKings NBA Prop Bet Lines

If you’re unfamiliar with the layout of the DraftKings website, feel free to take a trip on over to DraftKings or take a look at the image below. As you can see in the screenshot, I’m showing the information I am interested in - Points, Rebounds & Assists. You can get here by going to NBA -> Player Props -> Points, Rebounds & Assists. In this subcategory, DraftKings shows, by game, for the present day, what the lines are for each available player in each game. For example, you can see that the over/under is set at 36.5 total points, rebounds, and assists for John Wall. If you were to bet $100 on either the over or under and win, the total payout would be $190.91, thereby making you a profit of $90.91.
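The $190.91 payout is what you get from American odds of -110, the example odds value that appears in the JSON later in this post. Here is a small sketch of the standard conversion from American odds to profit. The function name is my own, and I am assuming the odds have already been converted to an integer (for example with `int("-110")`).

```python
def profit_from_american_odds(stake, odds):
    """Profit on a winning bet, given a stake and American odds such as -110 or +150."""
    if odds < 0:
        # Negative odds: you risk abs(odds) to win 100.
        return stake * 100 / abs(odds)
    # Positive odds: you risk 100 to win odds.
    return stake * odds / 100


stake = 100
profit = profit_from_american_odds(stake, -110)
print(round(profit, 2))          # profit on a $100 bet
print(round(stake + profit, 2))  # total payout, stake included
```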

Screen Shot 2021-02-17 at 8.33.25 AM.png

If you’ve read my Craigslist TCG Web Scraper post, you’ll know that I’ve previously explored web scraping using BeautifulSoup. However, in this case, I realized after inspecting the webpage that obtaining the information from DraftKings wouldn’t be feasible using the same method I used to scrape Craigslist.

Screen Shot 2021-02-19 at 7.32.22 PM.png

For example, when looking for Pat Williams, of the Chicago Bulls, his line exists in a location that’s different from his name. If I were to scrape the site the way I typically do, I would be unable to associate the lines with their respective players. After doing some digging on the internet, I found another way to access this information. In this StackOverflow article, the author of the post (asking the question) mentions getting data from the DraftKings API (https://sportsbook.draftkings.com/api/odds/v1/leagues/3/offers/gamelines.json) to get similar data to what I need. I did some more research on what the author recommends, and apparently it’s an unofficial DraftKings endpoint that generates a JSON of data for a given category. After playing around with the url string, I was able to find where the Points, Rebounds & Assists data exists as a JSON - https://sportsbook.draftkings.com/sites/US-NJ-SB/api/v1/eventgroup/103/full?format=json. The 103 you see in the string refers to the event category, which can be derived from the original DraftKings url - https://sportsbook.draftkings.com/leagues/basketball/103?category=player-props&subcategory=points,-rebounds-&-assists. The JSON looks like this.

Screen Shot 2021-02-19 at 8.03.04 PM.png

Just by looking at the JSON, we can pick out the keys and values we’ll need to get out of this dictionary. We can see that the label key contains the over/under value (e.g. Over 36.5), the oddsAmerican key contains the value for the betting odds (e.g. -110), and the participant key contains the name of the NBA player such as Chris Paul or Lonzo Ball. Given the structure of the JSON, I figured that the best way to obtain the data I needed would be to go line by line and append the participant, label, and oddsAmerican to a list. After that, I would combine the lists to create a data frame of all the players and their respective lines. While this was how I tackled the problem in theory, in practice it was a bit more tricky. Let’s first start with reading in the JSON.

The outputs of html and json_dict are pictured below.

Screen Shot 2021-02-20 at 10.17.53 AM.png

I’m first reading in the data from "https://sportsbook.draftkings.com/sites/US-NJ-SB/api/v1/eventgroup/103/full?format=json" and then creating a JSON out of it, which makes it easier to work in terms of structure and its visual organization.
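For readers who can’t see the screenshot, here is a rough sketch of this step using only the standard library. The names `html` and `json_dict` match the ones mentioned above, but `fetch_text` and the tiny sample string are my own stand-ins, and since the endpoint is unofficial it may have changed or stopped responding since this was written.

```python
import json
from urllib.request import urlopen

URL = "https://sportsbook.draftkings.com/sites/US-NJ-SB/api/v1/eventgroup/103/full?format=json"


def fetch_text(url, timeout=10):
    """Download the raw response body as text (needs network access, not called here)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Stand-in for html = fetch_text(URL). The real response is far larger and more
# deeply nested; only the three key names come from the post.
html = '{"outcomes": [{"label": "Over 36.5", "oddsAmerican": "-110", "participant": "John Wall"}]}'

# json.loads turns the text into nested Python dictionaries and lists.
json_dict = json.loads(html)
print(json_dict["outcomes"][0]["participant"])
```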

Next, I had to manually comb through the JSON to find where the player and betting line information exists. To do this, I manually checked the JSON objects (keys, values) until I found the keys that contained the data I needed. You can do this by calling .keys() on the JSON, and it will return all the keys. Additionally, you can call .values() on a dictionary to see all of the data stored under its keys. Below is a screenshot of how I started this process.

Screen Shot 2021-02-20 at 10.31.03 AM.png
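In text form, the exploration looks roughly like the sketch below. The key names `level_one` and `level_two` are placeholders I made up to stand in for the real nesting, which I am not reproducing here.

```python
# Placeholder structure: the outer key names are invented for illustration.
json_dict = {
    "level_one": {
        "level_two": [
            {"outcomes": [{"label": "Over 36.5", "oddsAmerican": "-110", "participant": "John Wall"}]}
        ]
    }
}

# Step 1: see which keys exist at the top level.
top_keys = list(json_dict.keys())
print(top_keys)

# Step 2: pick a key, drill into it, and look at its keys and values in turn.
inner = json_dict["level_one"]
inner_keys = list(inner.keys())
first_value = list(inner.values())[0]
print(inner_keys, type(first_value))
```

Repeating those two steps, one level at a time, is how you eventually land on the list that holds the player lines.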

Finally, I ended up figuring out where the data exists, which looks like this. As you can now see, we have the Under/Over for Marc Gasol. Next, I needed to figure out how to get all of the fields I needed out of the JSON and into a table. This again was pretty complicated.

Screen Shot 2021-02-20 at 10.35.26 AM.png

After hours of trial and error, I figured out how to loop through the JSON to get the information I need. It’s important to first understand how the data looks on DraftKings in order to understand how the dictionary is built and the logic behind my solution.

Screen Shot 2021-02-20 at 11.02.29 AM.png

The logic for obtaining the data reads like this -

  • For each game being played on the present day (in this case 2 games)

    • Get the number of players per game

Using this information, I’ll loop through Game 1 nine times (for each of the 9 players in Game 1) and for each of the nine times, get the over data and then get the under data. Repeat the same for Game 2.

Now knowing the structure of the data and logic, let’s get into developing a solution. First, let’s start with determining the number of games being played.

len.png

When calling len() on category, we can see that 2 is returned (because there are 2 games). To loop through each game, we want to create a list to loop through.

Next, for each game in category, let’s get the number of players per game. We will append this to a list called num_players.

Printing the list, we see that it returns the expected values, which I showed in the DraftKings screenshot. Game 1 has nine players and Game 2 has eight players.

Screen+Shot+2021-02-20+at+11.42.00+AM.jpg
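Here is a sketch of those two steps. I am assuming `category` is a list with one entry per game, where each entry is a list with one item per player, and I am filling it with empty placeholders just to get the same counts as the example.

```python
# Placeholder data: 2 games, with 9 and 8 players, to mirror the example above.
category = [
    [{"outcomes": []} for _ in range(9)],
    [{"outcomes": []} for _ in range(8)],
]

# Number of games being played today.
num_games = len(category)

# A list of game indexes to loop through.
games = list(range(num_games))

# Number of players with a line in each game.
num_players = []
for game in games:
    num_players.append(len(category[game]))

print(num_games)
print(num_players)
```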

The goal of this current exercise is to create two lists, one for games, and one for players, so that we can tell our script how many times we should loop through each game in order to get all the data about each player. If we were to loop using num_players currently, we’d only get information about the 9th player in Game 1 and the 8th player in Game 2, since those are the only two values in the list. We need to create a range of numbers, starting from 0, so that we can loop from the first player to the 9th player in Game 1 and the first player to the 8th player in Game 2. You can create that range like this.

num_player_range.png
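One way to write that step, assuming `num_players` holds the counts from the example (9 and 8):

```python
num_players = [9, 8]

# Turn each count into the list of player indexes 0..count-1.
num_player_range = [list(range(count)) for count in num_players]

print(num_player_range[0])  # player indexes for Game 1
print(num_player_range[1])  # player indexes for Game 2
```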

Now that we have a list of lists that represents the number of players in each game, we can go ahead and create a list of game and player combinations. This is so that the loop logic knows how many players it should get data for in the first game, before getting players in the second game. Look at the output of game_combo. You can see that for Game 1 (represented by 0 because it’s an index), we have every pair from (0,0) through (0,8), where 8 represents the 9th player because of zero-based indexing in Python. Once the for loop gets past (0,8), the script will then start getting information about players in the second game (represented by (1,#)).

game_combo.png
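A sketch of how a list like game_combo could be built from the list of lists. The comprehension is just one way to write it and may differ from the original script.

```python
num_player_range = [list(range(9)), list(range(8))]

# Pair every game index with each of its player indexes, game by game.
game_combo = [
    (game, player)
    for game, players in enumerate(num_player_range)
    for player in players
]

print(game_combo[:3])   # first few pairs for Game 1
print(game_combo[-3:])  # last few pairs for Game 2
```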

Now we can loop over category using game_combo as seen below. The player information is stored in category[x][y]['outcomes']. This information will be appended to the list player_info. Note that every player added to player_info from category[x][y]['outcomes'] can be indexed once more. See how in the screenshot below, Marc Gasol’s under and over information is split by index. For example, if I call category[x][y]['outcomes'][0] I will get Gasol’s over, while if I call category[x][y]['outcomes'][1], I will get Gasol’s under.

player_info.png
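In code, that loop might look like the sketch below. The players, lines, and the `make_offer` helper are placeholders of mine. Only the `outcomes`, `label`, `oddsAmerican`, and `participant` keys and the over-then-under ordering come from the JSON described above.

```python
def make_offer(name, line):
    """Hypothetical helper: build one player's entry with an over and an under outcome."""
    return {
        "outcomes": [
            {"label": f"Over {line}", "oddsAmerican": "-110", "participant": name},
            {"label": f"Under {line}", "oddsAmerican": "-110", "participant": name},
        ]
    }


# Placeholder data: Game 1 has two players, Game 2 has one.
category = [
    [make_offer("Player A", 36.5), make_offer("Player B", 20.5)],
    [make_offer("Player C", 15.5)],
]
game_combo = [(0, 0), (0, 1), (1, 0)]

player_info = []
for x, y in game_combo:
    # x is the game index, y is the player index within that game.
    player_info.append(category[x][y]["outcomes"])

print(player_info[0][0]["label"])  # index 0 is the over
print(player_info[0][1]["label"])  # index 1 is the under
```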

Before the final step, we need to tell the script how many players there are in total so that we can get the over and under for each player. We can quickly figure this out by looking at the length of player_info.

total_num_players.png

Finally, we can now get all of the data about each player. We’ll do the following - For each player in total_number_players (17), get the over and under information and append the participant, label, and odds information to their respective lists so that we can then create a data frame containing all the players. Note that there will be, for each player, one row for the over and one row for the under. We will group the under and over together by player in the data frame creation in the next step using a group by function.
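Here is a sketch of that final loop. The list names `participants`, `labels`, and `odds` are my own, and the two players are placeholders.

```python
# Placeholder player_info: one inner list per player, holding [over, under].
player_info = [
    [
        {"label": "Over 36.5", "oddsAmerican": "-110", "participant": "Player A"},
        {"label": "Under 36.5", "oddsAmerican": "-110", "participant": "Player A"},
    ],
    [
        {"label": "Over 20.5", "oddsAmerican": "-115", "participant": "Player B"},
        {"label": "Under 20.5", "oddsAmerican": "-105", "participant": "Player B"},
    ],
]

total_num_players = len(player_info)

participants, labels, odds = [], [], []
for i in range(total_num_players):
    # Each player contributes two rows: the over first, then the under.
    for outcome in player_info[i]:
        participants.append(outcome["participant"])
        labels.append(outcome["label"])
        odds.append(outcome["oddsAmerican"])

print(list(zip(participants, labels, odds)))
```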

Lastly, let’s create the data frame using the three lists from the previous step. We will also group by player to get the following data frame (left) which contains the same information that is in the screenshot taken from DraftKings. Now that we have our table with player, lines, and odds, we want to get stats about these players, from their last five games, so that we can make educated bets and not wild guesses based on a player’s previous one game performance.
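As a rough idea of what the data frame creation and group by can look like in pandas, here is a sketch with placeholder players. The column names and the choice to join the over and under into one comma-separated string are assumptions of mine, so the original script’s output may be laid out differently.

```python
import pandas as pd

participants = ["Player A", "Player A", "Player B", "Player B"]
labels = ["Over 36.5", "Under 36.5", "Over 20.5", "Under 20.5"]
odds = ["-110", "-110", "-115", "-105"]

# One row per outcome, so two rows per player.
df = pd.DataFrame({"player": participants, "line": labels, "odds": odds})

# Collapse each player's over and under rows into a single row.
grouped = df.groupby("player", sort=False).agg({"line": ", ".join, "odds": ", ".join})

print(grouped)
```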

Obtain NBA Player Averages

After quickly browsing through some Google search results, I came across basketball.realgm.com. On this site, they host a Last 5 Game Average category, where, for each player, they calculate and show the player’s averages over the last five games in a table format for numerous statistics. You can check out the table here. I thought that I might have to calculate the stats for each player myself, so I was very happy to leverage their data. The script below is annotated to explain how I go about getting data for each player across the five pages of stats basketball.realgm.com hosts. I am taking a very similar approach to scraping basketball.realgm.com’s data as I did to Craigslist, which you can read more about here. If you want to read more about each of the stats that basketball.realgm.com captures, you can do so on their glossary.
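To give a feel for the row-by-row idea, here is a small sketch that pulls the cells out of an HTML table and appends them to lists. I am swapping in the standard library’s `html.parser` instead of BeautifulSoup so the example runs without extra installs, and the sample table, column names, and players are made up rather than taken from basketball.realgm.com.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up sample standing in for one page of the stats table.
SAMPLE_HTML = """
<table>
  <tr><th>Player</th><th>PPG</th><th>RPG</th><th>APG</th></tr>
  <tr><td>Player A</td><td>20.4</td><td>7.2</td><td>5.0</td></tr>
  <tr><td>Player B</td><td>11.8</td><td>3.6</td><td>2.2</td></tr>
</table>
"""

parser = TableRowParser()
parser.feed(SAMPLE_HTML)

# First row is the header; the rest are one row per player.
header, *body = parser.rows
players = [row[0] for row in body]
ppg = [float(row[1]) for row in body]
print(header, players, ppg)
```

With several pages of stats, the same parsing would run once per page, extending the lists each time.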

Now that we’ve appended data to each list for every player, we need to create a data frame using each list as a column in the table. You can create a data frame using the following code block.

Join DraftKings Data With Last Five Game Player Averages

Finally, since we now have a data frame for DraftKings, indexed by player, and a data frame for basketball.realgm.com, indexed by player, we can now join the two tables together using the player name as the join key. The single line of code below creates the data frame you see in the following screenshot: a table that contains the lines, odds, and five game averages for all players currently available to be bet on in the DraftKings Points, Rebounds & Assists player prop subcategory.
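Here is a sketch of the join with placeholder data. I am using `DataFrame.join` with `how="left"` so that every player with a DraftKings line is kept even without matching stats. That choice, the column names, and the players are assumptions of mine rather than what the original script does.

```python
import pandas as pd

# DraftKings side: one row per player, indexed by player name.
dk_df = pd.DataFrame(
    {
        "line": ["Over 36.5, Under 36.5", "Over 20.5, Under 20.5"],
        "odds": ["-110, -110", "-115, -105"],
    },
    index=pd.Index(["Player A", "Player B"], name="player"),
)

# Stats side: built from one list per column, then indexed by player name.
stats_df = pd.DataFrame(
    {
        "player": ["Player A", "Player C"],
        "PPG": [20.4, 11.8],
        "RPG": [7.2, 3.6],
        "APG": [5.0, 2.2],
    }
).set_index("player")

# Join on the shared index; players without stats get NaN in the stat columns.
combined = dk_df.join(stats_df, how="left")
print(combined)
```

The join only matches names that are spelled identically in both tables, so a player listed under a shortened name on one site and a full name on the other would come back without stats.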

combined_odds_stats.png

I really hope you enjoyed this post and ultimately learned something new. If you’d like to view the whole script or download it, you can do so here. If you have any questions about what I wrote here or just want to leave some feedback about this post, feel free to do so in the comment section below. If you’ve read some of my previous posts, you’ll know that I am learning Python, and through blogging, I’m improving my skills with these personal projects and exercises that are in my areas of interest. If you’d like to work on a project together or want to recommend ways to improve this script, please don’t hesitate to reach out. Thanks for reading.

Be on the lookout for Parts Two and Three of the Sports Betting With the Boys series, where I will show you how I set up a GroupMe bot using Python that sends the current day’s Points, Rebounds & Assists lines automatically, at a scheduled time, to our GroupMe group chat. Feel free to subscribe to my blog so that you will get my latest posts when I release them. Scroll down and enter your email to stay in the loop.
