Using Python, Pandas, and Plotly to Generate NBA Shot Charts

Sam Liebman
Towards Data Science
5 min read · Jul 26, 2018


Ever since I first stumbled upon an NBA shot chart visualization, while reading a Kirk Goldsberry CourtVision piece on Grantland.com, I have been fascinated by sports data visualizations, and intrigued as to how they can lead to enhanced decision making for teams and players. So, as I recently embarked on a mission to learn data science, I knew, naturally, that one of my first side projects would be an attempt to recreate an NBA player’s shot chart.

Gathering the Data

My first step was to gather the location data for each shot. As sports analytics have progressed to the point where nearly every team has a dedicated analytics department, the professional leagues have followed suit and beefed up their stats departments. The NBA provides a bunch of detailed statistics on its website, gathered using Second Spectrum cameras installed in every arena to track the real-time movement of players and the ball. Examples of data gathered from Second Spectrum include speed, distance travelled, paint touches, and more.

Unfortunately, the NBA does not allow API access on its stats page, so I had to dig deeper to find what I was looking for: detailed shot data for every player. On each player’s shot detail page, you can view a variety of shot plots and sift through a spreadsheet with basic details for every shot taken, with the added bonus of watching a video of each shot with a simple click of the play button.

Despite the abundance of data available on the webpage, the data was not in a structure that would allow me to properly analyze it, nor was the entirety of the data I sought accessible. However, using Google Chrome developer tools, specifically the XHR (XMLHttpRequest) filter in the Network tab, I was able to find the request that returns the shot data as a JSON file, which could easily be read and transformed using Pandas.

Chrome XHR tab and resulting json linked by url

While this was the exact data I was looking for, I yearned for a way to gather the data for any player, or a group of players, without having to repeat these tedious steps. After noticing that the 725-character URL string had only one value that needed to change in order to pull up someone else’s shot data, the player_id, I wrote a Python script to gather the data for any player given their nba.com ID. Using the Requests library, I was able to grab the JSON data without actually having to go on the website.

Code to grab json data from stats.nba.com and transform into Pandas DataFrame
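
As a rough illustration of the approach described above (not the original script), a minimal sketch might look like the following. It assumes you paste in the full request URL copied from the Chrome XHR tab, with the player ID swapped for a `{player_id}` placeholder. The function names, the browser-like User-Agent header, and the `resultSets` / `headers` / `rowSet` layout of the response are all assumptions, so check them against the JSON you actually receive.

```python
import pandas as pd
import requests


def build_url(url_template: str, player_id: int) -> str:
    """Insert a player ID into a URL template copied from the browser.

    The template is hypothetical: it is the long request URL with the
    player ID replaced by the literal text "{player_id}".
    """
    return url_template.replace("{player_id}", str(player_id))


def fetch_shot_json(url_template: str, player_id: int, timeout: float = 10.0) -> dict:
    """Download the shot data for one player and return the parsed JSON."""
    url = build_url(url_template, player_id)
    # Assumption: a browser-like User-Agent helps the request get accepted.
    headers = {"User-Agent": "Mozilla/5.0"}
    response = requests.get(url, headers=headers, timeout=timeout)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()


def shots_to_dataframe(payload: dict) -> pd.DataFrame:
    """Turn the JSON payload into a DataFrame with one row per shot.

    Assumed layout: payload["resultSets"][0] holds a list of column
    names under "headers" and a list of rows under "rowSet".
    """
    result = payload["resultSets"][0]
    return pd.DataFrame(result["rowSet"], columns=result["headers"])
```

With a real template in hand, `shots_to_dataframe(fetch_shot_json(template, some_player_id))` would produce the DataFrame for one player, and looping over a list of IDs would cover a group of players.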

Cleaning the Data

Pandas is an incredible Python library that, amongst its other features, allowed me to turn the JSON into a DataFrame and clean the data to display only the values and columns I wanted. Without much effort, Pandas transforms the ugly JSON structure into a clean, easy-to-read format. Pandas also provides some neat features, such as the .describe() method, which automatically computes and outputs summary statistics, such as the mean, median (reported as the 50% row), and standard deviation, for all numeric columns. This single line of code, as seen below, gave me immediate insights into the shot choices of New York Knicks’ star forward Kristaps Porzingis. For example, Porzingis’ average shot distance was 14.28 feet from the basket, and his typical shot came ever so slightly from the left side of the basket.

Sampling of data from Pandas DataFrame
.describe() method for Kristaps Porzingis’ DataFrame
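
To make the cleaning and summary step concrete, here is a small sketch on made-up rows (not Porzingis’ real data). The X_LOC and Y_LOC names follow this article; SHOT_DISTANCE, SHOT_MADE_FLAG, and the other columns are assumed names, so rename them to match whatever your DataFrame actually contains.

```python
import pandas as pd

# Made-up rows standing in for a real shot log; column names are assumptions.
shots = pd.DataFrame(
    {
        "PLAYER_NAME": ["Example Player"] * 4,
        "PERIOD": [1, 1, 2, 4],
        "SHOT_DISTANCE": [2, 15, 24, 7],   # feet from the basket
        "X_LOC": [-10, 120, -230, 35],     # horizontal court coordinate
        "Y_LOC": [15, 90, 60, 50],         # vertical court coordinate
        "SHOT_MADE_FLAG": [1, 0, 1, 0],    # 1 = make, 0 = miss
    }
)

# Keep only the columns needed for analysis and plotting.
keep = ["SHOT_DISTANCE", "X_LOC", "Y_LOC", "SHOT_MADE_FLAG"]
clean = shots[keep]

# One call summarizes every numeric column: count, mean, std, min,
# the 25% / 50% / 75% percentiles, and max.
summary = clean.describe()
print(summary)
```

Individual values can then be read straight from the summary table, for example `summary.loc["mean", "SHOT_DISTANCE"]` for the average shot distance.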

Plotting the Shots

Now that I had all of the data I required in the correct format, I was able to proceed to plotting every shot using Plotly, a charting and visualization library. First, I used Savvas Tjortjoglou’s amazing template to draw a court outline with exact dimensions. Next, I separated all of the shots into two categories — makes and misses — which I could then plot as different colors. Using the Plotly graph objects module, I was able to plot all of the makes and misses as a scatter plot over the court outline, setting the x and y values to X_LOC and Y_LOC respectively, which represent the coordinates for each shot attempt.

Code for creating scatter plot for all shots.

From there, all I needed to do was run my program, and, as if by magic, each shot was plotted as designed.

Conclusion

This was an extremely fun and fulfilling side project for me. It combined my passion for sports with a variety of skills I have learned in the month or so since I started programming. In the future, I want to make my shot plots interactive, and touch them up to make them more visually appealing and easier to read. I also aim to introduce variations, such as replacing the overlapping individual shot data points with hexbins representing frequency and efficiency for each shot location, or adding a comparison to the league average.
