Part 2 of Riot API: Surface-Level Stories

Justin M.
Towards Data Science
5 min read · Apr 24, 2017


Continuing from my previous post, I’ve expanded my data set considerably. After tinkering with and cleaning the data, I created a CSV of 98 matches, each containing data for 10 players (5 v 5).

Update: While my first attempts with the Riot API were about getting comfortable with APIs in general, the dataset I grabbed for the player Doublelift (his latest 10 games) was too small for any significant modeling. So the next step was to cast a wider net. The result was a data set of 98 matches with 10 players each. Instead of selecting a subset of data based on one specific player, I grabbed a collection of 25 games and, from those games, gathered an array of game IDs that I then used to pull the player data for each match (easier said than done). So let’s break it down.
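As a minimal sketch of that collection step: de-duplicate the gathered game IDs, fetch each match once, and pause between batches of requests. The `fetch_match` callable, the function name, and the pause length are all hypothetical stand-ins, not the actual Riot API client or its real limits; the batch size of 10 mirrors the timer I ended up using.

```python
import time
from typing import Callable, Dict, Iterable, List


def collect_matches(
    match_ids: Iterable[int],
    fetch_match: Callable[[int], Dict],
    pause_every: int = 10,
    pause_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict]:
    """Fetch each unique match once, pausing after every `pause_every` requests."""
    # dict.fromkeys removes duplicate IDs while keeping their original order.
    unique_ids = list(dict.fromkeys(match_ids))
    matches = []
    for count, match_id in enumerate(unique_ids, start=1):
        matches.append(fetch_match(match_id))
        # Pause between batches, but not after the final request.
        if count % pause_every == 0 and count < len(unique_ids):
            sleep(pause_seconds)
    return matches


# Stand-in fetcher for illustration; no network call is made here.
def fake_fetch(match_id: int) -> Dict:
    return {"matchId": match_id, "participants": [{} for _ in range(10)]}


pauses = []  # records requested pauses instead of actually sleeping
demo = collect_matches([1, 2, 2, 3], fake_fetch, pause_every=2, sleep=pauses.append)
```

Passing the fetcher and the sleep function in as arguments keeps the loop testable without touching the network or waiting on a real timer.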

980 rows by 61 columns

My goal was a dataset of 1,000 rows, but I settled for 980 (98 matches of 10 players each). I’m still getting comfortable with rate limits, and at one point I set a timer to pause after every 10 requests. I do have some reservations about splitting the match data into separate rows linked by the match ID, since my intention was to fit the features into a logistic regression model with a binary “winner” target. I didn’t want to leak data through the features, so I had to drop the columns that link rows together. That is why I felt features like “Team” had to be dropped.

# Drop 'Team' (it links rows within a match) along with the other columns I removed
df.drop(columns=['Team', 'combatPlayerScore', 'unrealKills', 'firstInhibitorKill', 'Player:'], inplace=True)
# Later on I took out the item features
feat = [c for c in df.columns if not c.startswith('item')]
df_no_items = df[feat]

I also converted the boolean columns to integers.

# Map booleans to integers (True -> 1, False -> 0), so winner == 1 means a win
bool_cols = ['winner', 'firstBloodAssist', 'firstBloodKill',
             'firstTowerAssist', 'firstTowerKill', 'firstInhibitorAssist']
for col in bool_cols:
    df[col] = df[col].map({True: 1, False: 0})

The next issue was dealing with the champion ID, an integer that I wanted to turn into dummy variables. This was my first fork in developing the data. Each number corresponds to a specific LoL champion, and each champion has a class and a subclass. For simplicity, I decided to replace each champion ID with its associated class from [Assassin, Tank, Fighter, Mage, Marksman, Support] and dummy those categories. I did this by mapping my main DataFrame against a separate static dictionary of champion IDs and their associated classes (this can be found in the static data of the API and won’t count towards your overall request limit). Now we can dummy these classes and move on to graphing the data.

With the classes converted into dummies, we can get to modeling. (I dropped the Assassin dummy to avoid the dummy variable trap.)
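The mapping and dummy steps can be sketched roughly as follows. The champion IDs in the dictionary are placeholders rather than real static-data values, and the column names `championId` and `championClass` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical lookup of champion ID -> class. The real dictionary would be
# built from the API's static data; these IDs are placeholders.
champion_class = {
    9001: "Mage", 9002: "Fighter", 9003: "Tank",
    9004: "Assassin", 9005: "Marksman", 9006: "Support",
}

# Toy rows for illustration only.
df = pd.DataFrame({
    "championId": [9001, 9002, 9003, 9004, 9005, 9006, 9001],
    "kills": [6, 5, 2, 8, 9, 1, 4],
})

# Replace each champion ID with its class.
df["championClass"] = df["championId"].map(champion_class)

# One dummy column per class, then drop Assassin as the baseline category
# so the remaining dummies are not perfectly collinear.
class_dummies = pd.get_dummies(df["championClass"], prefix="class", dtype=int)
class_dummies = class_dummies.drop(columns="class_Assassin")

df = pd.concat([df.drop(columns=["championId", "championClass"]), class_dummies], axis=1)
print(df.head())
```

Dropping a named column by hand, rather than using `drop_first=True`, makes it explicit that Assassin is the reference class the other coefficients are compared against.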

My first question was about the distribution of kills. Working in Python with seaborn, I made a simple histogram. It’s a rather healthy distribution that is positively skewed, with most players at the low end, which makes sense. I notice a few valleys, which suggest that players either don’t kill anyone or, when they do, get around 5–8 kills. But overall this data isn’t really telling a story. Let’s go deeper.
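Alongside the plot, the shape of the distribution can be checked numerically. This is a small sketch using made-up kill counts (not my actual data) and an assumed bin width of 3: it counts rows per bin the way a histogram would and computes the skewness, where a positive value indicates a long tail toward high kill counts.

```python
import pandas as pd

# Toy kill counts for illustration only; not the data from this post.
kills = pd.Series([0, 0, 1, 1, 2, 2, 3, 4, 5, 6, 7, 8, 12, 15], name="kills")

# Bin the kills into left-closed intervals: [0, 3), [3, 6), ...
bins = [0, 3, 6, 9, 12, 15, 18]
counts = pd.cut(kills, bins=bins, right=False).value_counts(sort=False)

# Positive skewness means most of the mass is on the left with a tail on the right.
skewness = kills.skew()

print(counts)
print(f"skewness: {skewness:.2f}")
```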

Jumping into Tableau, we can now build a narrative and see who is doing all the killing. From my observations, there is a healthy relationship between the support and tank classes and the heavy hitters [Assassins, Fighters, Mages, Marksmen]. You want your core DPS/carries getting the kills because kills yield more gold and experience, which should correlate with wins. In short, you don’t want your support classes leeching possible resources away from your core players. (However, we can see a few stubborn Tanks and Supports in the middle of the histogram, probably kill thieves.)
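I built this view in Tableau, but a similar breakdown can be approximated in pandas with a cross-tabulation of kill buckets by class. The rows, the bucket edges, and the column names `championClass` and `kills` are assumptions for illustration.

```python
import pandas as pd

# Toy rows for illustration only; not the data from this post.
df = pd.DataFrame({
    "championClass": ["Marksman", "Assassin", "Mage", "Fighter",
                      "Support", "Tank", "Support", "Marksman"],
    "kills": [9, 7, 6, 5, 1, 2, 6, 12],
})

# Bucket kills into low / middle / high ranges (left-closed intervals).
kill_bucket = pd.cut(
    df["kills"], bins=[0, 4, 8, 100], right=False, labels=["0-3", "4-7", "8+"]
)

# Rows are kill buckets, columns are classes, cells are player counts.
table = pd.crosstab(kill_bucket, df["championClass"])
print(table)
```

A Support or Tank showing up in the middle or upper bucket is the tabular version of the "kill thief" bars in the histogram.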

My next question was to see the average gold distributions.

An appealing (depending on who you ask) game feature of League of Legends is that you don’t get penalized for dying, but you do give your opponent an advantage when you die.

A quick observation shows that the big spenders are the Fighters, Marksmen and Assassins (who are usually carries and need to spend more to complete larger item sets). The big earners seem to be Marksmen and Fighters, as they occupy the far right side more frequently. I assume that a skill set focused on damage translates to larger gold yields. Supports and Tanks sit in a solid middle area, which suggests that players are filling their roles accordingly.
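The same comparison can be roughed out as a table of average gold per class. The numbers below are invented for illustration, and the column names `championClass`, `goldEarned` and `goldSpent` are assumptions about how the DataFrame is laid out.

```python
import pandas as pd

# Toy rows for illustration only; not the data from this post.
df = pd.DataFrame({
    "championClass": ["Marksman", "Marksman", "Support", "Support", "Mage", "Tank"],
    "goldEarned": [14000, 12000, 8000, 7000, 11000, 9000],
    "goldSpent": [13500, 11500, 7000, 6500, 10500, 8500],
})

# Average gold earned and spent per class, biggest earners first.
gold_by_class = (
    df.groupby("championClass")[["goldEarned", "goldSpent"]]
    .mean()
    .sort_values("goldEarned", ascending=False)
)
print(gold_by_class)
```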

One thing that sticks out to me is the frequency of the Mage class in both spending and earning, which suggests that the role is more flexible in terms of item purchases compared to the tank and support roles. Ideally you wouldn’t want supports buying high-end items because it would be pointless, and you wouldn’t want your core players wasting time and money building utility items like wards.

Overall I’ve made some surface-level models and haven’t dived nearly as deep into the data as I wanted to. Next week I’ll be exploring modeling capabilities and attempting some logistic predictions. So far it’s mostly been reinforcing concepts and diving just a little bit under the hood.

As for my Raspberry Pi 3 project, I’ve gotten the camera working. Now I just need to get OpenCV set up and I should be good to go! Neural networks, here we come!!!
