Part 1 of Riot API: Data Downpour

I’ve completed one month of my DSI (Data Science Immersive) at General Assembly and have added a few models (Logistic, Linear, KNN, and SVM) to my toolkit. But I’ve just scratched the surface, and I’m still not over the entire “black box” feeling. A bit of skepticism is an ideal trait in this industry, but one could get lost within the multiple levels of abstraction inside each model. For now, let’s get our hands dirty with some data.

This week at General Assembly we played around with web scraping, which was a great time to rest after two weeks of non-stop machine learning concepts. And by rest I mean more time to reinforce those concepts again and again.

But over this week I took time to learn my first API. After reading over the documentation and posting on the forums, I managed to build a solid DataFrame from the raw JSON response.

What’s League of Legends? What is a MOBA?

In short, you have two teams of five people, each embodying one of a wide selection of avatars/characters that fall under several roles.

Tank: Gets hit and soaks up damage

Utility: Stops and interrupts things.

Support: Usually a healer, but provides more utility.

Carry: The role that does all the damage. Everyone pools together to make sure this person gets as strong as possible.

You start on a standard three-lane map (Top, Middle, and Bottom).

Whoever destroys the enemy’s base first wins. Each franchise adds its own flavor to this setup, but overall the format is the same.

Solo mid or feed!

Again, I recommend reading the entire documentation and agreements before you do anything, but here are a few major caveats I found during my tinkering.

1. Assuming you’ve made and activated your account (via the LoL launcher), the first thing you should do is find your API key and keep it safe. You can request a new one if something happens, but it’s common sense not to reveal your key in a public Git project or a forum post. (Apparently this happens a lot.)
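One simple habit that helps here: read the key from an environment variable instead of hard-coding it, so it can never land in a committed file. A minimal sketch — `RIOT_API_KEY` is just a variable name I picked, not anything Riot mandates:

```python
import os

# Read the API key from the environment instead of hard-coding it,
# so it never ends up in a public Git repo or a forum post.
key = os.environ.get("RIOT_API_KEY", "")
if not key:
    print("Set RIOT_API_KEY before making any requests")
```

Set it once in your shell (`export RIOT_API_KEY=...`) and every script can pick it up without the key ever touching your repo.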

2. Keep in mind how many requests you make. Be smart with your request calls because there is a cap.

  • 100 calls per 1 second
  • 1,000 calls per 10 seconds
  • 60,000 calls per 10 minutes (600 seconds)
  • 360,000 calls per 1 hour (3600 seconds)

time.sleep(1) is a good workaround for Python users.
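A minimal sketch of what that looks like in practice: pause between calls so a loop of requests stays under the per-second cap. The `fake_calls` below stand in for real `requests.get` calls.

```python
import time

def throttled(calls, delay=1.0):
    """Run each zero-argument call in sequence, sleeping between
    them so we stay under the per-second rate cap."""
    results = []
    for call in calls:
        results.append(call())
        time.sleep(delay)  # crude but effective: ~1 call per `delay` seconds
    return results

# Stand-ins for real API calls
fake_calls = [lambda i=i: {"match": i} for i in range(3)]
print(throttled(fake_calls, delay=0.01))
```

It’s not clever, but it keeps a long scraping loop from tripping the 100-calls-per-second limit.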

3. Cache your data to cut down on request calls. This echoes point 2; I used this technique when I was scraping table data buried within layers and layers of pages. Working point by point rather than from top to bottom saves you so much time.
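Here’s a rough sketch of the disk-cache idea, assuming a JSON file as the store (`riot_cache.json` is just a name I picked): fetch once, write it down, and read from disk ever after.

```python
import json
import os

CACHE_FILE = "riot_cache.json"  # hypothetical cache file name

def cached_get(cache_key, fetch):
    """Return the cached JSON for cache_key if we already have it;
    otherwise call fetch() once, store the result, and return it."""
    cache = {}
    if os.path.exists(CACHE_FILE):
        with open(CACHE_FILE) as f:
            cache = json.load(f)
    if cache_key not in cache:
        cache[cache_key] = fetch()  # the only time we hit the API
        with open(CACHE_FILE, "w") as f:
            json.dump(cache, f)
    return cache[cache_key]
```

Now re-running a script while you debug the DataFrame-building part costs zero API calls.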

Now let’s dive into the data. I’m currently using Python 3.5.2; my libraries are pretty standard.

  • Pandas
  • Numpy
  • pprint
  • json
  • requests

My first request function calls for a specific player and takes a user-input string. Below is a general idea of what my base code looks like. I’ll be following the player Doublelift, just because he’s been playing a lot this season and plays mostly “carry” roles. I’ll spare you the full description; just think of this role as the main playmaker (usually), like a quarterback in football or a power forward in basketball.

import requests

# Global vars
key = "YOUR_API_KEY"  # keep this out of public code
summoner_id = "..."   # base summoner-by-name endpoint URL (truncated in the original)

def request(name):
    URL = "{}by-name/{}?api_key={}".format(summoner_id, name, key)
    response = requests.get(URL)
    print("Request Done")
    return response.json()

# One of the top LoL players right now is called Doublelift.
# The request yields his info:
# {
#     "profileIconId": 1467,
#     "name": "Doublelift",
#     "summonerLevel": 30,
#     "accountId": 32971449,
#     "id": 20132258,
#     "revisionDate": 1492316460000
# }

def main():
    player_match_df_list = {}  # will hold one DataFrame per match
    name = input('Name: ')
    raw = request(name)
    player_id = raw[name.lower()]['id']
    # Grabs 20132258 and puts it into the match_list function,
    # which takes the player id and grabs a JSON list of that player's matches.
    m_list = match_list(player_id)
    return m_list

# This yields a list of the last 10 games played
# (this varies between player and season)
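The `match_list` function referenced above isn’t shown, so here is a hedged sketch of what mine roughly does. The endpoint URL and the `gameId` field are my assumptions about the v3 matchlist-by-account route; check them against the current API docs for your region and key.

```python
import requests

key = "YOUR_API_KEY"  # same global key as above
# Assumed endpoint; region and API version may differ for your key
MATCHLIST_URL = "https://na1.api.riotgames.com/lol/match/v3/matchlists/by-account/"

def match_list(account_id, n=10):
    """Fetch the player's recent matches and return the first n match ids."""
    url = "{}{}?api_key={}".format(MATCHLIST_URL, account_id, key)
    raw = requests.get(url).json()
    return [m["gameId"] for m in raw.get("matches", [])[:n]]
```

Each id this returns can then be fed to the match-detail endpoint to pull the full stats for one game.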

You can now use the “id” to access lower levels of data.

PlayerId → MatchList → Match Detail

At this point I cached the data into a nested dictionary. Now I can make and break to my heart’s content. After playing with the raw JSON, I was able to make a clean DataFrame to work with.
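The dictionary-to-DataFrame step is nearly a one-liner with pandas. A small sketch with made-up match data — the match ids and column names here are my own toy values, not the API’s:

```python
import pandas as pd

# Hypothetical slice of the cached data: match id -> per-game stats
cached = {
    101: {"champion": "Lucian", "kills": 8, "deaths": 2, "assists": 11},
    102: {"champion": "Caitlyn", "kills": 4, "deaths": 5, "assists": 9},
}

# orient="index" turns the outer dict keys into rows
df = pd.DataFrame.from_dict(cached, orient="index")
df.index.name = "match_id"
print(df)
```

`orient="index"` is the key choice here: it treats each outer key as one row, which matches the one-dict-per-match shape of the cache.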

Into beautiful, clean DataFrames.

Let’s see how I did against the official match history by comparing K/D/A (Kills/Deaths/Assists).

Not bad, things seem to match up quite nicely. My Python script managed to grab up to 10 matches so far and stored each DataFrame in a dictionary.
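With each match stored as its own DataFrame in a dictionary, `pd.concat` can stack them into one frame for the modeling step. Again a sketch with toy data rather than real API output:

```python
import pandas as pd

# Hypothetical per-match frames keyed by match id
frames = {
    101: pd.DataFrame({"kills": [8], "deaths": [2], "assists": [11]}),
    102: pd.DataFrame({"kills": [4], "deaths": [5], "assists": [9]}),
}

# The dict keys become the outer level of a MultiIndex,
# so each row still knows which match it came from
all_games = pd.concat(frames, names=["match_id", "row"])
print(all_games)
```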

Now, with a working, reliable data set in hand, I can move on to exploring and modeling the data. This is where the real fun begins. Expect some seaborn graphs next week as we dive into the data.