Using Python to Refine Your Spotify Recommendations

How to use the Spotify API and spotipy library to filter Spotify's recommendations.

Photo by sgcdesignco on Unsplash

Music plays a large role in many people’s lives; it’s hard to imagine a day without it. Like many others, I mainly consume it by streaming on Spotify. If you have ever scrolled down to the end of one of your playlists, you have met Spotify’s recommendations. Once in a while they seem spot on; at other times I wonder whether the recommender system took a wrong turn somewhere. But maybe that’s just me and my diverse taste in music.

If you are satisfied with your recommendations, you can still use this article to learn how to use the Spotify API with spotipy and how to get through the authentication process.

Setup

Image by Author

The Spotify API enables you to get user data or recommendations, create or modify playlists, and much more. To use it, you need to go through an initial setup with the following steps:

  1. Log in or register as a Spotify Developer.
  2. Create an app.
  3. Go to the app dashboard, select Edit Settings, and set the dashboard address as the Redirect URI.
  4. View the Client ID and Client Secret. Store this information somewhere safe. Don’t copy and paste it into your notebook/script and publish it on GitHub.
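
One way to keep the credentials out of your notebook is to read them from environment variables. The sketch below assumes you have exported SPOTIPY_CLIENT_ID, SPOTIPY_CLIENT_SECRET and SPOTIPY_REDIRECT_URI (the names spotipy itself looks for by default); the helper load_spotify_credentials() is a hypothetical name used only for illustration.

```python
import os

# Environment variable names that spotipy also reads by default.
REQUIRED_VARS = (
    "SPOTIPY_CLIENT_ID",
    "SPOTIPY_CLIENT_SECRET",
    "SPOTIPY_REDIRECT_URI",
)


def load_spotify_credentials(env=os.environ):
    """Return the app credentials as a dict, or raise if any are missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "client_id": env["SPOTIPY_CLIENT_ID"],
        "client_secret": env["SPOTIPY_CLIENT_SECRET"],
        "redirect_uri": env["SPOTIPY_REDIRECT_URI"],
    }
```

With those variables set in your shell, load_spotify_credentials() returns a dict you can pass on to the authentication step, and nothing secret ends up in the code you publish.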

Authentication and using Spotipy

After this initial setup, you will have to generate an authentication token with the Python library spotipy.

This can be done by filling in a few values in the code below. You need to enter your username, plus the Client ID, Client Secret, and Redirect URI, which you find when opening your app dashboard.

You also need to set a scope of authorizations. For our purposes, the scope needs to contain at least "playlist-modify-public". If you want to explore your data or do other things, you need to add more scopes. The Spotify developer documentation lists all available scopes.

Initially, you may get a prompt asking you to paste the URI you were redirected to. You only need to do this once, as long as you don’t change scopes or usernames. After successfully getting the spotipy authentication token, you can use the API to start exploring and modifying your Spotify playlists.

If you want to do more than just get some new songs into your playlist once, I recommend writing a small function with the above steps to shorten the authentication process.
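
Such a function might look roughly like the sketch below. It assumes spotipy's util.prompt_for_user_token() helper, which matches the username and paste-the-URI flow described above (newer spotipy releases steer you toward SpotifyOAuth instead). The name get_spotify_client() is my own placeholder, not part of spotipy.

```python
def get_spotify_client(username, client_id, client_secret, redirect_uri,
                       scope="playlist-modify-public"):
    """Request a user token and return an authenticated spotipy client."""
    # Imported inside the function so the helper can be defined before
    # spotipy is needed; requires `pip install spotipy` to actually run.
    import spotipy
    import spotipy.util as util

    # Opens the browser on first use and may ask you to paste the
    # URI you were redirected to; the token is cached afterwards.
    token = util.prompt_for_user_token(
        username=username,
        scope=scope,
        client_id=client_id,
        client_secret=client_secret,
        redirect_uri=redirect_uri,
    )
    if not token:
        raise RuntimeError("Could not get a token for " + username)
    return spotipy.Spotify(auth=token)
```

Calling sp = get_spotify_client(...) once at the top of a script then gives you the client object used for every query that follows.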

Loading your playlist and getting recommendations

Photo by Sara Kurfeß on Unsplash

When you query any data from the API, it will return a pretty large and complex dictionary. These dictionaries also vary a little in structure depending on whether you query saved songs, playlists, recently played tracks, or artists.

To make life easier, I suggest figuring out the structure and then writing a function that extracts the relevant data from the dictionary and stores it in a DataFrame. The next code block shows the create_df_playlist() function, which returns a DataFrame from the results of an sp.playlist() query, and the append_audio_features() function, which appends Spotify’s audio features to all songs in a given DataFrame. These are the two functions we will use in our script.
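
The original create_df_playlist() and append_audio_features() are not reproduced here. As a rough illustration of the same idea, the sketch below uses two hypothetical stand-ins, playlist_to_rows() and add_audio_features(), which work on plain dictionaries. The selection of columns is my assumption, and only the first page of playlist tracks is handled.

```python
# Numeric audio features to copy over (an assumed selection).
FEATURE_KEYS = (
    "danceability", "energy", "loudness", "speechiness", "acousticness",
    "instrumentalness", "liveness", "valence", "tempo",
)


def playlist_to_rows(playlist):
    """Flatten the dict returned by sp.playlist() into one dict per track."""
    rows = []
    for item in playlist["tracks"]["items"]:
        track = item.get("track")
        if not track or not track.get("id"):
            continue  # skip entries without a Spotify id, e.g. local files
        rows.append({
            "track_id": track["id"],
            "track_name": track["name"],
            "artist": ", ".join(a["name"] for a in track["artists"]),
            "album": track["album"]["name"],
            "popularity": track["popularity"],
            "duration_ms": track["duration_ms"],
        })
    return rows


def add_audio_features(sp, rows, batch_size=100):
    """Attach audio features to each row, querying up to 100 ids at a time."""
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        features = sp.audio_features(tracks=[r["track_id"] for r in batch])
        for row, feats in zip(batch, features):
            if feats:  # the API returns None for tracks without features
                row.update({key: feats[key] for key in FEATURE_KEYS})
    return rows
```

Passing the resulting list to pd.DataFrame(rows) gives a DataFrame with one row per song, which is the shape the rest of the article works with.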

I wrote a couple of functions while working with the Spotify API which might be helpful if you want to explore it further. You can find all of my functions here. Please note that they are still subject to change. Any help or constructive feedback is also very welcome.

Now let’s get to the fun stuff.

In the code below, I query playlist data from the API and create a DataFrame from it. I then create a list of all tracks from the playlist to be used as "seed tracks" for Spotify’s recommendations.

As the API does not allow a long list of seed tracks, I divided them up into "packages" of five tracks. Each package is used to get 25 recommended tracks with audio features. The process is repeated until we have recommendations for every song.
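
The packaging step can be sketched as follows. It assumes sp is an authenticated spotipy client and uses sp.recommendations() with seed_tracks and limit. The helper names chunked() and collect_recommendations() are hypothetical, and dropping duplicate recommendations is my addition rather than something the article states.

```python
def chunked(items, size=5):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def collect_recommendations(sp, seed_track_ids, per_package=25):
    """Ask for `per_package` recommendations per package of five seed tracks."""
    recommended = {}
    for package in chunked(seed_track_ids, 5):
        response = sp.recommendations(seed_tracks=package, limit=per_package)
        for track in response["tracks"]:
            # Keep the first occurrence only, so duplicates across packages vanish.
            recommended.setdefault(track["id"], track)
    return list(recommended.values())
```

A playlist of 12 songs would therefore trigger three requests (five, five, and two seeds). The audio features for the returned tracks can be attached afterwards with a separate sp.audio_features() call.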

Now that we have a lot of recommendations, it is time to filter them a little more.

There are a couple of approaches that could be used for such a task. I opted for calculating similarity scores between the audio features and selecting the most similar songs based on these scores. Additionally, I also filtered the recommendations again by how similar they were to a "mean song" (the average of all audio features in the playlist DataFrame). I’m not going to discuss this here, as it’s just a small step from everything presented and only works well for playlists with songs of a similar sound/genre.

The similarity score is calculated through the function below. Note that I am only using the audio features to determine similarity. You could also include popularity or duration if you wanted.

I found that cosine similarity works pretty well; however, I encourage you to try out other methods. Maybe you will find an even better one.
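
For reference, a cosine similarity matrix between two sets of feature vectors can be computed with NumPy alone. This is a minimal sketch rather than the function used in the original post; cosine_similarity_matrix() is a hypothetical name, and both inputs are assumed to be 2-D arrays with the same feature columns in the same order.

```python
import numpy as np


def cosine_similarity_matrix(playlist_features, recommendation_features):
    """Return a (playlist songs x recommended songs) matrix of cosine scores."""
    a = np.asarray(playlist_features, dtype=float)
    b = np.asarray(recommendation_features, dtype=float)
    a_norms = np.linalg.norm(a, axis=1, keepdims=True)
    b_norms = np.linalg.norm(b, axis=1, keepdims=True)
    # Avoid dividing by zero: an all-zero vector gets similarity 0 everywhere.
    a_unit = a / np.where(a_norms == 0, 1.0, a_norms)
    b_unit = b / np.where(b_norms == 0, 1.0, b_norms)
    # The dot product of unit vectors is the cosine of the angle between them.
    return a_unit @ b_unit.T
```

Because tempo and loudness live on much larger scales than the features bounded between 0 and 1, you may want to rescale the columns before computing the scores; the article does not say whether this was done.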

The function above yields a similarity matrix that contains one row for each song in our original playlist and one column for each recommended song. I then used np.argmax() on each row to retrieve the index of the recommended song that is most similar to the respective song in the playlist. The recommendations DataFrame is filtered down to these indices. As the last step, I remove any recommended tracks that are already in the playlist and reset the index.
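
The selection step could look something like this sketch, which works on plain lists of track ids instead of a DataFrame. The name pick_most_similar() is hypothetical, and dropping repeated picks (when several playlist songs share the same best match) is my assumption.

```python
import numpy as np


def pick_most_similar(similarity, playlist_ids, recommendation_ids):
    """Return the id of the best-matching recommendation per playlist song."""
    # Column index of the highest score in every row (one row per playlist song).
    best_columns = np.argmax(np.asarray(similarity), axis=1)
    already_in_playlist = set(playlist_ids)
    picked = []
    for column in best_columns.tolist():
        track_id = recommendation_ids[column]
        # Skip songs the playlist already has and songs chosen by an earlier row.
        if track_id not in already_in_playlist and track_id not in picked:
            picked.append(track_id)
    return picked
```

The result contains at most one recommendation per playlist song, so a 30-song playlist yields no more than 30 candidates.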

We now have a DataFrame with recommended songs that can either be filtered more, sampled from, or just added to the playlist.

To add to a playlist, we simply use user_playlist_add_tracks() from spotipy. The function requires your username, the playlist URI, and a list of track IDs that should be added to the playlist.
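
A small wrapper around that call might look like the sketch below. It assumes sp is an authenticated spotipy client with the "playlist-modify-public" scope; add_tracks_to_playlist() is a hypothetical helper, and the batching is there because the Spotify API accepts at most 100 tracks per request.

```python
def add_tracks_to_playlist(sp, username, playlist_id, track_ids, batch_size=100):
    """Add tracks to a playlist in batches and return the number of requests."""
    requests_made = 0
    for start in range(0, len(track_ids), batch_size):
        batch = track_ids[start:start + batch_size]
        sp.user_playlist_add_tracks(username, playlist_id, batch)
        requests_made += 1
    return requests_made
```

For a handful of recommendations this boils down to a single call; the loop only matters once you add more than 100 songs at a time.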

And that’s it!

I encourage you to explore spotipy and the Spotify API more. You could find out which artists you listen to most, which songs are your all-time favorites, or build more playlists for yourself and others.

Enjoy listening to your playlist, whilst coding or otherwise!

-Merlin

