Become a Lyrical Genius

Scrape your Spotify playlist lyrics using the Spotify and Genius APIs.

Ravi Malde
Towards Data Science


Photo Credits: Malte Wingen (https://unsplash.com/photos/PDX_a_82obo)

Unfortunately I can’t rap or write songs, so I can’t teach you how to become that sort of lyrical genius. What I am good for is showing you how to obtain song lyrics from your favourite Spotify playlists using the Spotify and Genius APIs, along with a little bit of help from some web scraping. Many alternatives exist to the process I’ll be outlining here; however, this was one of the few methods I found to be free and reliable. Some sections of code were inspired by an article written by Will Soares over at DEV Community, so please head over there and show it some appreciation if you find this useful.

I created a Python class, GetLyrics, which is ready for you to try out yourself. Just plug in your personal credentials for the Spotify and Genius APIs, along with the Spotify user and playlist IDs (explanations of how to obtain all of these are given in the next section of this article), and you’re good to go.

Once the class is initialised, we can get the entire playlist’s lyrics returned in a list, song_lyrics, with just two lines of code. Nice.

Obtaining Our Class Arguments

The first thing we’ll need to do is collect all of our inputs (the GetLyrics class arguments). There are 5 things we need in order to initialise the class:

  1. Spotify Client ID
  2. Spotify Client Secret
  3. Spotify User ID
  4. Spotify Playlist ID
  5. Genius Authorisation Key

Spotify Client Credentials

If you haven’t already registered, you’ll need to set up a developer account with Spotify. Assuming you have a personal Spotify account, all you need to do is navigate to the Spotify developer dashboard and log in with your Spotify account details.

Once logged into the dashboard you will then need to create an application. Even if you’re not actually creating an application, it’s required in order to get your API credentials. Give the application an exciting name and description and tick all of the boxes that apply for the “What are you building?” question.

The application should now live on your dashboard so let’s navigate to its page and take note of both the Client ID and Client Secret. These are your API credentials that will be needed when we make requests.

Spotify User & Playlist IDs

We still need a couple more Spotify-related things. To tell the Spotify API which playlist we are interested in, we need one Spotify URI for the user that created the playlist and one for the playlist itself. To get them, navigate to the user’s main page and to the playlist, click on the three dots and then click ‘Share’; from there you can copy the Spotify URI to your clipboard.
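A copied URI is a colon-separated string such as spotify:playlist:<id>, with the ID as its final segment. If you want the bare ID rather than the full URI, a small helper along these lines could extract it. The function name and the example URIs are illustrative placeholders, not part of the original class.

```python
def spotify_id_from_uri(uri):
    """Return the trailing ID from a Spotify URI such as 'spotify:playlist:<id>'."""
    parts = uri.strip().split(":")
    if len(parts) < 3 or parts[0] != "spotify":
        raise ValueError(f"Not a Spotify URI: {uri!r}")
    # The ID is always the last colon-separated segment
    return parts[-1]


# Placeholder URIs, for illustration only
playlist_id = spotify_id_from_uri("spotify:playlist:37i9dQZF1DXcBWIGoYBM5M")
user_id = spotify_id_from_uri("spotify:user:example_user")
```

Taking the last segment also copes with the older spotify:user:<name>:playlist:<id> form of playlist URI.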

Genius Authorisation Key

We’ll also need to create an account with Genius by clicking on “Authorise with Genius” on the Genius API documentation page. Once logged in, there should be an authorisation code on the right-hand side of the page.

Breaking Down the Class

Now that we’ve got all of our inputs, and hopefully you’ve been able to try out the GetLyrics class yourself, I’m going to provide a breakdown of each of the methods within the class so that you can use this code with confidence, and modify or improve it to suit your particular use case.

Initialising the Class Attributes

As with most classes in Python, we start with a constructor method that initialises our class attributes so that they can be used in the methods that follow.
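A minimal sketch of such a constructor follows, assuming parameter names that mirror the five inputs listed earlier. The names are illustrative and not necessarily those of the original class.

```python
class GetLyrics:
    """Holds the credentials and IDs needed by the lyric-fetching methods."""

    def __init__(self, spotify_client_id, spotify_client_secret,
                 user_id, playlist_id, genius_key):
        # Store every input on the instance so later methods can read it via self
        self.spotify_client_id = spotify_client_id
        self.spotify_client_secret = spotify_client_secret
        self.user_id = user_id
        self.playlist_id = playlist_id
        self.genius_key = genius_key


# Placeholder values only; substitute your own credentials and IDs
songs = GetLyrics("my-client-id", "my-client-secret",
                  "example_user", "example_playlist", "my-genius-key")
```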

Get Playlist Information

This method connects to the Spotify API using the Spotipy library and returns a JSON object that contains heaps of information on the Spotify playlist we are interested in.
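One way this step could look with Spotipy's client-credentials flow is sketched below. The function name and the optional client parameter are assumptions added for illustration (the latter so the function can be exercised without network access). The user_playlist call takes both the user and playlist IDs; recent Spotipy releases favour playlist(playlist_id) instead.

```python
def get_playlist_info(user_id, playlist_id, client_id, client_secret, client=None):
    """Return the playlist's metadata as decoded JSON (a dict)."""
    if client is None:
        # Third-party dependency (pip install spotipy), imported lazily so the
        # function can also be called with a pre-built or stand-in client
        import spotipy
        from spotipy.oauth2 import SpotifyClientCredentials

        auth = SpotifyClientCredentials(client_id=client_id,
                                        client_secret=client_secret)
        client = spotipy.Spotify(client_credentials_manager=auth)
    # Older Spotipy call that takes both IDs; newer releases prefer
    # client.playlist(playlist_id)
    return client.user_playlist(user_id, playlist_id)
```

Calling get_playlist_info with your own four values (and no client) performs the real request.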

Get Track Names and Artists

In these two methods we are iterating through the JSON object to find the name and artist of each song in the playlist. These are then stored and returned in two lists to be used in later methods.
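As a rough sketch, the two lookups could be written as below, assuming the playlist object follows the Spotify Web API layout in which tracks sit under tracks → items → track. The function names and the sample data are illustrative. Only the first artist of each track is used here, and a single response carries at most one page of tracks.

```python
def get_track_names(playlist):
    """Return the name of every track in the playlist response."""
    return [item["track"]["name"] for item in playlist["tracks"]["items"]]


def get_track_artists(playlist):
    """Return the first listed artist of every track in the playlist response."""
    return [item["track"]["artists"][0]["name"]
            for item in playlist["tracks"]["items"]]


# Tiny hand-made stand-in for a real playlist response
sample_playlist = {
    "tracks": {
        "items": [
            {"track": {"name": "Song A", "artists": [{"name": "Artist 1"}]}},
            {"track": {"name": "Song B",
                       "artists": [{"name": "Artist 2"}, {"name": "Guest"}]}},
        ]
    }
}

track_names = get_track_names(sample_playlist)
track_artists = get_track_artists(sample_playlist)
```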

Connecting to the Genius API

This method uses the Requests library to connect with the Genius API using our authorisation key. It then checks to see if there are any matches in the API to the given track name and artist. It returns a response object that contains all of the information relating to these matches.
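A minimal sketch of that request is shown below, assuming the Genius search endpoint at https://api.genius.com/search and a Bearer token header. The function name and the optional http_get parameter are illustrative additions (the latter lets the function be tried without a network connection).

```python
GENIUS_SEARCH_URL = "https://api.genius.com/search"


def request_song_info(track_name, track_artist, genius_key, http_get=None):
    """Search Genius for a track and return the raw response object."""
    if http_get is None:
        # Third-party dependency (pip install requests), imported lazily
        import requests
        http_get = requests.get
    headers = {"Authorization": f"Bearer {genius_key}"}
    # Search on title and artist together to narrow down the hits
    params = {"q": f"{track_name} {track_artist}"}
    return http_get(GENIUS_SEARCH_URL, params=params, headers=headers)
```

With Requests installed, request_song_info("Song A", "Artist 1", genius_key).json() gives the decoded search results.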

Sifting Through the Matches and Getting a Song URL

Here we decode the JSON in the response returned by the previous method and then check which ‘hit’ is an exact match with the artist name. If a match is found, the song exists in the Genius API and information on that track is now stored in the remote_song_info object. The Genius API does not contain each song’s lyrics; however, it does contain a URL that routes to a webpage with the lyrics. We then parse the remote_song_info object to find each song’s URL.
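The matching step might look something like the sketch below, which assumes the search results sit under response → hits and that each hit exposes result.primary_artist.name and result.url. The function names are illustrative, and the case-insensitive comparison is a choice made here rather than something the article specifies.

```python
def check_hits(search_json, track_artist):
    """Return the first hit whose primary artist matches, or None."""
    remote_song_info = None
    for hit in search_json["response"]["hits"]:
        hit_artist = hit["result"]["primary_artist"]["name"]
        # Compare case-insensitively so 'ARTIST' and 'Artist' still match
        if hit_artist.lower() == track_artist.lower():
            remote_song_info = hit
            break
    return remote_song_info


def get_url(remote_song_info):
    """Return the lyrics page URL for a matched hit, or None if there was no match."""
    if remote_song_info is None:
        return None
    return remote_song_info["result"]["url"]


# Hand-made stand-in for response.json() from the Genius search
sample_search = {
    "response": {
        "hits": [
            {"result": {"primary_artist": {"name": "Someone Else"},
                        "url": "https://genius.com/other"}},
            {"result": {"primary_artist": {"name": "Artist 1"},
                        "url": "https://genius.com/artist-1-song-a-lyrics"}},
        ]
    }
}

song_url = get_url(check_hits(sample_search, "artist 1"))
```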

Web Scraping the Lyrics

Finally, we can get our hands on some lyrics! This method uses the Requests library again to make a request to the song URL obtained in the previous method. We then use BeautifulSoup to parse the HTML. The reason we call the .find() function twice is that I found the Genius webpages were not always structured the same way. The majority of the time the lyrics were contained in a div element with class="lyrics"; however, on occasion it changed to class="Lyrics__Container...". I included a final elif condition so that if both lyric objects are None, the lyrics object is set to None. This happens very rarely, and it’s due to a 404 error when making a request to a song’s URL, possibly because the URL no longer exists but has not yet been removed from the API.
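The two-layout fallback can be sketched roughly as follows. The function names, the regular expression used to match the Lyrics__Container... class prefix, the explicit 404 check and the optional http_get parameter are all assumptions made for illustration. Genius may also have changed its page structure since, so treat the selectors as examples rather than a guarantee.

```python
import re


def extract_lyrics(html):
    """Return the lyrics text from a Genius page, or None if neither layout is found."""
    # Third-party dependency (pip install beautifulsoup4), imported lazily
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    lyrics1 = soup.find("div", class_="lyrics")
    lyrics2 = soup.find("div", class_=re.compile(r"^Lyrics__Container"))
    if lyrics1 is not None:
        return lyrics1.get_text().strip()
    elif lyrics2 is not None:
        # Line breaks are <br> tags in this layout, so join the text with newlines
        return lyrics2.get_text(separator="\n").strip()
    return None


def scrape_lyrics(url, http_get=None):
    """Download a Genius song page and return its lyrics, or None on a 404."""
    if http_get is None:
        import requests  # third-party: pip install requests
        http_get = requests.get
    page = http_get(url)
    if page.status_code == 404:
        return None
    return extract_lyrics(page.text)
```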

Wrapping Everything Together

Lastly, this method ties it all together by executing each method in sequence, meaning we can get our lyrics with just two lines of code. It also has a series of print statements to make the method verbose, so you can keep track of its progress as it executes.
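The overall flow can be sketched as a single loop with progress messages. To keep the example self-contained, the URL lookup and the scraping step are passed in as functions. The names find_url and scrape are hypothetical stand-ins for the methods described above, and recording None for tracks without lyrics is a choice made here so the result stays aligned with the track list.

```python
def get_lyrics(track_names, track_artists, find_url, scrape):
    """Return a list of lyrics (or None) with one entry per track."""
    song_lyrics = []
    total = len(track_names)
    for i, (name, artist) in enumerate(zip(track_names, track_artists), start=1):
        print(f"Working on track {i} of {total}: {name} by {artist}")
        url = find_url(name, artist)
        if url is None:
            print("  No match found on Genius, skipping.")
            song_lyrics.append(None)
            continue
        lyrics = scrape(url)
        if lyrics is None:
            print("  Lyrics not found on the page.")
        else:
            print("  Lyrics retrieved.")
        song_lyrics.append(lyrics)
    return song_lyrics


# Stand-in lookups so the loop can be run without any network access
fake_urls = {("Song A", "Artist 1"): "https://genius.com/a"}
song_lyrics = get_lyrics(
    ["Song A", "Song B"],
    ["Artist 1", "Artist 2"],
    find_url=lambda name, artist: fake_urls.get((name, artist)),
    scrape=lambda url: "la la la",
)
```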

I hope you have found this article useful. Feel free to use this code in your own work and if you have any suggestions for improvements I’d love to hear them down below. Thank you for reading!
