
He Dies, She Sucks

An Analysis of Gender and Language in Songs on Spotify

Photo by Eric Nopanen on Unsplash

He dies. She fine. He runnin’. She suckin’.

Men and women are portrayed very differently in entertainment. The way they speak, how they are described, and how they interact with each other are all informed by gender.

A striking example of this has to do with the big screen. Back in 2017, Julia Silge, Russell Goldenberg, Amber Thomas, and Hanah Anderson broke down stage directions by gender for over 2,000 Hollywood scripts. They found that women are far more likely to be instructed to snuggle, giggle, and squeal, whereas men are far more likely to be told to strap, gallop, and howl.

Such discrepancies, in stage directions and beyond, shape views both consciously and unconsciously, often reinforcing stereotypes and existing power structures. Given the significance of this implication, I decided to conduct a similar analysis looking at song lyrics.


To do so, I used the Spotify top 200 list, which contains the top 200 songs by number of streams each day going back to 2017. I obtained lyrics for over 90 percent of these songs using the Genius API, and I manually obtained the gender for over 95 percent of the artists (around 3 percent were bands with both men and women).

The first aspect of songs I wanted to explore was representation in terms of the relative use of "she" and "he". I was initially surprised to see that the word "she" is used considerably more often than "he."

Image by Author

I had expected parity, or even a disproportionate use of the word "he," given the oft-cited issue of male over-representation in media and entertainment.

However, the finding stands to reason in the context of songs, which often center on romance and lust directed at the opposite gender. We can actually see this in the data quite clearly: men are much more likely to use the word "she" (80% of the time) whereas women use it less than half the time (47%).
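To make the percentages above concrete, here is a minimal sketch of one plausible way to compute them: for each artist gender, the share of "she" among all "he" and "she" tokens. The toy `songs` list, the `pronoun_share` helper, and this exact definition of the share are assumptions for illustration, not necessarily how the figures in this article were produced.

```python
import re
from collections import Counter

# Hypothetical toy data: (artist_gender, lyrics) pairs, not the real dataset.
songs = [
    ("male", "She said she loves me, he said nothing"),
    ("male", "She knows, she goes"),
    ("female", "He left and she stayed, he never called"),
]

def pronoun_share(songs):
    """Per artist gender, return the share of 'she' among all 'he'/'she' tokens."""
    counts = {}
    for gender, lyrics in songs:
        # Lowercase and split into simple word tokens.
        tokens = re.findall(r"[a-z']+", lyrics.lower())
        per_gender = counts.setdefault(gender, Counter())
        per_gender.update(t for t in tokens if t in ("he", "she"))
    return {
        gender: c["she"] / (c["he"] + c["she"])
        for gender, c in counts.items()
        if c["he"] + c["she"] > 0  # skip groups with no pronouns at all
    }

shares = pronoun_share(songs)
for gender, share in shares.items():
    print(f"{gender}: {share:.0%} of he/she pronouns are 'she'")
```

Because the aggregate count pools all songs together, a group that contributes most of the songs will dominate the overall "she" versus "he" balance.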

This appears as a far higher prevalence of the word "she" because men make up a vast majority of artists on the top 200 list. Since 2017, only 16 percent of artists on the list have been women. This is where the over-representation of men in media rears its head.

Image by Author

However, my interest was not to shed light on the over-representation of men in popular music. Instead, I wanted to see how men and women are written about in the songs themselves. To do so, I focused on the words that come after "he" and "she."

Doing this in absolute terms without removing so-called "stopwords" is rather uninformative. Words like got, a, and said dominate the list for both "he" and "she".

After removing these uninformative words, a more interesting picture appears. Words like told, left, and dies were most common for "he," making up ~3.5%, 2%, and 2% of words associated with "he" respectively. Love, bad, and told top the list for "she" (4%, 3%, 2.5%).
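The counting step described above can be sketched roughly as follows. The `following_words` helper and the tiny `STOPWORDS` set are hypothetical; the article does not specify which stopword list or tokenizer was used, so treat this as an illustration under those assumptions.

```python
import re
from collections import Counter

# A tiny illustrative stopword list (an assumption, not the list used in the analysis).
STOPWORDS = {"a", "the", "got", "said", "is", "was", "and", "to", "in"}

def following_words(lyrics, pronoun, stopwords=STOPWORDS):
    """Count the word immediately after each occurrence of `pronoun`, skipping stopwords."""
    # Keep apostrophes and asterisks so tokens like runnin' or f*ck stay whole.
    tokens = re.findall(r"[a-z'’*]+", lyrics.lower())
    pairs = zip(tokens, tokens[1:])  # consecutive (word, next_word) pairs
    return Counter(
        nxt for cur, nxt in pairs if cur == pronoun and nxt not in stopwords
    )

lyrics = "He told me he left. She said she love it. He got a car, he told lies."
he_counts = following_words(lyrics, "he")
she_counts = following_words(lyrics, "she")

print(he_counts.most_common())
print(she_counts.most_common())
```

This simple version ignores sentence and line boundaries, so a pronoun at the end of one line would be paired with the first word of the next. A more careful pass would split the lyrics into lines first.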

This is interesting on its own. However, a more insightful method is the one pioneered by Silge: to look at relative likelihood. For example, if the word dies appears right after "he" 3 times and right after "she" 1 time in a song, we would say "he dies" is 3 times more prevalent than "she dies."
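As a rough sketch of this idea, the ratio can be computed per word and put on a log2 scale, so that a score of 1 means twice as likely after "he" and a score of -1 means twice as likely after "she". The `relative_likelihood` function, its optional `smoothing` parameter, and the toy counts are assumptions for illustration. The example mirrors the raw-count ratio described above rather than any particular normalization.

```python
import math
from collections import Counter

def relative_likelihood(he_counts, she_counts, smoothing=0):
    """Map each word to log2(count after 'he' / count after 'she').

    Positive scores lean towards "he", negative scores towards "she".
    `smoothing` is added to both counts; with smoothing=0, words seen
    after only one pronoun are skipped to avoid dividing by zero.
    """
    scores = {}
    for word in set(he_counts) | set(she_counts):
        he = he_counts[word] + smoothing   # Counter returns 0 for unseen words
        she = she_counts[word] + smoothing
        if he > 0 and she > 0:
            scores[word] = math.log2(he / she)
    return scores

# Hypothetical counts: "dies" follows "he" 3 times and "she" once.
he_counts = Counter({"dies": 3, "told": 4})
she_counts = Counter({"dies": 1, "told": 4, "love": 5})

scores = relative_likelihood(he_counts, she_counts)
for word, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{word}: {2 ** abs(score):.1f}x more likely after {'he' if score >= 0 else 'she'}")
```

The log scale keeps the two directions symmetric: "3 times more likely after he" and "3 times more likely after she" sit at the same distance from zero.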

Doing this demonstrates where I got the inspiration for the title:

Image by Author

Compared to women, men are over 32 times more likely to die, be talkin’, or be dead. They are also much more likely to sleep, run, and make. Women, on the other hand, are around 16 times more likely to be bad, to f*ck, and to love. They are also more likely to suck/be suckin’, ride, and call.

This is true for other gendered pronouns as well. For example, words before "him" are generally more violent (shoot, kill, pop) whereas words before "her" are generally sexual or related to money (bought, f*cked, touch). There are some notable exceptions. For example, "praise him" is around 32 times more likely to appear in a song than "praise her", and "beat her" is over 10 times more likely to appear than "beat him".

Image by Author

It is a similar story for "his" and "her." Compared to the words associated with other male pronouns, the words following "his" are less violent and more possessive (money, time, boo). The words following "her," on the other hand, are once again hyper-sexual, with words like p*ssy, knees, tongue, and even toes topping the list.

Image by Author

Such findings hardly require analysis. As most would expect, women are sexualized and men are either characterized as more violent or are more likely to have violence inflicted upon them.

An interesting wrinkle in the data is how these relative likelihoods differ for male and female artists.

Unsurprisingly, given the prevalence of men on the top 200 list, focusing on them mostly reflects the aggregate findings. That said, it is worth pointing out that women become even more sexualized and men become even more subject to violence when only looking at songs by male artists.

Image by Author

This is starkly different from female artists. Words like die and dead disappear for "he," and words like suck and ridin’ disappear for "she." In their place are words like sweet and sleep for men and bad and swag for women. In other words, women are less likely to sexualize themselves and less likely to associate men with violence.

Image by Author

Ultimately, my findings are nuanced. It is true that in popular music women are associated with physicality and sex and men are associated with violence and possessions. However, there is heterogeneity across artist gender. The top-line findings are driven by men, who currently make up a vast majority of artists on the top 200 list. Women, on the other hand, sing very differently about men and fellow women: less violently and less sexually.

Indeed, all of this leads to a conclusion similar to the one drawn by Silge: if and when female artists begin to make up a more proportional share of artists on the top 200 list, we may see fewer songs where men die and women suck, and more songs where men give and women drip.

