Little boxes on the radio: Does modern pop music all sound just the same?

William Butler
Towards Data Science
Oct 16, 2018

--

An analysis of Billboard Hot 100 songs from 1965 to 2015

I was interested in performing some basic musical analysis for my next project, so I snooped around for promising datasets on which to build. I ended up settling on this dataset of the Billboard year-end Hot 100 songs, which gave a nice snapshot of popular music over a 50-year span (at least, music that was played on the radio in a given year) as well as a promising basis for linguistic analysis. Although I originally intended to just explore the data (especially lyrics) to see how music has changed, I found a few pieces of evidence that point toward songs getting more similar over time.

After inspecting some of the songs, I found that not all of the lyrics were wholly accurate (or present), so I used geniusR to scrape lyrics from genius.com and fill in as many holes as possible. In addition, I took advantage of the truly excellent spotifyr package to grab some basic musical parameters about each track (tempo, key, and so forth) from Spotify’s massive database.

Song sentiment has become more ‘negative’ over time

For analysis of lyrical content, I chose to use a line-by-line measure of sentiment via the syuzhet package. I initially analyzed songs on a word-by-word basis, but because most words are neutral it didn’t give nearly as much insight as looking at lyrics within an entire line (summed across words). From there, it was relatively straightforward to examine song sentiment as a function of year:
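The line-level scoring described above can be sketched in a few lines of Python. This is only an illustration, not the analysis itself: the actual work used the syuzhet package in R, the tiny lexicon below is made up for the example, and averaging line scores into a single song score is an assumption about how lines were aggregated.

```python
import re

# Toy lexicon for illustration only (hypothetical values);
# syuzhet ships far larger sentiment dictionaries.
TOY_LEXICON = {
    "love": 1.0, "happy": 0.75, "fine": 0.5,
    "damn": -0.75, "cry": -0.5, "lonely": -0.75,
}

def line_sentiment(line, lexicon=TOY_LEXICON):
    """Sum the lexicon scores of the words in one lyric line.

    Words missing from the lexicon count as neutral (0).
    """
    words = re.findall(r"[a-z']+", line.lower())
    return sum(lexicon.get(word, 0.0) for word in words)

def song_sentiment(lyrics, lexicon=TOY_LEXICON):
    """Mean per-line sentiment over the non-empty lines of a song (assumed aggregation)."""
    lines = [ln for ln in lyrics.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    return sum(line_sentiment(ln, lexicon) for ln in lines) / len(lines)
```

With this toy lexicon, a line such as "he looked damn fine" sums to a negative score because the negative weight on "damn" outweighs the positive weight on "fine", regardless of context.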

Though overall sentiment remained greater than zero, there has been a clear (nearly perfectly linear) decrease in sentiment over time (linear model, p<0.0001). Digging a little deeper, I think we can attribute this decrease to multiple factors. First and foremost, curse words (from ‘ass’ and ‘damn’ on up) are all scored as negative, independent of their context (the sentence “he looked damn fine” is negative overall, even though contextually it could be interpreted as positive). It’s an understatement to say that hip hop has grown in popularity from 1965 to today, so I’d say it’s reasonable to attribute at least some of the increase in ‘negativity’ to the increased number of hip hop and rap songs in the Hot 100. In addition, though, I would argue that songs in other genres have also become somewhat more negative/less positive, independent of the vulgarity of the lyrics. For example, compare these four ‘positive’ songs from decades past

with these more modern, overall ‘negative’ songs (in both cases, more positive sentiment → redder text and more negative sentiment → bluer text):

I’d argue there are simply fewer upbeat, happy rock, pop, and soul songs now than there were in the past (which, thankfully, also means fewer song lyrics like “Yummy yummy yummy/I’ve got love in my tummy”). You’re also a lot less likely to hear a folk poet like Cat Stevens or a funk/dance/Motown band like the Jackson Five on pop radio today. It’s not that artists in those genres no longer exist; it may just be that this type of music is no longer popular or profitable enough to get a significant proportion of mainstream radio time.

Songs have become wordier over time

Further analyzing lyrics, there’s another obvious effect: modern songs have far more words than their 1960s counterparts. To control for the fact that songs have gotten longer over time (perhaps partially due to advances in technology), I decided that the best measure to visualize this effect is words per minute (WPM): the number of words in a song’s lyrics divided by its duration in minutes.
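As a rough sketch of the WPM calculation, the function below assumes the lyrics are available as a plain string and the track length in milliseconds (the unit Spotify uses in its `duration_ms` field). Splitting on whitespace is an assumption about how words were counted; the function name is hypothetical.

```python
import re

def words_per_minute(lyrics, duration_ms):
    """Word count divided by track length in minutes."""
    if duration_ms <= 0:
        raise ValueError("duration_ms must be positive")
    # Count whitespace-separated tokens as words (assumed tokenization).
    n_words = len(re.findall(r"\S+", lyrics))
    return n_words / (duration_ms / 60_000)
```

For a sense of scale, 300 words in a 90-second track works out to 200 WPM.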

Words per minute (WPM) for each song. Red line represents the median WPM for each year. The most notable outlier from the oldies, ‘I Like it Like That’ by the Dave Clark Five (1965), squeezes nearly 300 words into a 90-second song (annotated at far left). Even so, it’s still in a completely different league from rap songs like Ice Cube’s ‘Bop Gun (One Nation)’.

An increase in WPM values is readily apparent, starting around 1990 and continuing to the present. Again, I don’t find this particularly surprising considering the timeline of hip hop’s popularity (for what it’s worth, ‘Ice Ice Baby’ (1990) was the first rap single to hit #1 on the Billboard charts). As denoted in the figure, the highest WPM values in the dataset are all hip hop tracks, and they were all released after 1990. Visualizing WPM as histograms in decade-long chunks (left) also makes this clear: starting in the mid-1990s and continuing on through to the present, a significant proportion of songs have WPM values that are much higher than those observed in the previous decades.

Additionally, the latest decades have seen smaller proportions of lyrically sparse songs reach the Billboard Hot 100. In contrast to 1965–1995, when roughly 1/4 to 1/3 of the songs had fewer than 50 WPM, very few of the Billboard songs from 1995–2015 are so vocally sparse. As annotated in the earlier figure, the three tracks with the lowest WPM values were all released prior to 1985, and featured essentially no words at all (with the exception of ‘Soul Finger’, which featured the immortal lyrics: “Soul finger! Soul finger! Soul finger” shouted by neighborhood children).

Tempo over time

Moving on from lyrical content and sentiment analysis, I next delved into the selection of musical features that Spotify computes for each track. To start with, I examined tempo, measured by beats per minute (BPM).

Though we definitely don’t see changes as dramatic as those in lyrical sentiment or WPM, there are some ways in which song tempo has evolved over time.

Visualizing BPM with histograms across each decade (left) makes a few of these trends clearer. Most prominently, there appear to be three relatively distinct clusters of tempos in modern songs: ~90 BPM, ~120 BPM, and ~170 BPM. This is somewhat different from songs pre-1995, which formed one large, relatively continuous distribution around 120 BPM. Other research suggests that most hip-hop songs are around 90 BPM, while 120–130 BPM is the most ‘preferred tempo’. The chunk of songs from 150–200 BPM, on the other hand, may simply be songs that were mis-timed as double their actual BPM by Spotify’s algorithm (‘Hello’ by Adele, ‘Love Me Like You Do’ by Ellie Goulding, ‘Centuries’ by Fall Out Boy are all mistakenly identified as >150 BPM). Therefore, it may be more accurate to say that modern pop songs are more clearly split into two relatively distinct groups (slow-ish and “ideal” tempo) relative to older songs.
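If the fast cluster really is a double-time artifact, one crude way to check how the distribution would look without it is to halve any tempo above a cutoff. The sketch below is purely illustrative and was not part of the analysis above: the 150 BPM cutoff is an assumption taken from the examples in the text, the function name is hypothetical, and the rule would also wrongly halve songs that genuinely are that fast.

```python
def fold_tempo(bpm, ceiling=150.0):
    """Halve tempos above `ceiling`, treating them as possible double-time detections.

    Crude heuristic: genuinely fast songs above the ceiling are halved too.
    """
    return bpm / 2 if bpm > ceiling else bpm

def fold_tempos(bpms, ceiling=150.0):
    """Apply fold_tempo to every tempo in a list."""
    return [fold_tempo(b, ceiling) for b in bpms]
```

Applied to a list of tempos, this folds the ~170 BPM cluster down to ~85 BPM, where it would merge with the slow-ish group.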

Modern songs are more homogeneous (on some measures)

Finally, I took a look at other musical features to see whether the common complaint about pop music is true and songs really are more similar to one another now. I found three musical features that appear to corroborate this theory: song duration, musical sentiment, and energy.

(Left) Billboard Hot 100 songs plotted by their duration (top), mean overall linguistic sentiment (middle), and overall energy (bottom). (Right) The standard deviation of each of these three measures has steadily decreased over time.
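The per-year spread shown in the right-hand panels can be computed along these lines. This is a minimal sketch using only the Python standard library; the input layout (a list of `(year, value)` pairs) and both function names are assumptions made for the example, and a least-squares slope stands in for whatever model was actually fitted.

```python
from collections import defaultdict
from statistics import stdev

def yearly_std(rows):
    """Map year -> sample standard deviation of the values recorded for that year."""
    by_year = defaultdict(list)
    for year, value in rows:
        by_year[year].append(value)
    # stdev needs at least two observations, so skip years with fewer.
    return {y: stdev(v) for y, v in sorted(by_year.items()) if len(v) >= 2}

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

A negative slope of the yearly standard deviations against year is what a shrinking spread looks like numerically.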

Each of these three measures has shown a significant decrease in its variance over time, which roughly speaking means that modern songs are more similar to one another than they used to be (at least in terms of duration, sentiment, and energy). This increase in similarity can also be visualized by creating dendrograms of songs by decade, clustered according to musical features and cut at a distance of 500. Here, again, we see that there’s a greater degree of similarity (fewer groups in the tree) in modern music than in songs from the past.
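To make "fewer groups in the tree" concrete, here is one way to count the groups left after cutting a dendrogram at a given height. The linkage method and distance metric used for the dendrograms are not stated above, so this sketch assumes single linkage with Euclidean distance; under that assumption, cutting at height `cut` is equivalent to joining every pair of songs whose distance is at most `cut` and counting the connected groups. The function name is hypothetical.

```python
from itertools import combinations
from math import dist

def clusters_at_cut(points, cut):
    """Number of single-linkage clusters left when the dendrogram is cut at height `cut`."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of points that lies within the cut distance.
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= cut:
            parent[find(i)] = find(j)

    return len({find(i) for i in range(len(points))})
```

Running this on each decade's feature vectors with the same cut gives one group count per decade; a more homogeneous decade yields fewer groups. Other linkage methods (complete, average, Ward) would generally give different counts.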

On a certain level, this also lends itself pretty easily to interpretation: music producers have learned over time what kinds of songs are most likely to be popular, and have directed their resources to making as many of those kinds of songs as possible. Depending on your perspective, this either means that music is going downhill overall (unlikely) or just that radio is less interesting now than it used to be (more likely). I certainly find it hard to imagine a 20+ minute instrumental monstrosity like ‘Tubular Bells’ being one of the 100 most popular songs of the year ever again. Whether that’s a good thing or not is a matter of interpretation.

--

Systems neuroscientist with an interest in analytics and data visualization