Like many, it’s my evening routine to watch the Stephen Colbert show along with a drink. It is a perfect ode to the day. Stephen manages to make me laugh even on some of the worst days.

The show has two parts. Stephen opens with a monologue and commentary about that day’s political events. In the second part, Stephen talks to guests. I have seen people from different walks of life on The Late Show, from presidents to a bear trainer. This diversity of guests is what got me curious about the demographics of the people who appear on Stephen Colbert’s show.
What do most of them do for a living? Are they relatively young or old? Does the infamous gender gap exist in one of the most famous liberal shows? Are there other patterns that I can spot?
The data wasn’t readily available, so I decided to scrape Wikipedia. I collected the first three years of data, starting from the show’s beginning: 641 episodes and 993 guests up to November 2nd, 2018.
Scraping and cleaning crowdsourced data wasn’t easy, but the insights I got were well worth it. For example, few would have guessed that the most frequent guests on a primetime show are a competing talk show host and an astrophysicist. Yes: John Oliver and Neil deGrasse Tyson appeared 9 and 8 times respectively.
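For readers curious what the counting step might look like, here is a minimal sketch using only Python’s standard library. It is not the scraper I used: the HTML snippet, the guest names, the column position and the helper names (TableRows, count_guests) are all made up for illustration, and real Wikipedia tables need far more cleaning than this.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical, hand-written stand-in for an episode table; the real
# Wikipedia markup is messier and is not reproduced here.
SAMPLE_HTML = """
<table class="wikitable">
<tr><th>No.</th><th>Original air date</th><th>Guest(s)</th></tr>
<tr><td>1</td><td>2015-09-08</td><td>Guest A, Guest B</td></tr>
<tr><td>2</td><td>2015-09-09</td><td>Guest C</td></tr>
<tr><td>3</td><td>2015-09-10</td><td>Guest A</td></tr>
</table>
"""


class TableRows(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:   # header rows contain only <th>, so they stay empty
                self.rows.append(self._row)
            self._row = None


def count_guests(html, guest_column=2):
    """Return a Counter of appearances per guest name.

    Assumes guests in one cell are separated by commas (an assumption
    about the sample, not a property of the real pages).
    """
    parser = TableRows()
    parser.feed(html)
    counts = Counter()
    for row in parser.rows:
        if len(row) <= guest_column:
            continue
        for name in row[guest_column].split(","):
            if name.strip():
                counts[name.strip()] += 1
    return counts


guest_counts = count_guests(SAMPLE_HTML)
print(guest_counts.most_common(2))
```

Counter.most_common is what surfaces the repeat visitors; on the real data, most of the effort goes into normalizing names before counting.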

Now, time for serious talk. Some of you might shrug and ask: we already know that Stephen invites the crème of American society for his late-night banter, so is it important to do the number crunching? The answer is yes. The explanation has more to do with the way we consume content these days and less to do with the specific show we are analyzing.
Impartial analysis of the regular content we consume matters more than ever because of the filter bubble we are in.
A filter bubble is a state of intellectual isolation that can result from personalized searches. In the age of post-truth, we hear over and over again the truth we would like to hear. Also, content makers tend to manufacture political dichotomies to grow their audience in an already divided nation.
With millions of Americans watching The Late Show with Stephen Colbert every night, the show has an unparalleled influence. There are academic studies on Stephen’s real-world impact. The Colbert Bump is a term studied by the social scientist James H. Fowler. It means an increase in the popularity of a person or thing as a result of appearing as a guest on, or being mentioned on, the show. Fowler’s research found that contributions to Democratic politicians rose 40% for 30 days after an appearance on the show! Although this research was about The Colbert Report (Stephen Colbert’s previous show), its validity remains.
Heck, he could even raise a million-dollar, real-world Super PAC. So the show deserves a critical analysis.
Now on to the numbers and details.
Hollywood, this way please

In line with similar late-night shows, guests are predominantly from the entertainment and show business segment, which accounts for 76% of the total guests. This block is an assortment of actors, rappers, singers, musicians, filmmakers, scriptwriters and, of course, stand-up comedians.
The remaining quarter is where we can see the heterogeneous nature of the show. From YouTube personalities to four-star generals to a Catholic bishop, this quarter is a real mixture. Among the smaller groupings there are rare gems whom you don’t usually meet on the late-night TV interview circuit, like Stephen King and Andrew Sullivan in the Journalists and Writers group, or theoretical physicist Brian Greene and linguist John McWhorter in the Science and Academics group.
Also, with a 17% aggregate share for the Government, Politics and Military, Journalists and Writers, and TV Journalists and Hosts occupation groups, the show is clear about its priority and character even in its second segment, which is normally dedicated to entertainment.
The Bechdel test

The numbers speak loud and clear: liberal shows are not free from the gender gap. The gender gap in the show is 30 percentage points in favor of men. This figure is significant, considering that the majority of Stephen’s guests are from show business, a field where female talent has historically been on par with male talent. If you analyze occupation categories along with the gender division, the contrast is even more pronounced.
For example, in the Entertainment and Show Business occupation category, the split between male and female guests is 62.2% to 37.7%. When we move to other supposedly ‘serious’ professions like science and academics, the split gets wider, at 75% to 25%. In the Government, Politics and Military occupation block, the split is 71% to 29% in favor of men. However, Business and Corporate takes the podium as the category with the widest gender gap, with a staggering 94% male representation, beating even sports, which has 16% female guests.
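The per-category splits above come down to a grouped count. A rough sketch of that calculation follows, with invented records and a hypothetical helper named gender_share; the field names and category labels are assumptions for illustration, not my actual dataset.

```python
from collections import Counter, defaultdict

# Made-up records for illustration only; they are not the scraped data.
guests = [
    {"occupation": "Entertainment", "gender": "male"},
    {"occupation": "Entertainment", "gender": "male"},
    {"occupation": "Entertainment", "gender": "male"},
    {"occupation": "Entertainment", "gender": "female"},
    {"occupation": "Entertainment", "gender": "female"},
    {"occupation": "Science", "gender": "male"},
    {"occupation": "Science", "gender": "male"},
    {"occupation": "Science", "gender": "male"},
    {"occupation": "Science", "gender": "female"},
]


def gender_share(records):
    """Return {occupation: {gender: percent of that occupation's guests}}."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record["occupation"]][record["gender"]] += 1

    shares = {}
    for occupation, by_gender in counts.items():
        total = sum(by_gender.values())
        shares[occupation] = {
            gender: round(100 * n / total, 1) for gender, n in by_gender.items()
        }
    return shares


shares = gender_share(guests)

# The gap in percentage points is the male share minus the female share.
gaps = {
    occupation: round(s.get("male", 0) - s.get("female", 0), 1)
    for occupation, s in shares.items()
}

for occupation in shares:
    print(occupation, shares[occupation], "gap:", gaps[occupation])
```

Note that the gap is expressed in percentage points (a difference of two shares), which is the sense in which the overall 30-point figure is used.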
Stephen likes middle-aged men

The average age of the guests is 47 years, with a standard deviation of 13.56 years, which indicates how widely the ages are spread. Split by gender, the average age is 45 years for female guests and 50 for male guests. This difference in average age between genders is strongly associated with the professional grouping.
For example, in the above graph, we can see that the only age range where female guests outnumber male guests is 28 to 36. Diving further into the data, I found that 94% of the female guests aged 28 to 36 are from the Entertainment and Show Business group. By comparison, male guests aged 28 to 36 have a much wider spread across the occupational spectrum. See the chart below.
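The mean and standard deviation quoted above can be computed along these lines. This is only a sketch: the ages are invented, age_summary is a hypothetical helper, and I am assuming the population standard deviation (statistics.pstdev); the sample version (statistics.stdev) would give a slightly larger number.

```python
import statistics

# Invented ages for illustration; not the scraped guest data.
guests = [
    {"gender": "female", "age": 30},
    {"gender": "female", "age": 40},
    {"gender": "female", "age": 50},
    {"gender": "male", "age": 40},
    {"gender": "male", "age": 50},
    {"gender": "male", "age": 60},
]


def age_summary(records):
    """Return the mean age and population standard deviation of the records."""
    ages = [record["age"] for record in records]
    return {
        "mean": statistics.mean(ages),
        "stdev": statistics.pstdev(ages),  # assumption: population, not sample
    }


overall = age_summary(guests)
by_gender = {
    gender: age_summary([r for r in guests if r["gender"] == gender])
    for gender in ("female", "male")
}

print("overall:", overall)
print("by gender:", by_gender)
```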

Let’s talk politics

Among guests from around the world whose Wikipedia page lists a political party, 60% are Democrats. If you take a U.S.-only, two-party view, the numbers get more interesting. Out of 71 American guests whose political views are marked as either Republican or Democrat, there are three times as many Democrats as Republicans. These numbers are in line with the general perception that the show has become increasingly political since Trump’s inauguration in January 2017.
All the lovely people, where do they all come from?

Plotting the birth cities and states of American guests revealed some interesting patterns. Guests come from all over the country, but there are outliers. Around 40% of the guests were born in two states, California and New York.
A bivariate analysis showed that guests who are in the Entertainment and Show Business category and were born in these two states make up 31% of all guests. In contrast, the birth states of the political guests are distributed more widely. Except for the state of New York, which I am treating as an outlier, the shares range from 5.77% down to 1.92% over 21 states. Here we sniff a trend: growing up in an important metropolis matters for certain occupations with fewer, or earlier, entry points.
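The bivariate view above pairs occupation with birth state. A minimal sketch of both calculations, with invented records and hypothetical helper names (joint_share, state_distribution) that are not from my original analysis:

```python
from collections import Counter

# Invented (occupation, birth state) pairs for illustration only.
records = [
    ("Entertainment", "California"),
    ("Entertainment", "California"),
    ("Entertainment", "New York"),
    ("Entertainment", "New York"),
    ("Entertainment", "Texas"),
    ("Entertainment", "Ohio"),
    ("Politics", "New York"),
    ("Politics", "Illinois"),
    ("Politics", "Arizona"),
    ("Politics", "Vermont"),
]


def joint_share(rows, occupation, states):
    """Percent of ALL guests who are in `occupation` and born in `states`."""
    hits = sum(1 for occ, state in rows if occ == occupation and state in states)
    return 100 * hits / len(rows)


def state_distribution(rows, occupation):
    """Percent of one occupation's guests born in each state, largest first."""
    by_state = Counter(state for occ, state in rows if occ == occupation)
    total = sum(by_state.values())
    return {state: 100 * n / total for state, n in by_state.most_common()}


entertainment_big_two = joint_share(
    records, "Entertainment", {"California", "New York"}
)
politics_by_state = state_distribution(records, "Politics")

print(entertainment_big_two)
print(politics_by_state)
```

The first function divides by the total guest count, while the second divides by the size of one occupation group; mixing up the two denominators is an easy way to misread this kind of table.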
Big Apple shines through

I dug a little deeper into the variability of birth locations. This time I drilled down to cities. The prominence of New York City as a spatial factor is very evident. The above graph shows that New York City accounts for 47.7% of the guests born in the top 10 US cities.
The signal we got in the last section, that the place where people grow up can affect their profession, gets stronger when we align specific professions and cities. This agrees with other evidence and trends from the data: for example, the Juilliard School, one of the most prestigious art schools and located in New York City, is the most attended school in the guest corpus.
All the professions for which an early nudge can give an unfair advantage, like acting, singing, and writing, show a correlation with the birth city. New York City has a 55% share in the TV Journalists and Hosts group compared to other cities. 57% of the Entertainment and Show Business group and 69% of the Journalists and Writers group were born in New York City.
TL;DR?

- Most of Stephen’s guests are from show business but there are other cool people too
- When it comes to politics, it’s no secret that Stephen tends to prefer Donkeys to Elephants
- More women from diverse fields will make this awesome show an essential one