Data Curious 2017 Year in Review: My favourite data stories, datasets and visualisations from last year

Ben Dexter Cooley
Towards Data Science
Jan 22, 2018

Hello again. I’m back from a lengthy Christmas holiday hiatus and ready to embrace the very best data-driven stories that 2018 has to offer.

But first, a retrospective.

Over eight months ago, I started a weekly post of my favourite charts, interactive data visualisations, data journalism pieces, quirky datasets and data analysis tools. And at first, it was only for me: I loved finding the stories each week (mainly through Twitter), and it served as a great reference point to inspire me for the work I was doing.

Fast-forward to 2018 and I’ve discovered there are at least 2k others like me who love discovering new data things on the web. So to those who have joined me for the ride by following on Medium, a few things: a sincere thank you, sorry it’s been so long, I haven’t forgotten about you, and Happy New Year! 2018 is gonna be a good one.

Ok, formalities over. Now to the good stuff. Last year I (for the most part) stuck to a weekly routine of posting the best data things I found on the web. The plan for Data Curious 2018 is still in the works, but as a nice bookend to last year, I wanted to pull out some of the very best pieces featured in past editions of Data Curious. The best of 2017. You know the drill.

The very best reads

Let’s jump right in.

The NYT showed the world just how terrible Americans are at world geography by asking “can you find North Korea on a map?” Even more interestingly, they showed this data alongside survey data on those preferring diplomacy to…well, other stuff. It would appear that better geographical skills = more preference towards diplomacy (but let’s not jump to causation just yet).

Full piece here

How do you draw a circle? is still one of my favourite interactive stories of 2017. A great example of starting with a curiosity (does drawing a circle say something about a person’s culture?) and then crowdsourcing the data to find out.

FiveThirtyEight published a beautiful analysis of 25 Years of American Death. Morbid, but beautiful.

Some impressive mapping from the NYT graphics team.

Can augmented reality solve mobile visualization? Domik seems to think so. This Medium post on how AR can enable more personalised data viz had some really exciting ideas.

The Pudding published tons of amazing work this year. But this one stuck with me more than the others: analysing film scripts to find gender stereotypes in cinema.

My favourite climate change interactive piece of the year goes to “You Fix It: Can You Stay Within the World’s Carbon Budget?” by the NYT. The piece allows the user to estimate carbon projections from now until 2100 in key countries and regions of the world. The combination of user input, visualisations that change on scroll and projected line charts in different colours is really effective.

Stories Behind a Line was groundbreaking in lots of ways, but especially for its masterful storytelling, which weaves together the journeys of six different refugees.

Bit of a weird one, but I still really love this project by Giuseppe Sollazzo. Using machine learning, he calculated the average face type of a UK MP.

2017 will forever be remembered as the year every publisher took turns putting out a “the media got it wrong” op-ed after the 2016 elections. This take from Nate Silver of FiveThirtyEight stuck with me as the most reasonable analysis from a statistical perspective.

Wes Anderson’s dialogue + cinematography + machine learning = this incredible scrollable story of visual motifs across Anderson’s top four films. A combination of some of my favourite things.

The very best datasets / tools

It’s a new year, but that doesn’t mean you can’t find some great datasets from 2017 to play around with. Here are some of the best to start with:

The Pudding’s spreadsheet of stories (datasets for each included most of the time).

And the folks at Tableau do a similar thing with their datasets from #MakeoverMonday challenges:

Instacart released a dataset with 3 million online orders from 200,000 anonymous users.

Here’s a dataset of Global Food Prices from the UN World Food Programme.

Every quarter, Congress is required to disclose all lobbying that happens, including what agencies were lobbied, what topics were covered, and how much income the lobbyist earned. You can download the dataset for the House of Reps here and for the Senate here.

How likely is your job to become automated in the future? This dataset of 702 SOC (Standard Occupational Classification) jobs, their likelihood of automation, and the number of jobs in each state could give some clues.

A public database from the Florida Department of Corrections of inmate tattoos.

The Manifesto Project has coded a central database of thousands of political manifestos from around the world. The data spans from 1945 to 2015, includes over 1000 political parties and covers over 50 countries.

A dataset of Bigfoot sightings. No, not a joke.

A list of over 19,000 restaurants and businesses that offer menu items containing “taco” or “burrito” in the U.S.

A record of every physical item checked out from the Seattle public library since 2005.

A website dedicated to “adding data to the debate” around the potential harms that Airbnb may be causing to housing markets.

Face-O-Matic is software that crawls through TV news footage to analyse the amount of screen time given to Trump and various leaders of the U.S. House and Senate (all data downloadable).

Media Cloud is a very cool project from MIT and Harvard. It crawls through thousands of news sources to find keywords and topics at the story and sentence level.

The very best data visualisations

Charts, graphs, maps, networks: all my favourite data graphics from 2017.

This one from NPR, because it immediately raises a question: what is the U.S. doing all the way over there?

NPR

A fantastic gif chart from the FT that tells the story of the coup in Turkey.

The World Poverty Clock is hands-down one of my favourite live data tools of 2017.

Exploring emotions using data viz.

Explore full interactive from Ekmans

Loved this scrollable step chart from The Pudding on the Timing of Baby Making (data for the story downloadable here).

Another great gif chart: a whole story, in one graph.

2017 taught us many things, but one important one is this: We. Need. To. Normalise. Choropleth. Maps.
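For anyone wondering what normalising actually involves, here is a minimal sketch in Python. The region names and figures are invented purely for illustration, and `per_100k` is a hypothetical helper rather than part of any mapping library. The idea is simply to shade each region by a rate (per 100,000 residents here) instead of a raw count, so that big populations don't automatically dominate the map.

```python
# Hypothetical figures, invented purely for illustration.
regions = {
    "Region A": {"cases": 5000, "population": 1_000_000},
    "Region B": {"cases": 500, "population": 50_000},
}


def per_100k(count, population):
    """Return the count expressed per 100,000 residents."""
    return count * 100_000 / population


# Raw counts: what an un-normalised choropleth would shade by.
raw = {name: r["cases"] for name, r in regions.items()}

# Rates: what a normalised choropleth would shade by.
rates = {
    name: per_100k(r["cases"], r["population"])
    for name, r in regions.items()
}

# The region with the most cases is not the one with the highest rate.
highest_raw = max(raw, key=raw.get)
highest_rate = max(rates, key=rates.get)

print("Highest raw count:", highest_raw)
print("Highest rate per 100k:", highest_rate)
```

In this made-up example the raw counts point to Region A while the rates point to Region B, which is exactly the kind of flip an un-normalised map hides.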

Spinning D3 globes from the FT.

A really great interactive here: users can place the Larsen C iceberg anywhere on the map to put its size in perspective.

Combo charts! Gotta love em! Map + slope chart (1 of 3).

Vertical line chart + quotes (2 of 3).

Bar + bubble chart (3 of 3).

One of the most impactful charts of the year for climate change:

After The Pudding used a rotated scatterplot in their piece on microbreweries in America, the Guardian did their own take and I love it.

Do you remember the solar eclipse of 2017? Do you also remember all of the brilliant spoof maps that popped up on Twitter of the eclipse path?

Do you also remember that whole crazy thing between Trump and the NFL? Yep. That really did happen.

I think my choice for most beautiful chart design goes to this investigative piece on how much the Brazilian government spends on federal barbecues. Combining fiery visuals and flames in a bar chart is genius.

That’s it! But really, I only scratched the surface. If you’re still looking for more inspiration, have a click through some of my previous editions of Data Curious. 2017 was a fun year (sometimes). Here’s to more data stories, datasets and data visualisations in 2018.

If you appreciate this roundup, give it a few 👏️️ or share it with your friends. I’d also love to see what you’ve been working on lately, so get in touch.

Visualization Software Engineer @ Pattern (Broad Institute). Designer, developer, data artist. Portfolio: bendoesdataviz.com | Art: bdexter.com