Building the Dow Jones index for gender disparities in radiology

Kevin Seals
Towards Data Science
3 min read · Feb 18, 2020

Modified from https://bit.ly/2SUhBed

The stream of data on Twitter provides a rich source of information that can help us understand trends and viewpoints in real time.

One can easily harness Twitter data in an automated way and create large, structured datasets. The ability to do this feels intuitively powerful (what can we learn from the endless patient/physician chatter?), and I spent quite some time pondering the highest-yield use cases.

I concluded that a particularly meaningful application of Twitter data is understanding gender disparities in radiology and how they are changing with time. This makes sense because:

  • There is a powerful movement focused on correcting gender disparities in radiology, a field that has historically done a poor job of attracting and nurturing female talent. This discourse is organized around a set of discrete, established hashtags such as #RadXX and #WomenInRadiology.
  • This discussion represents a subset of the larger Twitter discussion around radiology in general, which is a robust conversation with thousands of daily tweets. We can thus consider some ratio of female empowerment radiology activity to radiology activity as a whole.

Using this reasoning, I decided to build the RadX Index (RXI), a Dow-Jones-like quantitative index providing a dynamic, real-time marker for how well we are doing at correcting the radiology gender disparity. You can check it out here: https://bit.ly/320uK9D

The simple version 1.0 works like this:

  1. All tweets including #RadXX are placed in a database with a timestamp and other key information, like username and tweet URL.
  2. All tweets including #Radiology are placed in a similar database.
  3. Every 24 hours, the tweets in each database (#RadXX and #Radiology) are counted, and the #RadXX count is divided by the #Radiology count to form the RXI.
  4. The RXI for that day is automatically plotted in classic stock ticker fashion, providing a public, graphical index of how we are doing and how that has changed with time.

Tweets containing #radiology organized in a database for analysis
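
To make steps 3 and 4 concrete, here is a minimal sketch of the daily ratio in Python. It is not the code behind the live index: it assumes the two databases have already been read into plain lists of tweet timestamps, and the function names (daily_counts, daily_rxi) and the sample timestamps are made up for illustration.

```python
from collections import Counter
from datetime import date, datetime


def daily_counts(timestamps):
    """Count tweets per calendar day from an iterable of datetimes."""
    return Counter(ts.date() for ts in timestamps)


def daily_rxi(radxx_timestamps, radiology_timestamps):
    """Return {day: #RadXX count / #Radiology count}.

    Days with no #Radiology tweets are skipped, since the ratio
    is undefined there (an assumption of this sketch).
    """
    radxx = daily_counts(radxx_timestamps)
    radiology = daily_counts(radiology_timestamps)
    # Counter returns 0 for days that have no #RadXX tweets.
    return {day: radxx[day] / total for day, total in sorted(radiology.items())}


# Made-up timestamps standing in for rows from the two databases.
radxx_tweets = [
    datetime(2020, 2, 17, 9, 30),
    datetime(2020, 2, 18, 14, 5),
]
radiology_tweets = [
    datetime(2020, 2, 17, 8, 0),
    datetime(2020, 2, 17, 12, 0),
    datetime(2020, 2, 17, 19, 45),
    datetime(2020, 2, 17, 22, 10),
    datetime(2020, 2, 18, 7, 15),
    datetime(2020, 2, 18, 16, 40),
]

rxi = daily_rxi(radxx_tweets, radiology_tweets)
for day, value in rxi.items():
    print(day, round(value, 3))
```

With this toy data the index is 0.25 on Feb 17 (one #RadXX tweet against four #Radiology tweets) and 0.5 on Feb 18. The resulting day-to-value mapping is what would be handed to a plotting library for the ticker-style chart.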

We are effectively creating an automated approximation of the ratio of female-empowerment radiology activity on Twitter to radiology activity on Twitter as a whole.

This approach is simplistic and imperfect, but it is a start. There are many ways to make it more sophisticated, such as:

  • Sentiment analysis of responses to female versus male radiologists. What is the attitude towards females in radiology?
  • Quantification of the number of tweets from male versus female radiologists. Are a sufficient number of female radiologists joining the discussion?
  • Analysis of the degree of engagement around #RadXX tweets (e.g. views, likes, RTs). More engagement, higher RXI.
  • #RadXX and #Radiology are not necessarily the optimal hashtags for an RXI ratio. It may be better for example to combine #RadXX, #WomenInRadiology, and #HeForShe rather than using #RadXX alone.
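
Two of these ideas, combining several hashtags and rewarding engagement, can be sketched together in Python. Everything here is an assumption for illustration rather than part of the 1.0 implementation: the Tweet record, the choice of tags, and especially the weighting rule (each tweet counts once, plus once per like or retweet) are hypothetical, and hashtags are assumed to be stored in lowercase.

```python
from dataclasses import dataclass

# Hypothetical tag sets; hashtags are assumed to be lowercased on ingestion.
EQUITY_TAGS = frozenset({"radxx", "womeninradiology", "heforshe"})
BASELINE_TAGS = frozenset({"radiology"})


@dataclass
class Tweet:
    hashtags: set
    likes: int = 0
    retweets: int = 0


def engagement_weight(tweet):
    # Hypothetical weighting: the tweet itself counts as 1,
    # and every like or retweet adds 1.
    return 1 + tweet.likes + tweet.retweets


def weighted_rxi(tweets, equity_tags=EQUITY_TAGS, baseline_tags=BASELINE_TAGS):
    """Engagement-weighted ratio of equity-tagged to baseline-tagged tweets.

    A tweet carrying tags from both sets counts toward both totals.
    Returns None when there is no baseline activity.
    """
    equity = sum(engagement_weight(t) for t in tweets if t.hashtags & equity_tags)
    baseline = sum(engagement_weight(t) for t in tweets if t.hashtags & baseline_tags)
    return equity / baseline if baseline else None


# Made-up tweets for a single day.
todays_tweets = [
    Tweet({"radxx"}, likes=3, retweets=1),
    Tweet({"womeninradiology", "radiology"}, likes=1),
    Tweet({"radiology"}, likes=8, retweets=2),
    Tweet({"radiology"}),
]

score = weighted_rxi(todays_tweets)
print(score)
```

For this toy day the equity-tagged tweets carry a total weight of 7 and the #Radiology tweets a total weight of 14, so the score is 0.5. Whether likes and retweets should count equally, or at all, is exactly the sort of design question that is open for discussion.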

These metrics can be integrated to create a more sophisticated master RXI score that improves on the 1.0 iteration, which was built in a day to start generating data and discussion.

What do you think? I would love to make this index open-source and work together to improve it. My approach is a rough first step, but you are probably smarter than me…so, can you think of a better RXI implementation? Let’s discuss!

If you want to connect to chat about this project or anything else in healthcare/technology, reach out on Twitter or LinkedIn. And if you enjoyed the article, please share…discussion will make this idea much better!

Physician/engineer in Los Angeles, focused on using technology to improve healthcare. Corgi dad.