Why Every Data Scientist Should Have Imposter Thoughts

Self-doubt can have some benefits

Imposter syndrome is when someone consistently doubts their abilities despite success or achievements. At the opposite end of the spectrum is the Dunning-Kruger effect, in which low-ability people have outsized confidence in their abilities. Finding the balance between confidence and skill is important, but there are good reasons to sometimes have imposter thoughts (rather than full imposter syndrome). Basima Tewfik performed studies on physicians, investment analysts, and ROTC cadets and found that imposter thoughts helped motivate people toward skill mastery and improved interpersonal performance at work. The overall results were mixed, but the studies did identify these positive use cases for imposter thoughts.

Created with networkD3 package in R. Image by author

As a data scientist, it’s easy to compare yourself to others and think that, despite learning programming, statistics, and more, you are not worthy of the title because others do so much more. And if you have those thoughts, you may be correct.

Screenshot image by author

There are (as of March 2021) 119,000 articles tagged with Data Science on Medium. Previous analyses show an average of 3.3 minutes per article, yielding about 272 days’ worth of data science reading material on Medium alone. That comparison is a bit outlandish because no one in any field reads every article, but data science is somewhat different because specialization isn’t necessarily a benefit to one’s data science career.
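For anyone who wants to check the arithmetic, here is a minimal Python sketch. It uses only the two figures quoted above; the function name `reading_days` and the `hours_per_day` parameter are illustrative assumptions, not part of the original analysis. The 272-day figure corresponds to reading around the clock, 24 hours a day.

```python
# Figures quoted in the article (as of March 2021).
ARTICLE_COUNT = 119_000
AVG_MINUTES_PER_ARTICLE = 3.3


def reading_days(article_count, minutes_per_article, hours_per_day=24):
    """Days needed to read every article at a given number of reading hours per day.

    hours_per_day is a hypothetical knob added for illustration; the default
    of 24 means nonstop reading.
    """
    total_minutes = article_count * minutes_per_article
    return total_minutes / (60 * hours_per_day)


days = reading_days(ARTICLE_COUNT, AVG_MINUTES_PER_ARTICLE)
print(f"{days:.1f} days of nonstop reading")
```

Passing a smaller `hours_per_day` (for example, 8 for a workday) shows how much longer the backlog would take at a more realistic pace.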

You should have imposter thoughts about your breadth of knowledge from time to time; otherwise your knowledge could be out of date.

In the past year I’ve focused on model explainability techniques and model bias analysis. Most of the relevant research is less than 5 years old, and most of the useful open source packages are less than 3 years old and are updated multiple times a year. So despite taking graduate school classes in 2017, I had limited knowledge of model explainability beyond linear regression coefficients and overall variable importance graphs, because the literature was scarce at the time. To counteract my imposter thoughts and doubt, I did the following:

  • Took time to read articles and enjoy the process of learning new concepts (Adam Grant discusses the joy of being wrong in his book Think Again)
  • Started talking with colleagues about it and formed a regular meeting to discuss anything about data science that wasn’t necessarily related to a specific project
  • After finishing an analysis of a model’s bias, concluded in the report that we had evaluated bias to the best of our current abilities, and that we would continue to evaluate it over time as new literature appeared and our abilities improved

Both the knowledge base and the tool base of data science grow quickly. Many aspiring data scientists ask, "What programming language should I learn for data science?" While Python, R, and SQL have remained top programming languages over the past decade, there are many others that may need to be part of your toolkit. You may need to learn ETL tools, pipeline tools, machine learning specific programming syntax, big data tools, or visualization software. Some companies are "Google shops", "Microsoft shops", "AWS shops", or another type of "shop" in which most of the tools and products used are from one company.

You should have imposter thoughts about your data science skills; otherwise you might have skills in obsolete tools.

At one company, when I heard we were adopting Hadoop, I volunteered to work on extra projects with it. Part of my motivation came from genuine interest and the other part from fear. I knew that if I didn’t have that skill, my prospects for future promotion or other jobs were limited. The same process happened when we adopted Teradata Aster. The imposter thoughts helped fuel my motivation to learn new skills that ultimately helped land me a job as a data scientist. Lately I’m afraid I haven’t done enough to learn new skills that could become relevant to my work in the near future.

Imposter thoughts can be used for motivation, and they can also prompt you to evaluate both yourself and your work as a whole. An imposter thought of "My work doesn’t matter" could be transformed into "What work would matter?" or "How can I best spend my time?"

You should have imposter thoughts about the importance of your work; otherwise you may not be prioritizing the most important work.

I like hearing about the latest advances in image processing neural networks, but most of my work hasn’t involved any images. Sometimes I feel like a failure for being a data scientist who isn’t great at machine learning for image processing and who hasn’t kept up with the latest developments. When I think those thoughts, I do the following:

  • I acknowledge that it is not the focus of my current work
  • I define the other work that I am prioritizing ahead of image processing
  • I leave open the possibility of pursuing it in the future

Instead of learning image processing some of the things I’ve focused on are:

  • Being able to explain data science work to people without a data science background
  • Showing the ROI of data science projects and how to compare them against alternative benchmarks

Because these are soft skills, it can be frustrating to judge improvement, but they are more important to the current business.

If you have ever felt like an imposter in data science, you are not alone. Try using those thoughts for motivation, while keeping them in balance so they don’t hurt your productivity and mental well-being.

This article was inspired by:

  • Armchair Expert Podcast episode with Adam Grant
  • Mindset by Carol Dweck

Footnote: An alternative is to learn COBOL with zero doubts and trust that systems are slow to change. This strategy could work.
