Stop waiting for someone to “give you a chance” to be a data scientist

The best data science — and the best innovation — has always been permissionless

Hamdan Azhar
Towards Data Science

For many years, it has been a commonly accepted mantra in tech circles that the best innovation is “permissionless”: that is, it is driven by outsiders working on the fringes of existing power structures and is more likely to take place in people’s garages than in universities or large corporations.

The blockchain community has vigorously embraced the concept of permissionlessness as an all-encompassing ethos governing its core technologies as well as the type of society those technologies are intended to usher in.

As a data scientist who spent nearly a decade learning from Nick Spanos, who in 2013 founded Bitcoin Center NYC, the world’s first physical bitcoin exchange, I recently realized that data science is one of those remarkable fields that is truly permissionless, and more so today than ever before.

Unlike rocket science or civil engineering, which are not only highly regulated fields but also require large amounts of real-world resources, data science today requires just a laptop, a grasp of core statistics and computer science principles, and a statistical programming language such as R or Python. The principles and the language can easily be self-taught.

The other ingredient is access to data, which is literally all around us, and can easily be obtained in unimaginable quantities from open data sets or scraped from the internet. (Of course, the truly secret ingredient is finding a good question to ask, but this can only be learned through relentless repetition and trial and error.)

Permissionless innovation is a state of mind 🧠

All of this occurred to me a few weeks ago when I heard a heartwarming story from a young man in an interview. (Stories like this used to be very common, but I’m not sure what happened.) He had been working in analytics for a few years and then lost his job in a layoff. He tried very hard to find a job but couldn’t, so he started his own company!

He hacked on it for 9 months, built an actual product, got clients, and even built up an entire offshore engineering team to sustain and iterate on the product. (Literally nothing on his resume would have led one to believe he had any of that in him — before he demonstrated that he had it in him all along.)

He also asked me, hesitantly, how a big company would view his entrepreneurial journey. I assured him that big companies are the ones in greatest need of entrepreneurial thinking, because they realize that without innovating there is no way they can keep up with the competition, which is coming fast and furious from all directions.

Data scientists must have more to offer than just technical skills 📦

I only wish many of the aspiring data scientists whose resumes I routinely come across would realize this: what matters most is not which skills and abilities you have, but what you do with them.

For example, anyone who has reviewed data science resumes will be very familiar with what I call “The Box” — almost every resume has one.

The Box is an impressive and intimidating “everything but the kitchen sink” list of tools and technologies that the applicant claims to possess (see below for some examples from actual resumes).

The problem is that only a tiny percentage of these resumes provide any meaningful examples of what the applicant has built in the real world using these tools.

The tools and technologies you know are only a means to an end — and without the end, they are an indictment of your unreached potential. (It’s also ironic given that the vast majority of day-to-day data science and analytics in large tech companies requires just three tools: SQL, R, and Python.)

This is why the most important question I ever ask in a data science interview is “what have you built with data?” or alternatively “how have you used your data skills to drive impact?”

Every aspiring data scientist should make sure they have solid experiences and answers to point to in response to this question, the more interesting the better — there is no other way to stand out amongst the thousands of graduates every year who are trying to get into “the sexiest job of the 21st century.”

Data is all around us 🗄️

We live in a world that is bursting at the seams with data, and almost all of it is free. Not only that, the tools necessary to analyze this data are widely accessible and free. (This is no longer the 1990s, when you needed an expensive SAS, SPSS, or Minitab subscription. R and Python are completely free and far more useful than all three of those combined.)

This means that the barrier to entry to data science and data analytics has, in some sense, never been lower. Ironically, for that very reason, it is now actually quite high!

By this I mean that the job market has been flooded with candidates who have solely technical skills and a desire to climb the Silicon Valley ladder, but no drive, intuition, innate curiosity, or ability to think outside the box.

So many of them are still complaining about how “every position requires X years of experience” without realizing that, unlike in rocket science or bridge building, there is literally nothing at all preventing them from gaining experience in data science or data analytics. Data is everywhere and the tools are all free — what else do you need?!

Take the young man I interviewed. He saw no reason to wait for some company to “give him the chance to manage a product.” Instead, he built his own product, and he is now on a path to dealing with those same companies on a level playing field, not as someone supplicating for a chance to prove himself, but as a peer technologist who has also created something of value in the real world.

Be “creation-oriented” and not just “career-oriented” 👗👔

One reason, I believe, that we see less of this now than we used to is that technology, and data science in particular, has become a stable field that naturally attracts more career-oriented people than creation-oriented people.

These career-oriented people can only imagine coloring within the lines and only coloring when they’re given permission to color. Even when the skillset they have enables them to answer an infinitely vast array of questions, they remain focused on the most boring questions, lacking even the faintest attempt at creativity. For them, their skillset is solely a means to an end, and that end is a 9-to-5 job, so without that job, how can they even think of using their skillset to add value to the world?

Meanwhile, for the entrepreneurial free thinkers who first entered these fields before they became stable careers, the career, and even the job, was always incidental to doing what they loved to do — which was creating, building, and using technology to understand the world.

The best way to become a data scientist is by doing data science

My advice to new and aspiring data scientists — whether you are a recent graduate or are working at a startup or even at a big company — is the same.

You don’t need to wait for anyone to “give you a chance” to become a data scientist. The best — and only — way to become a data scientist is by doing data science.

It’s by searching deeply within yourself, finding the intersection between your unique skills and abilities and your passions and purpose, and figuring out the unique contribution that only you can make in the world. (I alluded to this in my TEDx talk at NYU in 2017.)

For most people, it’s not about entering Kaggle competitions and trying to improve on someone else’s model by 0.1%. Rather, it’s about asking novel questions, turning raw, messy data into curated datasets, and finding insights and patterns in that data that are valuable to someone — even if that’s just one other person.

Data scientists are explorers — so let’s go out and explore.

HAMDAN AZHAR is a data science, analytics, and research leader with over 10 years of experience discovering meaningful insights in complex datasets and using storytelling to drive business, product, and social impact. The founder of PRISMOJI, he was formerly a director at Blockchain Technologies Corporation, and prior to that, was a data scientist at Facebook, at GraphScience, and on Ron Paul’s 2012 presidential campaign.
