Introduction:
A few years ago, I used to take my corgi to the dog park, where she'd chase tennis balls with a pair of German Shorthaired Pointers (very fast dogs). Surprisingly, she would get the ball about a third of the time. Both dogs could run circles around her, but they were easily distracted. An extra five seconds of focus was all she needed to beat dogs twice as fast.
Any career in data requires continuous learning, and it is easy to get distracted. The sheer number of subjects a data jockey has to learn is simply overwhelming:
- Should I focus on Python, or is getting better at visualization in Tableau more important?
- But being able to put code into production is super important, so I should definitely learn Apache Airflow.
- On the other hand, I can really do a lot by deepening my knowledge of Random Forests.
- Few employers use models much beyond regression analysis, so maybe that’s where my focus should go.
- Then again, data governance at my company is a huge mess right now; maybe I should spend my time getting a deeper understanding of that.
- Pandas 1.2 came out recently, so now is a great time to finally figure out what that SettingWithCopy warning Pandas keeps throwing at me in my Jupyter notebooks actually means (see the sketch after this list).
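As an aside, for anyone nagged by that same warning: here is a minimal sketch (using a made-up toy DataFrame) of the kind of chained assignment that typically triggers it in pandas, along with the usual single-step `.loc` fix.

```python
import pandas as pd

df = pd.DataFrame({"score": [70, 85, 92], "passed": [False, False, False]})

# Chained assignment: pandas can't tell whether the intermediate selection is a
# view or a copy, so it emits SettingWithCopyWarning and the write may not stick.
df[df["score"] > 80]["passed"] = True

# The unambiguous fix: select the rows and the column in a single .loc call.
df.loc[df["score"] > 80, "passed"] = True
print(df)
```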
With literally hundreds of legitimate subjects to be mastered, it’s very easy to feel like you have to react to every new bit of tech that comes out. I genuinely believe that a modern-day superpower is simply being able to keep your focus and prioritize what’s important. Just like my corgi, you can use that extra focus to outcompete people who are faster, smarter, and more experienced than you are.
What really matters:
The cornerstone of an effective educational plan is knowing roughly where you want to be in the vast ecosystem of data-related jobs. Once you know where you want to go, you can set a curriculum that will prevent you from chasing every shiny new piece of tech that comes your way and help you get off the seemingly endless treadmill of half-completed courses.
If you don’t know exactly what type of data career you want, that’s OK. Focus on the fundamentals of the data tech stack:
- Spreadsheets
- SQL
- A scripting language (R or Python)
- A visualization tool (Tableau or Power BI)
Those skills will serve you well regardless of which direction your career takes you. Avoid chasing fancy new technologies that won’t be immediately beneficial for you and could be obsolete in a year or two.
Separating the wheat from the chaff:
Write down every skill you want to learn, along with why, and how it relates to your goal. This step lets you narrow your focus to the 4–6 things you actually need to work on. Pruning some items from your plan will be painful because there are so many cool technologies out there.
I spent years wanting to learn D3 and made a few half-assed attempts at it. The problem was that learning D3 (and JavaScript) didn’t fit particularly well into my career goals. I wanted to learn it because it was cool. While it was painful giving up a cool piece of tech, my Python skills have improved dramatically because I haven’t been as distracted.
Don’t go it alone:
I’m setting my career trajectory towards engineering; those are the types of problems I love solving. Once I realized that’s what I wanted, I started talking to the self-taught engineers at my company to see how they did it. They turned me on to a couple of resources that have really influenced my learning plan. Identify mentors who can help you figure out what you should focus on, and try to find mentors with a background similar to yours. If you want to go into machine learning and you don’t come from an academic background, someone with a Ph.D. in Computational Linguistics won’t have the most relevant advice on how to fill in your knowledge gaps.
Build a curriculum:
Using the information from your mentors, start to build a curriculum. Log the curriculum in a spreadsheet and rank each subject’s importance. The key to using this spreadsheet is that you are only allowed to update it once or twice a year. Once you’ve set the curriculum, you are only allowed to invest time in learning the subjects that are in it.
I actually maintain two curriculums: a long-term one based on my career goals and a short-term one focused on the skills I need to do my job. In my case, all my long-term goals align with work, but not all the skills I need for work align with my long-term goals.
Approved educational sources:
It’s incredibly easy to spend a lot of money on continuing education. To prioritize spending, I also keep a list of approved educational sources. If a potential new source comes up and it isn’t on the list, I log it so I can research it the next time I edit my curriculum.
I strongly recommend you see what educational resources your local library gives you access to. My library, the Seattle Public Library, gives me access to LinkedIn Learning and the O’Reilly learning platform. You won’t know what your library offers until you look, but most library systems in North America provide access to some online learning resources.
Theory, Praxis, and Process:
When designing your curriculum, you want to balance theory with praxis (a fancy way of saying practical application). Many online resources overemphasize praxis, but you want to make sure you know enough theory to conceptually understand what you’re working on.
I’d also encourage you to spend some time learning better processes, which is different from both theory and praxis. I am currently learning how to use Vim as a way to make myself a more efficient programmer, and I recently learned how to take notes more effectively. Neither of those directly relates to my tech skills, but both dramatically improve the processes around learning and working.
Other process-type skills include mastering Git, learning the ins and outs of Jupyter notebooks, or configuring Visual Studio Code with the best extensions.
Wide and Shallow, or Narrow and Deep?
There are two ways to approach building out your skill set: learn new skills to broaden your base, or deepen your understanding of existing skills. There are pros and cons to each. A wide but shallow skill set lets you solve many different types of relatively simple problems, which makes it a good approach for the early stages of a data career. Many entry-level roles are more generalist in nature, and exposing yourself to a diverse array of problems gives you a sense of how you want to specialize.
As you transition from entry-level roles into associate and mid-career roles, you should shift toward a narrow-and-deep mentality. At this point, you know what you like and what you’re good at, and deeper expertise lets you work on more complex problems.
The best strategy is a mixed one. You need to know enough about core data technologies that you aren’t helpless outside your domain of expertise, but being over-generalized limits your value and career horizons.
Conclusion:
Adopting this strategy dramatically improved my educational efficiency. Rather than continuously chasing cool technologies, I am in control. In a chaotic, ever-evolving field, maintaining clarity and focus has helped me craft the kind of career I want.
Starting a conversation:
What are some of the best resources you’ve found to further your education?
My favorite is https://teachyourselfcs.com/, a complete curriculum to help self-taught software engineers learn the theoretical foundations of computer science.
About:
Charles Mendelson is a Seattle-based Business Intelligence Analyst whose role is a hybrid between narrative analysis and data engineering. If you have any questions or want to get in touch with him, you can reach out to him on LinkedIn.
Originally published at https://charlesmendelson.com on July 11, 2021.