The world’s leading publication for data science, AI, and ML professionals.

R vs Python: What Should Beginners Learn?

OPINION

Let go of any doubts or confusion, make the right choice and then focus and thrive as a data scientist.

Photo by AMIT RANJAN on Unsplash

I currently lead a research group with data scientists who use both R and Python. I have been in this field for over 14 years. I have witnessed the growth of both languages over the years and there is now a thriving community behind both.

I did not have a straightforward journey and learned many things the hard way. However, you can avoid making the mistakes I made and lead a more focussed, more rewarding journey and reach your goals quicker than others.

Before I dive in, let’s get something out of the way. R and Python are just tools to do the same thing: data science. Neither tool is inherently better than the other. Both have been evolving over the years (and will likely continue to do so).

Therefore, the short answer on whether you should learn Python or R is: it depends.

The longer answer, if you can spare a few minutes, will help you focus on what really matters and avoid the most common mistakes that enthusiastic beginners make on their way to becoming expert data scientists.


Understand your Data Science Career Goals

First off, try to be clear on what your overall goal is. Why are you even reading this article? It’s because you have made a decision to become an expert in data science or at least become better than your current self. You now have a decision to make on what to invest your limited time on. Maybe you are hoping to get your first job in data science. Or maybe you are hoping to get a better job. Or perhaps you have data and you want to work on your own data science project.

Whatever your circumstances, in the end, you need to have hands-on, rich experience working on a data science project. To get that, you should have experience of every stage of a typical real-world data science project (from data collection to final interpretation).

A Kaggle project is not good enough!


Understand the Requirements of a Typical Real-World Data Science Project

A typical, real-world data science project involves collecting data, getting it into a platform, precisely defining the problem, processing and cleaning data, visualizing data, applying various models, assessing various models, and telling a story.

For you to achieve any of your data science career goals, you need to have a clear understanding of the complete project pipeline. You need to know the models you have chosen and why they are the most appropriate. You need to be clear on the visualization graphs you use. You need to tell a story that makes sense!

All these tasks can be done in both Python and R.
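As a rough illustration of those stages, here is a minimal sketch in Python (the same steps could be written in R). Everything in it is an assumption made for the example: the data is synthetic, the function names are hypothetical, and the two models (a mean baseline and a one-feature least-squares line) are deliberately simple. Visualization and storytelling are left out because they depend on the audience.

```python
import random
import statistics


def collect_data(n=200, seed=0):
    """Stand-in for data collection: synthetic (hours_studied, exam_score) rows."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        hours = rng.uniform(0, 10)
        score = 40 + 5 * hours + rng.gauss(0, 5)
        # Simulate messy real-world data: roughly 5% of scores are missing.
        rows.append((hours, None if rng.random() < 0.05 else score))
    return rows


def clean(rows):
    """Processing and cleaning: drop rows with a missing target value."""
    return [(x, y) for x, y in rows if y is not None]


def fit_mean_baseline(train):
    """Model A: always predict the mean of the training targets."""
    mean_y = statistics.fmean(y for _, y in train)
    return lambda x: mean_y


def fit_linear(train):
    """Model B: ordinary least squares with a single feature."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in train) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def mean_absolute_error(model, test):
    """Assessment: average absolute prediction error on held-out rows."""
    return statistics.fmean(abs(model(x) - y) for x, y in test)


rows = clean(collect_data())
split = int(0.8 * len(rows))  # simple 80/20 train/test split
train, test = rows[:split], rows[split:]

results = {
    "mean baseline": mean_absolute_error(fit_mean_baseline(train), test),
    "linear model": mean_absolute_error(fit_linear(train), test),
}
for name, mae in results.items():
    print(f"{name}: mean absolute error = {mae:.2f}")
```

The code itself is the easy part. Deciding why a baseline belongs in the comparison, why this error metric fits the problem, and what the result means for the question being asked is the language-independent work.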

You could be choosing the wrong model and make that mistake in both Python and R.

If I am recruiting you to my team, I couldn’t care less whether the tool you use is R or Python. However, what would matter is how you chose your model, how you framed your problem, and how well you can explain and justify your approach. There will always be new tools emerging. And I would expect you to keep learning those tools as and when they are required for a project.

However, if you can’t tell me a convincing data science story and answer basic language-independent questions, I won’t hire you!


Avoid This Common Mistake

Both R and Python are likely to support 99% of use cases of any typical data science project. However, many aspiring data scientists make the mistake of giving undue importance to tools.

Some of them, in trying to be more efficient, try to learn the syntax of both R and Python. Frankly, this isn’t the best way to spend your learning time. You can’t go wrong with attempting to become an expert in either. But dividing your limited time between two tools, which leaves you with average skill in both and expertise in neither, is a waste of time.

Don’t be a jack of all trades and master of none, when it comes to Python vs R.


Get the Right Mindset

You need to keep an open mind and treat these languages as tools. You should certainly pick one language and get thorough experience in it so that it becomes your go-to language.

You need the go-to language for rapid prototyping and for conducting quick, preliminary analyses.

However, do not hesitate to pick up the second language if a specific aspect of a project calls for it. For example, Python is going to be superior when it comes to deploying models, whereas R is likely to be superior for statistical analyses. But these are not set in stone, and the tools are rapidly evolving.


What Should You Learn First?

Here is a rough guide that isn’t set in stone but you can use it as your starting point if you are completely clueless.

If you are part of a team and they are using a certain language, then it makes sense to adopt that language to make your life easier (for sharing code and expertise).

If you are doing a specific course, again it makes sense to choose the language that the course is supporting.

If you already have a background in programming, then Python might seem more natural.

If you are likely to focus primarily on statistical analyses and you are in research, R may be more appropriate as a first choice.

Here is a detailed infographic if you want a more thorough comparison. Frankly, though, I would not take anything there to be set in stone. Many such comparisons become outdated as the tools evolve!


Do Real-World Projects

Hermann Ebbinghaus, a German psychologist, conducted memory experiments on himself in the late 19th century and produced the well-known "forgetting curve" (replicated in modern studies). You will forget whatever you learn at an exponential rate. Within a week, you are likely to forget most of what you learn.
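To make "exponential rate" concrete, here is a small sketch using a commonly quoted simplification of the forgetting curve, R = exp(-t / S), where t is the time since learning and S is a "stability" constant. This is an assumption for illustration, not Ebbinghaus's original fitted formula, and the value of S below is hypothetical.

```python
import math


def retention(days_elapsed, stability_days):
    """Fraction retained after days_elapsed, assuming R = exp(-t / S)."""
    return math.exp(-days_elapsed / stability_days)


STABILITY_DAYS = 3.0  # hypothetical value, chosen only for illustration

for day in (0, 1, 3, 7):
    print(f"day {day}: {retention(day, STABILITY_DAYS):.0%} retained")
```

Under this model, a larger S means slower forgetting, which is one way to think about what meaningful, hands-on practice buys you.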

You can, however, remedy this situation by making your learning sessions more meaningful, interactive and relevant.

How can you do that when it comes to learning R or Python? By doing an actual project that you find meaningful and relevant.

You don’t need to sit through hours of tutorials learning the syntax of any language for the sake of learning the syntax (Google, Stack Overflow, and even an appropriate IDE can correct syntax). It’s the experience of doing real-world projects that will help solidify your learnings and enable you to have a language as your go-to language.


Final Thoughts

Make sure that you become really proficient in one language and can comfortably use its syntax (without too much googling) and its debugger. You want to reach a level where you can carry out quick analyses whenever you need to.

At the same time, as you become more proficient, keep an open mind and don’t shy away from learning a different language if there are specific aspects that could be better served by a different language or tool.

At the end of the day, these are just tools and it is not a competition.

Read every story from Ahmar Shah, PhD (Oxford) (and thousands of other writers on Medium)

