Opinion

Python vs. R for Data Science

and why you are wasting your time

SudoPurge
Towards Data Science
5 min read · Nov 17, 2020


TL;DR

In short, what matters most as a beginner in Data Science is that you DO Data Science. So just go with either one of the languages and prioritize getting some projects done while sipping away at your choice of sugary beverage. That’s how you will learn the fastest.

While I may be tempted to just recommend Python straight away (Python is my main language, though I do have some working knowledge of R), I want to present an unbiased evaluation of how effective each language is for a beginner. This is mainly because the right choice is most definitely going to depend on your own particular situation.

Why do you want to learn?

The first and probably the most important factor to consider is the reason WHY you want to learn. If you are a trained biologist, for example, looking to pick up some programming skills so you can better understand your dataset, or if you are familiar with other scientific programming languages like MATLAB, then you should consider watching some R tutorials on YouTube, because R will likely feel simpler and more intuitive to you than Python. If you are a software engineer proficient in languages like C/C++ and Java and would like to pivot into Data Science, Python is the one to go with: like most other popular programming languages, it supports Object-Oriented Programming (OOP), so it will feel much more intuitive to you than R. Or maybe you have been reading up on the fascinating field of Data Science recently and would like to dabble in it. In that case, either language would be fine, and the choice depends more on the other factors below than on this one.

Are your friends/colleagues already experts in one of these languages?

One massive advantage you can have when learning a new language is the support of the community. Asking the community for help is expected among programmers and is usually considered an important skill in its own right. As a beginner, it can be confusing to learn how to get help, especially because there aren’t many resources online on the art of asking for it. Building an intuition for what to ask when there’s a bug in your code is essential. If you know someone who is proficient in Python, or if another researcher at your lab has been working with R, then your best bet is to go with what they know, because you can always ask them questions when you get stuck.

Are you interested only in Statistics and Data Analysis, or do you want to learn other areas such as Machine Learning and AI?

One major difference between Python and R is that the former is an extremely versatile language compared to the latter. Python is a general-purpose programming language, which means you can collect, store, analyze, and visualize data, and also build Machine Learning pipelines and deploy them into production or on websites, all using just Python. R, on the other hand, is built primarily for statistics and data analysis, with graphs that are nicer and more customizable than those in Python. R’s ggplot2 library takes the Grammar of Graphics approach to visualizing data, which provides a great deal of intuitive customizability that Python’s standard plotting tools lack. It is perhaps a little oversimplified, but it may be fair to say that if you want to be a Data Analyst, R should be your preferred choice, while if you want to be a Data Scientist, Python is the better option. It’s the dilemma of generalization vs. specialization.
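To make the "all using just Python" point a bit more concrete, here is a minimal sketch of a collect, analyze, and visualize loop that uses only Python's standard library. The toy data, the column names, and the function names are all made up for illustration; a real project would more likely reach for dedicated libraries such as pandas and matplotlib.

```python
import csv
import io
import statistics

# Hypothetical toy data; in practice this would come from a file, a database, or an API.
RAW_CSV = """species,weight_g
finch,18.5
finch,20.1
sparrow,27.3
sparrow,25.9
sparrow,26.4
"""


def load_rows(text):
    """Collect: parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def mean_weight_by_species(rows):
    """Analyze: group the weights by species and average each group."""
    groups = {}
    for row in rows:
        groups.setdefault(row["species"], []).append(float(row["weight_g"]))
    return {name: statistics.mean(values) for name, values in groups.items()}


def text_bar_chart(summary):
    """Visualize: draw a crude text bar chart, one line per species."""
    return "\n".join(
        f"{name:<8} {'#' * round(value)} {value:.1f}"
        for name, value in sorted(summary.items())
    )


rows = load_rows(RAW_CSV)
summary = mean_weight_by_species(rows)
print(text_bar_chart(summary))
```

The same three steps are just as doable in R; the point is only that the glue around them (reading inputs, wiring functions together, shipping the result somewhere) is ordinary Python.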

Final Thoughts

Data Science emerged as a distinct field only in the last ten years and, as a result, has been constantly evolving. What has been consistent is that more and more of the data pipeline is being automated every day. Employees with a range of skills such as data engineering, data visualization, Machine Learning engineering, cloud service integration, and model deployment are always going to be more in demand than those who specialize in only one aspect of the Data Science workflow. Much of the field’s progression has been shaped by automation, and only employees with good programming skills are resistant to it. Specializing in building impressive Machine Learning models will not cut it in the near future unless, of course, you are extremely good at it.

The landscape of the industry at the moment is such that, at the beginner level, there are too many candidates who are “pretty” decent for too few junior Data Science jobs that are available. But for the slightly more senior positions, there aren’t enough practitioners who are experienced or have the right skillsets. And in order to take the next step in your career, you will ultimately need to be able to understand and implement the other stages of the workflow to some degree. So why not give yourself the highest probability of success?

If you are still unsure about it, the best advice I can give is to just pick Python for now and start learning. Later on, after you have a fairly good working knowledge of it, you can also learn the basics of R. But if you really don’t feel comfortable with Python, then you know what to do. Your top priority as a beginner should be to get a feel for the core concepts of Data Science and to understand how to apply these concepts in real-world scenarios. Setting up the coding environment can be a somewhat daunting experience for someone with no previous programming or Computer Science background, and on that front, setting up and getting started will be a much more seamless experience with R than with Python. Far too many of us dwell on the idea of being a Data Scientist, and not enough of us actually take action to become one.

P.S. If you want more short, to-the-point articles on Data Science, Programming, and how a biologist navigates his way through the Data revolution, consider following my blog.

With thousands of videos being uploaded every minute, it’s important to filter them so that you consume only good-quality data. I will email you hand-picked educational videos on the topics you are interested in learning. Sign up here.

Thank you!
