A really easy way to improve your data wrangling skills in R

I think this is the best way to practice

When I was searching for jobs, I had a lot of time on my hands. Add to that COVID, and well, I had a lot of time on my hands. One day I thought to myself that I could use a little practice with dplyr, given the relatively recent updates in dplyr 1.0.0+ and my personal philosophy that you can always practice fundamental skills.

My first thought was to Google "dplyr practice problems", but if you were to do that right now, you’d find a bunch of tutorial websites with pretty basic dplyr problems. For intermediate and advanced R users, that’s likely not what we’re looking for. My next thought was "do a personal project". Countless articles and people recommend that as a way to learn and practice R. Yes, I love personal projects, but sometimes you’re looking for that quick workout, you know? Sometimes you don’t really want to think long and hard conceptualizing a project; you just wanna get some quick reps in.

But then, it struck me. Where is a place where people post problems that are clearly not intuitive to them, and other people post solutions?

Yes, that place is Stack Overflow.

I created an account (I’ve always used Stack Overflow to look up answers to my own questions, but I never thought to make an account so that I could answer other people’s questions) and set my filters to show only dplyr questions:

Step 1

Step 2

Game Plan

Now that I’ve identified this resource, how did I start using it, and how can you?

Treat it like a workout.

Tell yourself to try to solve at least 5 questions every day. When you hit that "Apply filter" button, you’ll get a list of questions. You can sort by "Newest" to find questions that likely don’t have answers yet. Keep in mind that the questions will have a whole range of difficulties. More often than not, you’ll encounter questions from R novices, so these questions will have fairly straightforward solutions.

If you want questions with an "answer key", leave the "Filter by" boxes blank. If you want to solve questions that don’t have public answers, filter by "No answers".

Note that you don’t have to post your answer; you can simply code up a solution in your own RStudio session.
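
To give a sense of what one of these quick reps might look like, here is a minimal sketch. The question and the `scores` data are made up for illustration (they are not taken from an actual Stack Overflow post), and the code assumes dplyr 1.0.0 or later for `across()`.

```r
library(dplyr)

# Hypothetical novice-style question:
# "How do I get the mean of every numeric column for each group?"
# The data below is invented purely for this example.
scores <- tibble(
  team = c("a", "a", "b", "b"),
  math = c(90, 80, 70, 60),
  art  = c(50, 70, 90, 100)
)

# across() applies the same function to several columns at once;
# where(is.numeric) picks out the numeric columns, and the grouping
# column is left out of the selection automatically.
team_means <- scores %>%
  group_by(team) %>%
  summarise(across(where(is.numeric), mean), .groups = "drop")

print(team_means)
```

A problem like this takes a few minutes, which is the point: it is one rep, not a project. If the question already has answers, you can compare your version against them afterwards.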

Getting invested

That being said, you can try to post your solutions publicly. Stack Overflow has many aspects of a game; you can earn reputation points and badges. If you’re really gunning for those points while trying to improve your skills, you can post your answer, and if it’s good, it may be accepted as the solution or upvoted by the community. You can also try to tackle tougher questions by filtering for questions that have a bounty (if your answer to a bounty question gets accepted, you can receive a lot of reputation points).

Ultimately, I think it’s such a great practice resource if you’re looking to get some quick reps in and not spend that much time. You’re practicing your data wrangling skills to solve a short problem. Sometimes you’ll come up with a good solution; sometimes other people will come up with something nicer. Either way, you can learn from the global R community. And last but not least, you’re giving back. You’re helping out people who may have just started learning R; remember that you used to be one of those people!

