
R doesn’t have a reputation for being a fast programming language. Compared to speedier counterparts like C++, it’s sluggish at best. That said, you can make your R code run a lot faster if you understand a key concept that underlies the entire language.
This concept is called vectorization, and you can learn about it in three minutes.
In R, a vector is a basic type of variable that contains a value or a set of values. Vectors are very common; if you’ve ever assigned a set of numbers to a variable name, like x <- 1:50, then you’ve created a vector. You can store most common data types in a vector, including doubles, integers, logicals, characters, and more.
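As a quick illustration of the types mentioned above, here is a small sketch. Apart from x, the variable names and values are made up for this example.

```r
# One vector per common type (names other than x are illustrative)
x   <- 1:50                  # integer
dbl <- c(1.5, 2.25, 3)       # double
lgl <- c(TRUE, FALSE, NA)    # logical
chr <- c("a", "b", "c")      # character

typeof(x)    # "integer"
typeof(dbl)  # "double"

# Even a single value is a vector of length 1
single <- 42
length(single)     # 1
is.vector(single)  # TRUE
```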
Vectors can also have additional attributes. Each value (or "element") in a vector can have a name, for example. Common data structures in R build on vectors too: a list is itself a generic vector, and a data frame is a list of equal-length vectors with a few extra attributes.
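The sketch below shows names on a vector and the vectors inside a data frame. The object names (scores, df) and their contents are invented for illustration.

```r
# Names are an attribute attached to the elements of a vector
scores <- c(alice = 90, bob = 85, carol = 77)
names(scores)   # "alice" "bob" "carol"
scores["bob"]   # look up an element by name

# A data frame is a list whose columns are equal-length vectors
df <- data.frame(id = 1:3, label = c("a", "b", "c"))
is.list(df)        # TRUE
is.vector(df$id)   # TRUE: the column is an ordinary integer vector
attributes(df)     # includes names, class and row.names
```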
Because R is built around vectors, a lot of its operations automatically work across all the elements in a vector. In other words, they are "vectorized". To explain, here’s an example of what happens when you create a vector of integers and then multiply it by 2.
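A minimal sketch of that example follows. The vector name and its values are arbitrary choices for illustration.

```r
# Create an integer vector (name and values are illustrative)
x <- 1:5

# The multiplication is applied to every element at once
doubled <- x * 2
print(doubled)
#> [1]  2  4  6  8 10
```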

In that example, the multiplication applies to every element of the vector. What’s more, this doesn’t only work with operators. A lot of common functions are vectorized, meaning that they work across vector elements in the same way. For instance, did you know that the paste functions, paste() and paste0(), are vectorized?
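To make the paste case concrete, here is a sketch with two chunks that produce the same result, one using a loop and one using a single vectorized call. The input vector and the "fruit:" prefix are assumptions chosen for illustration.

```r
words <- c("apple", "banana", "cherry")   # illustrative input

# Loop version: preallocate the result, then fill it element by element
looped <- character(length(words))
for (i in seq_along(words)) {
  looped[i] <- paste("fruit:", words[i])
}

# Vectorized version: paste() handles every element in one call
vectorized <- paste("fruit:", words)

identical(looped, vectorized)   # TRUE
```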
At this point, you might be wondering why R users don’t simply loop through vector elements. This is the standard approach in most programming languages, so why shouldn’t you do the same in R? This is the time-saving part.
The truth is that you can use loops in R, and sometimes there are good reasons to. However, because R is built around vectors, vectorized expressions often run many times faster than loops. Depending on your code, you can often see 10x speed gains with vectorization. Here’s an example comparing the run time for the two chunks of code in the paste example above.
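One way to run such a comparison using only base R is sketched below. The helper names paste_loop and paste_vec, the input data, and the repetition count are all assumptions for this sketch. system.time() is much coarser than a dedicated benchmarking package, so the numbers it reports will not match the microsecond figures quoted in this article.

```r
words <- rep(c("apple", "banana", "cherry"), times = 1000)  # illustrative input

# Hypothetical helpers wrapping the two approaches
paste_loop <- function(v) {
  out <- character(length(v))
  for (i in seq_along(v)) out[i] <- paste("fruit:", v[i])
  out
}
paste_vec <- function(v) paste("fruit:", v)

# Both must give the same answer before a timing comparison means anything
stopifnot(identical(paste_loop(words), paste_vec(words)))

# Repeat each call so the elapsed time is large enough to measure
loop_time <- system.time(for (k in 1:50) paste_loop(words))["elapsed"]
vec_time  <- system.time(for (k in 1:50) paste_vec(words))["elapsed"]

c(loop = unname(loop_time), vectorized = unname(vec_time))
```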
The results of this code will depend a little on the specs of your machine. On my computer, the loop takes an average of 180 microseconds to run, whereas the vectorized expression takes just 18 microseconds, ten times faster! By itself, saving 162 microseconds might not seem like a big deal. But when you’re running these same expressions hundreds of thousands of times over, this can make a huge difference to your total run time.
Speed aside, vectorized expressions can also save you space. For loops are often bulky and require initializing variables before the loop. Vectorization whittles down these lengthy code chunks to a couple of lines, or even a single line in the example above. With that, you’ll have faster code that requires less time and effort to read through; you can’t lose!
Going further with vectorization
Although I’ve given a brief introduction to vectorization, diving deeper into this concept is a must for advanced R users. Once you’re more experienced and want to learn more about vectors and other data types in R, read this chapter in Hadley Wickham’s "Advanced R".
If you’re new to vectorization, the first thing to do is simply try it out. Look over the functions you use often and see if they’re vectorized. Work some vectorized operations into your scripts. After a while, it’ll become second nature and you’ll be able to think in terms of vectors and not loops. Give it a go, and enjoy all the time you save in the process.
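A simple way to check whether a function is vectorized is to pass it a vector and see whether you get one result per element. The sketch below does that, then shows a scalar-only helper and two ways to apply the same logic across a vector. The helper names label_one and label_all are hypothetical.

```r
x <- c(1, 4, 9)

# Vectorized functions return one result per input element
sqrt(x)                      # 1 2 3
nchar(c("a", "bb", "ccc"))   # 1 2 3

# A hypothetical helper built on if/else handles only a single value
label_one <- function(v) if (v > 5) "high" else "low"

# ifelse() is the vectorized counterpart of if/else
label_all <- function(v) ifelse(v > 5, "high", "low")
label_all(x)                 # "low" "low" "high"

# When no vectorized equivalent exists, sapply() applies the
# scalar helper to each element in turn
sapply(x, label_one)         # "low" "low" "high"
```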