
Twenty questions to test your R knowledge

Find out if you are a supeRstar with these twenty short and fun questions

Nguyen Dang Hoang Nhu on Unsplash.com

While not as popular as Python, R has a strong and growing user base, and as an applied statistician it will always be my language of choice. In my experience, there are many types of R users. There are those who scrape by on just enough knowledge to finish their stats homework assignments, those who use it on a regular basis but mostly work around convenient data wrangling packages like dplyr, and those with a deep knowledge of the language and its underlying structures.

Where do you sit? Here are twenty questions to test your R knowledge. Try to answer them without actually running code, and then check the answers below to see how you did. Then maybe get a friend to test themselves and compare notes. I wrote this quiz for fun, so please don’t use it for serious stuff like Data Science interviews – it’s not intended for that!

The Quiz

Question 1: x <- vector(). What is the data type of x?

Question 2: y <- 2147483640L:2147483648L. What is the data type of y?

Question 3: z <- 0/0. What is the class of z?

Question 4: If v <- complex(1,1), what is the output of c(v, TRUE)?

Question 5: A homogeneous 1-D and 2-D data structure in R is called an atomic vector and matrix respectively. What is the name for a) a 1-D heterogeneous data structure, b) a 2-D heterogeneous data structure and c) an n-dimensional data structure where n > 2?

Question 6: What is the significance of the terms Camp Pontanezen and Kite-eating Tree to R? What is the origin of these terms?

Question 7: What will happen in each of the following cases if the package dplyr is not installed?

Case 1:

library(dplyr)
mtcars %>% 
  group_by(cyl) %>% 
  summarize(mean_mpg = mean(mpg))

Case 2:

require(dplyr)
mtcars %>% 
  group_by(cyl) %>% 
  summarize(mean_mpg = mean(mpg))

Question 8: a <- c(2, "NA", 3). What is the output of sum(is.na(a))?

Question 9: What is the output of data()?

Question 10: What is the output of round(0.5)?

Question 11: Which of these packages is not loaded when you run the command library(tidyverse)? a) dplyr b) tidyr c) broom d) ggplot2

Question 12: In the latest R version, which of the following three code snippets does not correctly apply a function across the elements of a list l?

## A
lapply(l, function(x) x + 10)
## B
lapply(l, x -> x + 10)
## C
lapply(l, \(x) x + 10)

Question 13: Take a look at the output of this code and note that it is not producing the sum of each row as we might expect. Without editing the existing code, what needs to be added to the code to correct this?

library(dplyr)
df <- data.frame(x = c(1, 2), y = c(1, 2), z = c(1,2))
df %>%
  mutate(sum = sum(x, y, z))
##   x y z sum
## 1 1 1 1   9
## 2 2 2 2   9

Question 14: Which of these packages allows users to run code from the Julia language in R? a) JuliaCall b) RJulia c) JuliaR

Question 15: Which of the following is a function in the latest version of dplyr? a) c_across b) r_across c) l_across d) s_across

Question 16: Why will the following code not work and what would need to be added to make it work?

library(tidyverse)
mtcars %>%
  nest_by(cyl) %>%
  dplyr::mutate(
    ggplot(data = data,
           aes(x = hp, y = mpg)) +
           geom_point()
  )

Question 17: What function would be used to generate random numbers from a uniform distribution?

Question 18: If x <- factor(c(4, 5, 6)) what is the output of as.numeric(x)?

Question 19: Again with x <- factor(c(4, 5, 6)) what is the difference between the outputs of str(x) and typeof(x)?

Question 20: Look at the two code snippets below. If they are both run in the latest version of R, why will Snippet A succeed but Snippet B fail?

library(dplyr)
## Snippet A
mtcars %>%
  filter(grepl("Mazda", rownames(.)))
## Snippet B
mtcars |>
  filter(grepl("Mazda", rownames(.)))

Answers

  1. x is logical. This is the default type for an atomic vector.
  2. y is a double, despite the use of the integer notation L in y. This is because the maximum value for an integer in R is 2147483647. The last value of y therefore cannot be stored as an integer and is treated as a double, and since atomic vectors are homogeneous, the entire vector is a double.
  3. z is of the class numeric.
  4. The output is a vector with two elements, both 1 + 0i. Note that the first argument of complex() is length.out, indicating the length of the complex vector. So complex(1,1) evaluates to 1 + 0i but complex(1,1,1) evaluates to 1 + 1i. Note that TRUE will be coerced to its complex equivalent, 1 + 0i.
  5. a) List; b) Data frame; c) Array
  6. They are the nicknames of R version releases. All version nicknames are taken from old Peanuts comic strips.
  7. In Case 1, the first line will generate an error indicating that there is no such package installed, and execution will stop. In Case 2, the first line will generate a warning, but the second line will still be executed, and will generate an error because it cannot find %>% (assuming magrittr is not attached). This is a good illustration of the difference between library() and require(). library() attaches a package and throws an error if it cannot, whereas require() tries to attach the package and returns TRUE if it succeeds and FALSE (with a warning) otherwise. Because execution carries on after a failed require(), using it can make it more difficult to debug your code.
  8. This evaluates to zero. Note that "NA" is a character string and not a missing value.
  9. The output is a list of all the data sets available in the currently loaded packages, which by default includes the inbuilt data sets in R.
  10. The output is zero. R follows the IEC 60559 standard, where .5’s round to the nearest even number.
  11. broom is not loaded, as it is not a package in the "core tidyverse". Note that install.packages("tidyverse") will install broom together with all the packages in the core and extended tidyverse.
  12. Option B will not work. R has no arrow-style function syntax, and -> is the right-assignment operator. Note that option C is the new anonymous function syntax released in R 4.1.0.
  13. The code needs to contain the line rowwise() %>% before the mutate statement to declare that the function is to be applied row-by-row.
  14. JuliaCall
  15. c_across. It is the equivalent of the across() function but for row-wise operations.
  16. This code is attempting to generate a column of plots. This would need to be declared as a list column as follows:
library(tidyverse)
mtcars %>%
  nest_by(cyl) %>%
  dplyr::mutate(
    list(ggplot(data = data,
                aes(x = hp, y = mpg)) +
                geom_point())
  )
  17. runif()
  18. The output is the vector c(1, 2, 3). Factors are converted to their integer representations.
  19. str(x) reports the structure of x, which is a Factor with 3 levels. typeof(x) gives the internal storage type of x, which is integer.
  20. Snippet B uses the new native pipe operator |>. Unlike the pipe operator %>% in Snippet A, the native pipe only pipes into the first unnamed argument of a function, and will not accept . to pipe into other arguments. To obtain the same output as Snippet A, the following use of an anonymous function will be needed:
mtcars |>
  {\(df) filter(df, grepl("Mazda", rownames(df)))}()
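
Most of the base R answers above can be checked directly in a fresh session. The following is a minimal sketch and not part of the original quiz. It assumes R 4.1.0 or later for the \(x) shorthand and the native pipe. The package name notARealPackage123 is a hypothetical placeholder for something that is not installed. The sequence for answer 2 is written with .Machine$integer.max + 1 rather than the literal 2147483648L, simply to avoid the parser warning that the literal triggers. The parenthesized anonymous function at the end is one common way to write the call, and base R subsetting stands in for dplyr's filter() so that no packages are needed.

```r
# Base R only. Assumes R >= 4.1.0 for \(x) and |> near the end.

# Answer 1: vector() defaults to mode = "logical".
x <- vector()
typeof(x)                                   # "logical"

# Answer 2: the upper bound is one past .Machine$integer.max, so the
# sequence cannot be stored as integer and comes back as double.
y <- 2147483640L:(.Machine$integer.max + 1)
typeof(y)                                   # "double"

# Answer 3: 0/0 is NaN, which is still a numeric value.
z <- 0/0
class(z)                                    # "numeric"

# Answer 4: the first argument of complex() is length.out, the second is real.
v <- complex(1, 1)
c(v, TRUE)                                  # both elements are 1+0i

# Answer 7: library() signals an error for a missing package, while
# require() only warns and returns FALSE.
# "notARealPackage123" is a hypothetical name assumed not to be installed.
pkg <- "notARealPackage123"
lib_result <- tryCatch(
  library(pkg, character.only = TRUE),
  error = function(e) "error"
)
req_result <- suppressWarnings(
  require(pkg, character.only = TRUE, quietly = TRUE)
)

# Answer 8: "NA" in quotes is an ordinary string, not a missing value.
a <- c(2, "NA", 3)
sum(is.na(a))                               # 0

# Answer 10: halves round to the nearest even number.
rounded <- round(c(0.5, 1.5, 2.5))          # 0 2 2

# Answers 18 and 19: a factor stores integer codes, not the original values.
f <- factor(c(4, 5, 6))
codes <- as.numeric(f)                      # 1 2 3
values <- as.numeric(as.character(f))       # 4 5 6
typeof(f)                                   # "integer"

# Answer 12: the \(x) shorthand is equivalent to function(x).
l <- list(1, 2)
shifted <- lapply(l, \(x) x + 10)

# Answer 20: the native pipe has no "." placeholder, so the step is wrapped
# in an anonymous function. Base R subsetting stands in for dplyr::filter().
mazdas <- mtcars |> (\(df) df[grepl("Mazda", rownames(df)), ])()
nrow(mazdas)                                # 2
```

Note the pattern as.numeric(as.character(f)) in the factor example: it is the usual way to recover the original numbers from a factor, rather than its integer codes.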

How did you do?

If you scored 5 or less, you urgently need a tutorial in base R to avoid spending too much time resolving unnecessary errors in your code.

If you scored 6–10, you likely have a similar level of knowledge to most R users.

11–15 is a very good score. You clearly know a lot of the underlying principles and structures of the R programming language.

If you scored 16–20, you are a supeRstar. You probably know a lot of needless R trivia, and you might well be an R pedant. I hope you are helping others on StackOverflow.


_Originally I was a Pure Mathematician, then I became a Psychometrician and a Data Scientist. I am passionate about applying the rigor of all those disciplines to complex people questions. I’m also a coding geek and a massive fan of Japanese RPGs. Find me on LinkedIn or on Twitter. Also check out my blog on drkeithmcnulty.com or my soon to be released textbook on People Analytics._

