14 Probability problems for acing Data Science interviews

Decrypting probability questions in data science interviews with style

Aakash Agrawal
Towards Data Science


photo on unsplash.com by @briansuman

Questions on probability are very common in any Data Science interview. The questions might be challenging (and tricky) but are easily tractable if you have some practice and know basic formulas and concepts. In this blog, I share some practice questions (with solutions) on different concepts in probability.

Keywords: Probability, Binomial distribution, Bayes theorem

The blog assumes the reader knows basic probability formulas and concepts. Please refer to the reference section for some good reading materials on these concepts. I recommend that readers first try to solve the questions themselves on a piece of paper and then head to the hints and solutions. Note: there might be many different ways of doing the same problem. The solution provided is just one of the ways (and might not be the only one).

Basic Probability

Q1. 3 vertices (corners) of a regular hexagon are randomly joined. What is the probability that an equilateral triangle is formed?

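Note: In a regular hexagon, all sides and angles are equal, and an equilateral triangle has all three sides equal. There are C(6, 3) = 20 ways to choose 3 of the 6 vertices, and only 2 of these choices (every alternate vertex) give an equilateral triangle, so the answer is 2/20 = 0.1.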

Answer Q1. Image by the author.
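As a quick sanity check on the counting argument, here is a minimal Python sketch that enumerates every choice of 3 vertices. The vertex labels 0 to 5 and the helper name is_equilateral are my own, chosen only for illustration.

```python
from fractions import Fraction
from itertools import combinations

# Label the hexagon's vertices 0..5 in order around the perimeter.
triples = list(combinations(range(6), 3))

def is_equilateral(triple):
    # On a regular hexagon, three vertices form an equilateral triangle
    # exactly when they are every alternate vertex: (0, 2, 4) or (1, 3, 5).
    a, b, c = triple  # combinations() yields tuples in increasing order
    return b - a == 2 and c - b == 2

favourable = sum(is_equilateral(t) for t in triples)
p_equilateral = Fraction(favourable, len(triples))
print(p_equilateral)  # 1/10
```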

Q2. 3 persons A, B, C independently fire at a target. What is the probability that (i) exactly one of them hits the target, (ii) at least one of them hits the target? Given the probabilities of hitting the target: P(A) = 1/6, P(B) = 1/4, P(C) = 1/3.

(i) For exactly one of them to hit the target, the other two must miss. There are three such cases (only A hits, only B hits, only C hits). These cases are mutually exclusive, so the probability of their union is the sum of their probabilities.

(ii) The probability that at least one of them hits the target can be found by creating several cases and taking a union, as we did in the earlier part. A much easier approach is to calculate the probability of the complementary event (nobody hits the target) and subtract it from 1. Since the firings are independent, the misses are independent too, so P(A'B'C') = P(A')P(B')P(C'), where A' denotes the event that A misses.

Answer Q2. Image by the author.
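One way to double-check both parts is to enumerate all 8 hit/miss outcomes. The sketch below uses exact fractions; the names p_hit and prob_of_outcome are illustrative only.

```python
from fractions import Fraction
from itertools import product

p_hit = {"A": Fraction(1, 6), "B": Fraction(1, 4), "C": Fraction(1, 3)}

def prob_of_outcome(outcome):
    # outcome is a tuple of booleans (hit or miss) for A, B, C in that order.
    # Independence lets us multiply the individual probabilities.
    result = Fraction(1)
    for hit, p in zip(outcome, p_hit.values()):
        result *= p if hit else 1 - p
    return result

outcomes = list(product([True, False], repeat=3))

# (i) exactly one hit: sum over the three mutually exclusive cases
p_exactly_one = sum(prob_of_outcome(o) for o in outcomes if sum(o) == 1)

# (ii) at least one hit: 1 minus the probability that everyone misses
p_at_least_one = 1 - prob_of_outcome((False, False, False))

print(p_exactly_one, p_at_least_one)  # 31/72 7/12
```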

Q3. The probability that a teacher gives a surprise test on any given day is 0.55. If a student is absent for two days, what is the probability that he misses (i) exactly one test, (ii) at most one test?

(i) Similar to the previous question. (ii) Missing at most 1 test means missing either 0 tests or 1 test.

PS: this is similar to the Uber and Lyft problem.

Answer Q3. Image by the author.
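Here is a small sketch of the calculation, assuming the tests on the two days are independent and each happens with probability 0.55 (the question does not state this explicitly). The helper p_missed is a hypothetical name.

```python
from fractions import Fraction
from math import comb

p_test = Fraction(55, 100)  # probability of a surprise test on a given day
days = 2                    # days the student is absent

def p_missed(k):
    # Binomial probability of missing exactly k tests in `days` days,
    # assuming the days are independent.
    return comb(days, k) * p_test**k * (1 - p_test)**(days - k)

p_exactly_one = p_missed(1)
p_at_most_one = p_missed(0) + p_missed(1)
print(float(p_exactly_one), float(p_at_most_one))  # 0.495 0.6975
```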

Q4. A box contains 2 defective pens and 3 working pens. Pens are tested one by one until both defective ones are discovered. What is the probability that the testing procedure comes to an end at the end of (i) 2nd testing, (ii) 3rd testing?

For the testing to end after two checks, the first two pens tested must both be defective. For the testing to end after three checks, we can list the cases (exactly one defective pen among the first two checks, followed by a defective pen on the third check) and take the union.

Answer Q4. Image by the author.
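The cases can be checked by brute force over every testing order. This sketch assumes the procedure stops only when the second defective pen has actually been tested; the pen labels and the helper stop_position are made up for the example.

```python
from fractions import Fraction
from itertools import permutations

pens = ["D1", "D2", "W1", "W2", "W3"]  # 2 defective, 3 working

def stop_position(order):
    # 1-based position at which the second defective pen is tested.
    positions = [i for i, pen in enumerate(order, start=1) if pen.startswith("D")]
    return positions[1]

orders = list(permutations(pens))  # all 120 equally likely testing orders
p_end_at_2 = Fraction(sum(stop_position(o) == 2 for o in orders), len(orders))
p_end_at_3 = Fraction(sum(stop_position(o) == 3 for o in orders), len(orders))
print(p_end_at_2, p_end_at_3)  # 1/10 1/5
```

If you also count the case where the first three pens all work (so the remaining two must be the defective ones), part (ii) gains an extra (3/5)(2/4)(1/3) = 1/10 and becomes 3/10.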

Q5. If there are 30 people in a room, what is the probability that everyone has different birthdays? Assume 365 possible birthdays in a year.

Answer Q5. Image by the author.
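A minimal sketch of the calculation: the first person can have any birthday, the second must avoid 1 day, the third must avoid 2 days, and so on. It assumes all 365 birthdays are equally likely and independent.

```python
from math import prod

n_people = 30
days = 365

# Person i (0-based) must avoid the i birthdays already taken.
p_all_different = prod((days - i) / days for i in range(n_people))
print(round(p_all_different, 4))  # 0.2937
```

So with 30 people there is roughly a 70% chance that at least two share a birthday.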

Algebraic Problems

Q6. An amoeba has a 25%, 25%, and 50% chance of producing 0, 1, or 2 offspring, respectively. Each of the amoeba’s descendants also has the same probabilities. What is the probability that the amoeba’s lineage dies out?

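For the lineage to die out, the amoeba must either produce 0 offspring, or every one of its offspring's lineages must die out. If it produces 1 offspring, that offspring's lineage must die out; if it produces 2, both lineages must die out independently. Since each descendant behaves like the original amoeba, each of these lineages dies out with the same unknown probability.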

Answer Q6. Image by the author.
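Calling the unknown extinction probability p, the argument above gives p = 0.25 + 0.25p + 0.5p², whose roots are p = 1/2 and p = 1. One hedged way to see which root applies is to track the probability that the lineage is extinct within n generations, starting from 0. The function name offspring_pgf is my own label for the right-hand side.

```python
def offspring_pgf(q):
    # Right-hand side of p = 0.25 + 0.25*p + 0.5*p**2
    return 0.25 + 0.25 * q + 0.5 * q**2

# q after n iterations = probability the lineage is extinct within n generations.
q = 0.0
for _ in range(200):
    q = offspring_pgf(q)

print(round(q, 6))  # 0.5
```

The iteration settles at the smaller root, so the lineage dies out with probability 1/2.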

Q7. The entries in a 2 x 2 matrix are integers that are independently chosen for each entry. The probability that the entry is odd is p. If the probability that the value of the determinant is even is 0.5, find p.

The probability that the determinant is odd/even can be computed by making cases for odd/even and then taking a sum of the probabilities of those cases.

Answer Q7. Image by the author.
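For a matrix with entries a, b, c, d, the determinant ad − bc is odd exactly when one of the products ad and bc is odd and the other is even. That gives P(odd) = 2p²(1 − p²), and setting this to 0.5 yields p² = 1/2, so p = 1/√2 ≈ 0.707. The sketch below checks this by enumerating the 16 parity patterns; p_det_even is a hypothetical helper name.

```python
from itertools import product
from math import sqrt

def p_det_even(p):
    # Only the parity of each entry matters: 1 = odd, 0 = even.
    total = 0.0
    for a, b, c, d in product([0, 1], repeat=4):
        prob = 1.0
        for bit in (a, b, c, d):
            prob *= p if bit else 1 - p
        if (a * d - b * c) % 2 == 0:
            total += prob
    return total

p = 1 / sqrt(2)
print(round(p_det_even(p), 6))  # 0.5
```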

Binomial Distribution

Q8. A drunkard takes either a step forward or a step backward. The probability that he takes a forward step is 0.4. Find the probability that at the end of 11 steps he is 1 step away from the starting point.

Clearly, for the drunkard to be 1 step away from the start after 11 steps, he must take either 5 forward steps (and 6 backward steps), ending 1 step behind the start, or 6 forward steps (and 5 backward steps), ending 1 step in front of the start. The two events are mutually exclusive, so the final probability is the sum of their probabilities.

Answer Q8. Image by the author.
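The two binomial terms can be evaluated with a short sketch, assuming the steps are independent. The helper p_forward_steps is an illustrative name.

```python
from math import comb

p_forward = 0.4
n_steps = 11

def p_forward_steps(k):
    # Binomial probability of exactly k forward steps out of n_steps.
    return comb(n_steps, k) * p_forward**k * (1 - p_forward)**(n_steps - k)

# 5 forward / 6 backward ends one step behind; 6 forward / 5 backward ends one step ahead.
p_one_step_away = p_forward_steps(5) + p_forward_steps(6)
print(round(p_one_step_away, 4))  # 0.3679
```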

Q9. A coin is twice as likely to land heads as tails in a series of independent tosses. Find the probability that the 3rd head occurs on the 5th toss.

For the 3rd head to occur on the 5th toss, exactly 2 heads must occur somewhere in the first 4 tosses, which is a case of the binomial distribution, and the 5th toss must itself be a head.

Answer Q9. Image by the author.
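Since heads is twice as likely as tails, P(head) = 2/3. A minimal sketch with exact fractions:

```python
from fractions import Fraction
from math import comb

p_head = Fraction(2, 3)  # heads is twice as likely as tails

# Exactly 2 heads in the first 4 tosses, then a head on the 5th toss.
p_third_head_on_fifth = comb(4, 2) * p_head**2 * (1 - p_head)**2 * p_head
print(p_third_head_on_fifth)  # 16/81
```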

Law of Total Probability

Q10. A rich lady has 4 compartments in her purse. The 1st compartment has 1 Rupee (Rs) coin and 2 Paise coins. The 2nd has 2 Rs and 3 Paise coins. The 3rd has 3 Rs and 4 Paise coins. The 4th has 4 Rs and 6 Paise coins.

She selects a compartment at random and draws a coin. What is the probability that the drawn coin is a rupee coin?

Answer Q10. Image by the author.
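By the law of total probability, we weight the chance of drawing a rupee coin from each compartment by the chance of picking that compartment. The sketch assumes each compartment is equally likely and each coin within it is equally likely to be drawn.

```python
from fractions import Fraction

# (rupee coins, paise coins) in each compartment
compartments = [(1, 2), (2, 3), (3, 4), (4, 6)]
p_compartment = Fraction(1, len(compartments))

# P(rupee) = sum over compartments of P(compartment) * P(rupee | compartment)
p_rupee = sum(p_compartment * Fraction(r, r + p) for r, p in compartments)
print(p_rupee)  # 41/105
```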

Bayes Theorem

Q11. An HIV test is 99% accurate (both ways). Only 0.3% of the population is HIV +. What is the probability that a random person is HIV + given that the person tests +?

Answer Q11. Image by the author.
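Reading "99% accurate (both ways)" as P(test + | HIV +) = 0.99 and P(test − | HIV −) = 0.99, Bayes' theorem can be sketched as follows. The variable names are my own.

```python
prevalence = 0.003   # P(HIV +)
sensitivity = 0.99   # P(test + | HIV +)
specificity = 0.99   # P(test - | HIV -)

# Law of total probability: true positives plus false positives.
p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Bayes' theorem
p_hiv_given_positive = sensitivity * prevalence / p_positive
print(round(p_hiv_given_positive, 4))  # 0.2295
```

Even with a 99% accurate test, a positive result means only about a 23% chance of actually being HIV +, because the condition is so rare.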

Q12. A speaks the truth in 70% of cases, B in 50% of cases. Find the probability that they will say the same thing while describing a certain event.

Answer Q12. Image by the author.
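A and B say the same thing when both tell the truth or both lie. The sketch below assumes A and B speak independently and that two false accounts count as agreeing (for example, a yes/no event); neither assumption is stated in the question.

```python
from fractions import Fraction

p_a_truth = Fraction(7, 10)
p_b_truth = Fraction(1, 2)

# Both truthful, or both lying (assumed independent).
p_agree = p_a_truth * p_b_truth + (1 - p_a_truth) * (1 - p_b_truth)
print(p_agree)  # 1/2
```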

Miscellaneous Card Problems

Q13. Cards are dealt one by one from a pack of 52 well-shuffled cards. What is the probability that exactly ‘k’ cards are dealt before the 1st ace appears?

We are indirectly looking for the probability that the 1st ace appears as the (k+1)th card. C(n, r) is the standard notation for combinations.

Answer Q13. Image by the author.
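One way to write this: the first k cards must all come from the 48 non-aces, which has probability C(48, k) / C(52, k), and the next card must be one of the 4 aces among the remaining 52 − k cards. A minimal sketch, with p_first_ace_after as a hypothetical function name:

```python
from fractions import Fraction
from math import comb

def p_first_ace_after(k):
    # Probability that exactly k non-aces are dealt before the first ace.
    if not 0 <= k <= 48:
        return Fraction(0)
    p_k_non_aces = Fraction(comb(48, k), comb(52, k))
    p_ace_next = Fraction(4, 52 - k)
    return p_k_non_aces * p_ace_next

print(p_first_ace_after(0))  # 1/13
print(p_first_ace_after(1))  # 16/221
```

As a sanity check, the probabilities for k = 0, …, 48 add up to 1.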

Q14. All face cards are removed from a pack of 52 well-shuffled cards. From the remaining 40 cards, 4 cards are drawn randomly. What is the probability that 4 cards are from different suits and denominations?

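Total suits = 4 (Spades, Hearts, Clubs, Diamonds). A full pack has 13 denominations (A, 2, …, 10, J, Q, K); once the face cards (J, Q, K) are removed, 10 denominations (A, 2, …, 10) remain in each suit.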

Answer Q14. Image by the author.
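A possible counting argument: the 4 cards must use each suit exactly once, and the denominations assigned to the 4 suits must all be different, which can be done in 10 × 9 × 8 × 7 ways. Dividing by the C(40, 4) possible hands gives the probability. A minimal sketch:

```python
from fractions import Fraction
from math import comb, perm

# One card per suit, with 4 distinct denominations out of the 10 remaining.
favourable = perm(10, 4)   # 10 * 9 * 8 * 7
total = comb(40, 4)        # all unordered 4-card hands from 40 cards

p_distinct = Fraction(favourable, total)
print(p_distinct, round(float(p_distinct), 4))  # 504/9139 0.0551
```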

References

[1]. (Must Read) A summarised theory of probability concepts (includes laws and theorems): https://www.cuemath.com/data/probability/

[2]. Basic Probability concepts: https://seeing-theory.brown.edu/basic-probability/index.html (chance events, expectations, etc.)

[3]. Conditional Probability: http://www.stat.yale.edu/Courses/1997-98/101/condprob.htm

[4]. Law of Total probability: https://youtu.be/7t9jyikrG7w

[5]. Bayes theorem (3blue1brown): https://youtu.be/HZGCoVF3YvM

[6]. Binomial distribution (Khan Acad.): https://youtu.be/WWv0RUxDfbs

[7]. Permutations and Combinations: https://youtu.be/XJnIdRXUi7A

[8]. Some recommended YouTube channels on probability and statistics: @jbstatistics, @TheOrganicChemistryTutor

I hope this was good practice on probability and that you enjoyed solving these questions. I would be glad to hear about any different approaches to the above problems, and happy to answer any doubts or discuss any of the questions. Feedback is highly appreciated. A clap 👏🏼 is also good feedback 😇. Good luck with your next data science interview. You can reach me via LinkedIn.

Thank you!
