
6 Questions to Understand A/B Testing

Go beyond the data at hand.

Photo by Joshua Earle on Unsplash

Statistics is an integral part of data science. It not only helps us understand, explore, and evaluate data but also lets us go beyond the data at hand.

The branch of statistics that helps us reach conclusions going beyond the data at hand is called inferential statistics.

Suppose we are tasked with finding out whether college students who sleep 8 hours a day get higher grades than those who sleep less than 6 hours a day. A thorough evaluation would require the grades of all college students in this scope, which is neither possible nor reasonable.

What we do instead is take a sample from each group and make a comparison based on the sample data.

  • The population is the entire group of elements we are interested in.
  • The sample is a subset of the population.
  • College students in the US are a population. A random selection of 1,000 college students in the US is a sample drawn from this population.

This is where inferential statistics comes into play. It helps us validate our findings on the samples and determine whether the results can be generalized to the entire population.

A/B testing allows us to infer results about the population using the sample data. In this article, we will answer six questions to understand the process of A/B testing.

The questions are organized to walk briefly through the entire process, so they will help us gain an understanding of how A/B testing is performed.


Question 1

Your team is working on a new design for the company website. Before starting to use the new design, you want to see if the new design will increase the click-through rate (CTR). How would you approach this task?

Answer

This task can be solved with A/B testing. The A and B represent two different scenarios. In this case, one is the current design and the other one is the suggested new design.

The traffic to the website is randomly divided into two groups: one sees the current design and the other sees the new design. The A/B test runs for a predetermined amount of time, and then the results are analyzed to decide which design performs better.
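To make this more concrete, here is a minimal sketch in Python of the random assignment step (the visitor IDs and group labels are hypothetical, not from a real system):

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def assign_group(visitor_ids):
    """Randomly assign each visitor to group A (current design)
    or group B (new design) with equal probability."""
    return rng.choice(["A", "B"], size=len(visitor_ids))

# Hypothetical example: 10 visitors arriving at the website
visitors = list(range(10))
print(assign_group(visitors))
```

Random assignment matters because it ensures that, on average, the two groups differ only in the design they see.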


Question 2

What are the sample and population in this task?

Answer

The population is the click-through rate information for all the traffic the website will ever receive, so it is impossible to collect in full.

The sample is the same information collected over a predefined period of time, such as 30 days.


Question 3

In the A/B testing you have just mentioned, how would you define the null and alternative hypotheses?

Answer

A/B test results are evaluated against a hypothesis, so the null and alternative hypotheses must be determined before the test starts.

The null hypothesis favors the current situation and assumes no change. The alternative hypothesis supports making the change.

  • Null hypothesis: The new design does not increase the click-through rate.
  • Alternative hypothesis: The new design increases the click-through rate.

Question 4

You gather the results of the A/B test and see that the average click-through rate is higher with the new design. Do you immediately change the website to the new design?

Answer

No. Since we are comparing the samples, a statistical significance test is required to justify the results. We need to make sure the click-through rate with the new design is not higher by random chance.
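One concrete way to see what "higher by random chance" means is a permutation test: shuffle the group labels many times and check how often a difference at least as large as the observed one appears by chance alone. The sketch below is an illustration with simulated click data (1 = click, 0 = no click), not data from a real experiment:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def permutation_p_value(clicks_a, clicks_b, n_permutations=10_000):
    """Estimate how often a CTR difference at least as large as
    the observed one occurs when group labels are random."""
    observed_diff = clicks_b.mean() - clicks_a.mean()
    pooled = np.concatenate([clicks_a, clicks_b])
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign the group labels
        diff = pooled[len(clicks_a):].mean() - pooled[:len(clicks_a)].mean()
        if diff >= observed_diff:
            count += 1
    return count / n_permutations

# Simulated data: current design with a 10% CTR, new design with 12%
clicks_a = rng.binomial(1, 0.10, size=1000)
clicks_b = rng.binomial(1, 0.12, size=1000)
print(permutation_p_value(clicks_a, clicks_b))
```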


Question 5

You perform a statistical significance test (e.g. t-test or z-test) and the p-value turns out to be 0.04. Is it enough to justify that the new design is better?

Answer

First of all, the p-value alone is not enough to make a decision. It must be compared against a significance level defined before the test.

A p-value of 0.04 means that, assuming the null hypothesis is true, there is a 4% chance of observing a difference at least as large as the one we measured purely by random chance.

The confidence level should be set prior to A/B testing. If the confidence level is set to 95%, the corresponding significance level is 5% (i.e. 0.05), so we need a p-value of less than 0.05 to reject the null hypothesis in favor of the alternative, which is that the new design increases the click-through rate.
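As an illustration, such a test can be run with the proportions_ztest function from statsmodels; the click and impression counts below are hypothetical:

```python
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical results after 30 days of the experiment
clicks = [150, 120]         # clicks: new design (B), current design (A)
impressions = [1000, 1000]  # visitors who saw each design

# alternative="larger" tests whether the first proportion (new design)
# is larger than the second (current design)
stat, p_value = proportions_ztest(count=clicks, nobs=impressions,
                                  alternative="larger")
print(f"z-statistic: {stat:.3f}, p-value: {p_value:.4f}")

alpha = 0.05  # significance level matching a 95% confidence level
if p_value < alpha:
    print("Reject the null hypothesis: the new design performs better.")
else:
    print("Fail to reject the null hypothesis.")
```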


Question 6

What are Type I and Type II errors?

Answer

In hypothesis testing, even when the results are statistically significant, there is still a small chance that our decision is wrong. These mistakes are classified as Type I and Type II errors.

A Type I error, also known as a "false positive", is the incorrect rejection of a true null hypothesis, which amounts to incorrectly accepting the alternative hypothesis.

In our case, we accepted the alternative hypothesis, which is that the new design increases the click-through rate. If it turns out that the new design does not actually increase the click-through rate, we have made a Type I error.

A Type II error, also known as a "false negative", is the incorrect acceptance of a false null hypothesis, which amounts to incorrectly rejecting the alternative hypothesis.

In our example, if we conclude that the new design does not increase the click-through rate when it actually does, we have made a Type II error.
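To see how often a Type I error occurs in practice, the simulation below (my own sketch, with made-up numbers) repeatedly runs an A/B test in a world where the null hypothesis is true, i.e. both designs have the same true CTR, and counts how often a test at the 5% level rejects it anyway:

```python
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

rng = np.random.default_rng(seed=1)
n_experiments, n_visitors, true_ctr, alpha = 2000, 1000, 0.10, 0.05

false_positives = 0
for _ in range(n_experiments):
    # Both groups share the same true CTR, so the null hypothesis holds
    clicks_a = rng.binomial(n_visitors, true_ctr)
    clicks_b = rng.binomial(n_visitors, true_ctr)
    _, p_value = proportions_ztest([clicks_b, clicks_a],
                                   [n_visitors, n_visitors],
                                   alternative="larger")
    if p_value < alpha:
        false_positives += 1

# The rate should land close to alpha (about 0.05)
print(f"Type I error rate: {false_positives / n_experiments:.3f}")
```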


Conclusion

Statistical knowledge is a must-have for data scientists. If you plan to become a data scientist, make sure to learn both descriptive and inferential statistics to a decent level.

In this article, we have covered a typical A/B testing process, including key terms and concepts such as hypothesis testing, the p-value, confidence level, sample, and population.



Thank you for reading. Please let me know if you have any feedback.

