
Introduction
Data scientists often need to examine whether one categorical variable is related to another in the same population. If the data is continuous, we can simply calculate the correlation coefficient between the variables and judge from it how strongly they are related. The Chi-square test performs the analogous analysis on categorical variables. For example, we may want to check whether gender plays a role in heart disease, or whether education is related to marital status. In cases like these, the Chi-square test is the appropriate analysis tool.
Background
Before jumping into the Chi-square test, I would like to provide a short refresher on the related terminology. Both the analysis and the interpretation of its output in Python require an understanding of these terms.
P-value, Confidence Interval and Significance Level
The p-value is the probability of observing data at least as extreme as ours if the null hypothesis were true, that is, if any difference between the observed and expected values arose purely by chance. If the p-value is small, data like ours would rarely arise by chance alone, and we therefore conclude that there is a statistically significant difference between the observed data and the expected data.
A confidence interval (CI) is a range of values, computed from the sample, that is likely to contain the true value of the quantity being estimated. A 95% confidence level does not mean that we are 95% confident about any single test outcome. Rather, if we repeated the sampling 100 times and computed an interval each time, about 95 of those intervals would be expected to contain the true value. The confidence level is usually set at 95%.
The significance level (alpha) is the probability of rejecting a null hypothesis when it is in fact true. It is usually set at 5%, the complement of a 95% confidence level.
Chi-square test
There are a few types of Chi-square tests. One type is the goodness-of-fit test, which checks whether the observed distribution of one categorical variable fits the distribution expected in the population. Another type checks whether one categorical variable is independent of another, and this is called the Chi-square test of independence. In this article, I will go through the Chi-square test of independence to check whether one categorical variable is related to another by examining the Chi-square statistic as well as the p-value.
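For contrast with the test of independence, here is a minimal sketch of a goodness-of-fit test using scipy.stats.chisquare. The die-roll counts are hypothetical and serve only as an illustration; they are not part of the heart disease data.

```python
from scipy.stats import chisquare

# Hypothetical counts of each face in 120 rolls of a die (illustrative only).
observed = [18, 22, 20, 25, 15, 20]
# Under the null hypothesis of a fair die, each face is expected 120 / 6 = 20 times.
expected = [20] * 6

# Compare the observed counts of ONE categorical variable with the expected counts.
gof_stat, gof_p = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {gof_stat:.2f}, p-value = {gof_p:.3f}")
```

A large p-value here would mean that the counts give no evidence against the fair-die hypothesis.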
Implementation in Python
Let’s import the data for heart disease. It contains heart-related variables such as systolic and diastolic blood pressure, diabetes, BMI, heart rate, glucose level, smoking habit and more, recorded for each individual.

Python has the chi2_contingency function in scipy.stats, to which we need to provide the contingency table. A contingency table summarizes the relation between two categorical variables by counting the observations in each combination of categories. There is also a package called Pingouin, which builds the contingency table for us if we only provide the data.
From our data, let’s say we want to check whether Coronary Heart Disease (CHD) depends on gender. Using Pingouin’s chi2_independence function, the code is a one-liner.

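The chi2_independence function returns three tables: expected, observed and stats. The expected table holds the counts we would expect in each cell of the contingency table if the two categorical variables of interest were independent, computed from the marginal totals of the initial data.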

To analyze the expected data, we first need to obtain the gender ratio in the initial data. Our data shows that the ratio between group 0 and group 1 is 2420:1820, or approximately 1.329. If gender were unrelated to CHD, and hence a bad predictor for it, the gender ratio within each CHD group should be close to this overall ratio.

The expected table reproduces this 1.329:1 ratio between gender group 0 and group 1 within each CHD group. For example, the gender ratio in group 0 of TenYearCHD is 2052.42/1543.56, which is approximately 1.329, and the same ratio holds for the other group.
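The expected counts can be sketched directly from the marginal totals. The gender totals are the ones quoted above; the CHD totals (3596 without CHD and 644 with CHD) are my own derivation from the quoted expected values, so treat them as an assumption.

```python
import numpy as np

# Gender totals are quoted in the text. CHD totals are derived from the
# quoted expected table (assumption): 3596 without CHD, 644 with CHD.
gender_totals = np.array([2420, 1820])   # gender group 0, group 1
chd_totals = np.array([3596, 644])       # TenYearCHD = 0, TenYearCHD = 1
n = gender_totals.sum()                  # 4240 individuals in total

# Expected count under independence: row total * column total / grand total.
# Rows are gender groups, columns are CHD groups.
expected = np.outer(gender_totals, chd_totals) / n

# Within each CHD group the gender ratio equals the overall ratio 2420 / 1820.
ratios = expected[0] / expected[1]
print(expected.round(2))
print(ratios.round(3))
```

By construction, every column of the expected table carries the same gender ratio, which is exactly what independence means.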
The null hypothesis states that we expect the same ratio in the observed table. We test the null hypothesis with the Chi-square statistic, which is compared with the critical value from the Chi-square table for the given degrees of freedom and significance level. The observed table shows the relationship between the gender category and CHD that is actually observed in the data (the half counts come from the continuity correction that Pingouin applies to 2x2 tables by default). If we calculate the gender ratio from the observed table, we obtain 2118.5/1477.5 = 1.434 for group 0 of TenYearCHD and 301.5/342.5 = 0.880 for group 1, both of which differ from the expected ratio. Next, we need to find the test statistic and p-value from the stats table.

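The most common test statistic is Pearson’s Chi-square statistic, which is built from Pearson residuals. A Pearson residual is defined as the difference between the observed and expected value, normalized by the square root of the expected value. The Chi-square statistic is the sum of the squared Pearson residuals over all cells of the table.
Pearson residual = (Observed - Expected)/sqrt(Expected)
Chi-square statistic = sum over all cells of (Pearson residual)^2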
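The residual formula can be sketched with NumPy. The observed counts below are reconstructed from the quoted tables (an assumption), and the continuity correction step is my reading of why the observed table shows half counts; with both in place, the result should land close to the statistic reported in the stats table.

```python
import numpy as np

# Observed counts reconstructed from the quoted tables (an assumption):
# rows are gender groups 0 and 1, columns are TenYearCHD 0 and 1.
observed = np.array([[2119, 301], [1477, 343]], dtype=float)

# Expected counts under independence, from the marginal totals.
expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()

# Continuity correction for a 2x2 table: move each observed count
# 0.5 towards its expected value (this yields 2118.5, 1477.5, ...).
corrected = observed + 0.5 * np.sign(expected - observed)

# Pearson residuals, and the Chi-square statistic as the sum of their squares.
residuals = (corrected - expected) / np.sqrt(expected)
chi2_stat = (residuals ** 2).sum()
print(residuals.round(3))
print(round(chi2_stat, 3))
```

The sign of each residual shows the direction of the deviation: a positive residual means more individuals were observed in that cell than independence would predict.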
For a single degree of freedom and a 5% significance level, the critical value for the Chi-square statistic is 3.841, and the test statistic we obtained is 32.618, which is much higher. This statistic measures the extent to which the observed data deviates from the expected values. We have also observed a very small p-value, which provides evidence against the null hypothesis. The smaller the p-value, the less plausible it is that the observed difference arose merely by chance. Therefore, in this case, we have strong evidence to reject the null hypothesis and state that the observed difference is real. Essentially, we can conclude that CHD is not independent of gender, so gender is a useful predictor for CHD.
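The critical value and the p-value can be checked against the Chi-square distribution in scipy.stats instead of a printed table. This is a minimal sketch that takes the test statistic quoted above as given.

```python
from scipy.stats import chi2

alpha = 0.05          # significance level
dof = 1               # (2 - 1) * (2 - 1) for a 2x2 contingency table
statistic = 32.618    # test statistic quoted in the text

# Critical value: the point that leaves probability alpha in the upper tail.
critical_value = chi2.ppf(1 - alpha, df=dof)

# p-value: probability of a statistic at least this large under the null hypothesis.
p_value = chi2.sf(statistic, df=dof)

print(f"critical value = {critical_value:.3f}")
print(f"p-value = {p_value:.2e}")
print("reject null hypothesis:", statistic > critical_value)
```

Comparing the statistic with the critical value and comparing the p-value with alpha are two views of the same decision, so they always agree.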
Extension of A/B test
The Chi-square test can be considered an extended version of the simple A/B test, which is performed between two groups to check whether there is any observed difference between them. One group is called the control group and the other is the treatment group. Sometimes we are interested in checking multiple treatments at once, and the Chi-square test tells us the extent to which the groups deviate from what we would expect if the treatments made no difference. For example, to compare the number of clicks on multiple versions of a redesigned webpage, we can create more than two versions and show each one to a different user group. The contingency table should then contain the number of clicks or the number of final purchases for each newer version of the webpage along with the initial page.
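A sketch of such a multi-version comparison, using scipy.stats.chi2_contingency, is given here. The click counts are hypothetical and chosen only to illustrate the shape of the contingency table.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical click counts (illustrative only): one row per page version,
# columns are [clicked, did not click].
clicks = np.array([
    [120, 880],   # control: initial page
    [150, 850],   # treatment A
    [165, 835],   # treatment B
])

# Test whether the click rate is independent of the page version.
stat, p_value, dof, expected = chi2_contingency(clicks)
print(f"chi-square = {stat:.3f}, dof = {dof}, p-value = {p_value:.4f}")
```

A significant result only says that at least one version differs; pairwise follow-up comparisons would be needed to say which one.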
Conclusion
In this article, I have described the background of the Chi-square test and demonstrated its implementation in Python. The Chi-square test is a simple statistical test for checking the independence of categorical variables. When multiple treatments need to be checked at once, we need to go beyond the simple A/B test and perform the Chi-square test.