[ Paper Summary ] An Introduction to Independent Component Analysis: InfoMax and FastICA algorithms

Jae Duk Seo
Towards Data Science
4 min read · Sep 5, 2018


Please note that this post is for my future self to look back and review the materials on this paper.

Paper from this website

Abstract

In this paper the authors give an introduction to independent component analysis (ICA), which differs from principal component analysis (PCA) in that it optimizes for statistical independence of the given data.

Introduction

Nowadays it is extremely easy to perform some kind of data analysis, and there is a variety of methods to choose from, such as PCA or factor analysis. One important concept related to this is the distribution of the given data, specifically the normality of the distribution, because it determines whether certain methods will succeed at decomposing certain data. As the paper's examples show, depending on the method and the assumed distribution, some methods correctly identify the independent signals while others fail. (Note that ICA has its own limitations as well, related to the permutations and signs of the recovered components, but there is also a related method called independent vector analysis.)

Theoretical foundations of ICA

In this section the authors briefly discuss the basic principles of ICA, such as finding the un-mixing matrix, which is the inverse of the mixing matrix (see the sketch after the list below). The authors also state five assumptions behind ICA:
1) The sources are statistically independent.
2) The mixing matrix is square and of full rank.
3) There is no external noise.
4) The data is zero mean.
5) The source signals must not have a Gaussian probability density function (at most one of them may be Gaussian).
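
A minimal sketch of the model these assumptions refer to, written in the usual notation (observed mixtures x, mixing matrix A, sources s, un-mixing matrix W) rather than quoted from the paper:

```latex
\mathbf{x} = \mathbf{A}\mathbf{s},
\qquad
\hat{\mathbf{s}} = \mathbf{W}\mathbf{x},
\qquad
\mathbf{W} \approx \mathbf{A}^{-1}
```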

Statistical independence

When we have two random variables x1 and x2, we can define uncorrelatedness between them by the equation below.
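
In its usual textbook form (not quoted from the paper), the uncorrelatedness condition is that the covariance vanishes:

```latex
\operatorname{E}[x_1 x_2] - \operatorname{E}[x_1]\,\operatorname{E}[x_2] = 0
```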

On the other hand, we can define statistical independence as in the equation below.
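
Again in its usual textbook form, statistical independence is the stronger requirement that the joint density factorizes into the marginals:

```latex
p(x_1, x_2) = p(x_1)\, p(x_2)
```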

In the specific case where the joint pdf is Gaussian, uncorrelatedness is equivalent to independence. The authors introduce two approaches for measuring independence: minimization of mutual information and maximization of non-Gaussianity. (The two lead to the same solution.)

Minimization of mutual information

When we have two variables X and Y, mutual information can be seen as the reduction of uncertainty about X after observing Y. Therefore, by having an algorithm that seeks to minimize mutual information, we are searching for components (latent variables) that are maximally independent. (InfoMax is the name of this algorithm.)
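
Written in terms of entropies (the standard information-theoretic definition, restated here rather than quoted from the paper), this is:

```latex
I(X; Y) = H(X) - H(X \mid Y) = H(X) + H(Y) - H(X, Y)
```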

Maximization of non-Gaussianity

When we have two variables X and Y, we can achieve independence by forcing each of them to be as far from the normal distribution as possible. To do this we measure non-Gaussianity using negentropy, a non-negative quantity that is zero only for a Gaussian, and in practice we compute an approximation of negentropy rather than calculating it directly. (FastICA is the name of this algorithm.)
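
Stated generically (my own restatement, with G a non-quadratic contrast function, \nu a standard Gaussian variable, and y_gauss a Gaussian variable with the same variance as y), negentropy and its common approximation look like:

```latex
J(y) = H(y_{\mathrm{gauss}}) - H(y)
\qquad\text{and}\qquad
J(y) \approx \bigl[\operatorname{E}\{G(y)\} - \operatorname{E}\{G(\nu)\}\bigr]^{2}
```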

I am not going to write anything about ‘How to use the ICA packages’, since that section just covers the use of high-level APIs.

Example / Discussion

The authors of this paper mixed three images and applied different ICA methods to extract the original signals.

As the paper's results show, the ICA methods are able to extract the signals (the original images) more clearly than PCA. Additionally, there are multiple points to take into consideration when performing ICA, such as whitening the data. Both FastICA and InfoMax are robust; however, the right type of source distribution must be specified beforehand.
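
A rough illustration of this kind of experiment, sketched with synthetic one-dimensional signals instead of the paper's images and with scikit-learn's high-level FastICA and PCA estimators (so the signal choices and parameters here are my own assumptions, not the paper's setup):

```python
# Mix three independent, non-Gaussian sources with a random square mixing
# matrix, then try to recover them with FastICA; PCA is included only for
# contrast, since it merely decorrelates the mixtures.
import numpy as np
from sklearn.decomposition import FastICA, PCA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)

s1 = np.sin(2 * t)                       # sinusoidal source
s2 = np.sign(np.sin(3 * t))              # square-wave source
s3 = rng.laplace(size=t.size)            # super-Gaussian noise source
S = np.c_[s1, s2, s3]                    # sources, one per column

A = rng.uniform(0.5, 1.5, size=(3, 3))   # square, full-rank mixing matrix
X = S @ A.T                              # observed mixtures x = As

S_ica = FastICA(n_components=3, random_state=0).fit_transform(X)
S_pca = PCA(n_components=3).fit_transform(X)
# S_ica should match the original sources up to permutation, sign and scale;
# S_pca generally will not.
```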

Reference

  1. Langlois, D., Chartier, S., & Gosselin, D. (2010). An introduction to independent component analysis: InfoMax and FastICA algorithms. Tutorials in Quantitative Methods for Psychology, 6(1), 31–38. Retrieved 4 September 2018, from http://mail.tqmp.org/RegularArticles/vol06-1/p031/p031.pdf
