The world’s leading publication for data science, AI, and ML professionals.

Influential Communities in Social Network : Simplified

Take a peek into the world of discovering influential groups in a large scale network

s

Why study Communities in Social Networks?

Social networks are a prime example of large scale real-world networks. They exhibit varying behavior from densely packed sections of the network to sparsely connected individuals, and from rigid connections to highly dynamic sections of the graph. This makes them an interesting problem to analyse among Data Scientists.

Community or modular structure is considered to be a significant property of large scale real-world graphs. Disciplines where systems are often represented as graphs, such as sociology, biology and computer science also contain examples of such large scale networks and community structures present in them, whose applications can provide great incentives to carry research in this field.

How do you formally define influential communities?

Influence of an individual is defined by how many connections does the individual have. Obviously, an intuition comes to mind about individuals who are connected to highly ‘influential’ people but don’t have too many connections of their own. These people should also be considered influential.

While there have been various proposed definitions of influence handling such cases, the most prevailing and accepted definition of influence is still just the number of connections the individual have and we will work with that for now.

A community is only as strong as it’s weakest link. While defining the influence of a community, many different methods have been proposed like taking the average of all the individuals present in the communities or just simply the sum. But the definition that has been widely agreed upon and commonly used is using the influence of the least influential individual present in the community as the influence of the community.

What is a k-core?

The k-core of a graph is calculated by removing all the nodes in the graph with number of connections or degree less than ‘k’. Note that, after the removal of all such nodes from the graph, if the new graph now contains nodes with degree less than ‘k’ (due to the removal of those previous nodes), they are also removed and this process is repeated until all the nodes in the graph have degree atleast ‘k’.

A k-influential community is defined as a community (a connected subgraph) whose influence is atleast ‘k’. In layman terms, every connected component present in the k-core of a graph is a k-influential community. This has been leveraged by a lot of algorithms, allowing them to work with finding k-cores and relating them to communities.

Memory Issues

As should be evident by the term ‘large scale’ graphs, one of the biggest issues that these algorithms face is not being able to handle the complete data at the same time due to memory restrictions. These restrictions force the researchers to be creative and introduce local and online searches. These methods work on heuristics that may not return the globally best results, but are fairly close to it and can work on ridiculously large graphs.

What’s next?

Influential community search is a relatively younger field and even the basic definitions like the definition of a community, its influence etc. are still under debate, as pointed out in the discussions above. Thus, one possible direction of future work can be exploring even more definitions of cohesiveness in order to obtain tightly knit communities.


The details presented in the blog are minimal. Please refer to our paper here for a more detailed survey.


This blog is a part of an effort to create simplified introductions to the field of Machine Learning. Follow the complete series here

Machine Learning : Simplified

Or simply read the next blog in the series

Deep Learning – Model Optimization and Compression: Simplified


References

[1] Ganesh, Prakhar, Saket Dingliwal, and Rahul Agarwal. "Literature Survey on Finding Influential Communities in Large Scale Networks." arXiv preprint arXiv:1902.01629 (2019). [2] Rong-Hua Li, Lu Qin, Jeffrey Xu Yu, and Rui Mao. Influential community search in large networks. Proc. VLDB Endow., 8(5):509– 520, January 2015 [3] Jianxin Li, Xinjue Wang, Ke Deng, Xiaochun Yang, Timos Sellis, and Jeffrey Yu. Most influential community search over large social networks. pages 871–882, 04 2017. [4] Rong-Hua Li, Lu Qin, Jeffrey Xu Yu, and Rui Mao. Finding influential communities in massive networks. The VLDB Journal, 26(6):751–776, December 2017 [5] Fei Bi, Lijun Chang, Xuemin Lin, and Wenjie Zhang. An optimal and progressive approach to online search of top-k influential communities. Proc. VLDB Endow., 11(9):1056–1068, May 2018


Related Articles