web analytics

logo Meta Data Science

By Massoud Seifi, Ph.D. Data Scientist

Publications

The Power of Consensus: Random Graphs Have No Communities

Published by IEEE CPS, 2013; with Romain Campigotto and Jean-Loup Guillaume

Abstract: Communities are a powerful tool to describe the structure of complex networks. Algorithms aiming at maximizing a quality function called modularity have been shown to effectively compute the community structure. However, some problems remain: in particular, it is possible to find high modularity partitions in graph without any community structure, in particular random graphs. In this paper, we study the notion of consensual communities and show that they do not exist in random graphs. For that, we exhibit a phase transition based on the strength of consensus: below a given threshold, all the nodes belongs to the same consensual community; above this threshold, each node is in its own consensual community.

Stable Community Cores in Complex Networks

Book (in French), published by Presses Académiques Francophones, September 2012; Original title: Cœurs stables de communautés dans les graphes de terrain

Abstract: In many contexts, sets of related entities can be modeled by graphs, in which entities are represented by nodes and relationships between these entities by edges. These graphs, which we call complex networks, may be encountered in the real world in various fields such as social sciences, computer science, biology, transportation, language, etc.

Most complex networks are composed of dense subgraphs weakly interconnected called communities and many algorithms have been proposed to identify the community structure of complex networks automatically. In this book, we focused on the problems of community detection algorithms, including their non-determinism and the instability that results. We presented a methodology that takes advantage of this non-determinism to improve the results obtained with current community detection techniques.

We proposed an approach based on the concept of strong communities, or community cores, and we showed the improvement made by our approach by applying it to real and artificial graphs. We also studied the structure of cores in random graphs and we showed that unlike classical community detection algorithms which can find communities in graphs with no intrinsic community structure, our approach clearly indicates the absence of community structure in random graphs and, in this way, allows to distinguish between random and real graphs.

We also studied the evolution of cores in dynamical networks using a simple and controllable simulated dynamic and a real dynamic. We showed that cores are much more stable than communities obtained by current community detection techniques and our approach can overcome the disadvantages of stabilized methods that have been recently proposed.

Community cores in evolving networks

Published by WWW ‘12 Companion / ACM, 2012; with Jean-Loup Guillaume

Abstract: Community structure is a key property of complex networks. Many algorithms have been proposed to automatically detect communities in static networks but few studies have considered the detection and tracking of communities in an evolving network. Tracking the evolution of a given community over time requires a clustering algorithm that produces stable clusters. However, most community detection algorithms are very unstable and therefore unusable for evolving networks. In this paper, we apply the methodology proposed in [seifi2012] to detect what we call community cores in evolving networks. We show that cores are much more stable than “classical” communities and that we can overcome the disadvantages of the stabilized methods.

Stable community cores in complex networks

Published by Springer, 2012; with Jean-Loup Guillaume, Ivan Juiner, Jean-Baptiste Rouquier and Svilen Iskrov

Abstract: Complex networks are generally composed of dense sub-networks called communities. Many algorithms have been proposed to automatically detect such communities. However, they are often unstable and behave nondeterministically. We propose here to use this non-determinism in order to compute groups of nodes on which community detection algorithms agree most of the time.We show that these groups of nodes, called community cores, are more similar to Ground Truth than communities in real and artificial networks. Furthermore, we show that in contrary to the classical approaches, we can reveal the absence of community structure in random graphs.

Interactive multiscale visualization of large graphs. Application to a network of weblogs

Published(in French) by Revue d’Intelligence Artificielle / Lavoisier, 2012; Original title: Visualisation interactive multi-échelle des grands graphes. Application à un réseau de blogs

Abstract: Many real-world networks can be represented as large graphs. Reducing the complexity of a graph so that it can be easily interpreted by the human eye is a valuable help to understand and analyze this type of data. We compare two approaches to grouping of nodes in communities and propose a multi-scale interactive visualization of large graphs based on these hierarchical classifications of nodes that allow us to represent these graphs in a legible and interpretable way. We then apply our methodology to a network of French-speaking weblogs to quickly illustrate the advantages and disadvantages of this approach.

Maps of paedophile activity

Technical report (Measurement and Analysis of P2P Activity Against Paedophile Content project), 2010; With Matthieu Latapy, Clémence Magnien and Raphaël Fournier

Abstract: As policy-making and law enforcement institutions generally operate at the national level, or at least at a regional level (Europe for instance), we studied geolocated recordings available in a large dataset obtained by a measurement of keyword-based queries submitted to a large P2P server. We observed that the fractions of paedophile queries may be orders of magnitude larger in some countries than in others. We also investigated the possibility of building topical maps of activity using community detection methods, which cluster paedophile files in a small number of communities. Obtained maps show that some communities do indeed have a high fraction of paedophile files (compared to others), but the relations between them are still unclear.

Comments