Code

Most of my code is directly available on GitHub, either on my personal page or on the pages of my collaborators. What follows is a non-exhaustive list of the most useful software I have developed over the past few years.

MCMC sampling of the Simplicial Configuration Model (SCM)

The Simplicial Configuration Model is a random null model for simplicial complexes, mathematical objects that can be seen as higher-order generalizations of simple graphs (they incorporate multi-node interactions). This C++ program (and library) is the reference implementation of the Markov chain Monte Carlo (MCMC) sampler discussed in this paper.

MCMC sampling of the Stochastic Block Model (SBM)

Basic sampling tool for the canonical SBM. Implements the Metropolis-Hastings algorithm, both for sampling from the posterior distribution of the model and for maximizing its likelihood (through simulated annealing). The algorithm is not as efficient as belief propagation, but it works on dense graphs. See the Supplemental Material of this paper for more information.
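To make the idea concrete, here is a minimal Python sketch of Metropolis-Hastings on block assignments. It is not the code from the repository: it assumes a simplified Bernoulli SBM with only two fixed connection probabilities (`p_in` and `p_out`, both hypothetical parameters), a uniform prior over assignments, and single-node relabeling moves. All function names are made up for illustration. An inverse-temperature schedule `beta_schedule` turns the same loop into simulated annealing.

```python
import math
import random


def log_likelihood(adj, labels, p_in, p_out):
    """Log-likelihood of a Bernoulli SBM with two fixed probabilities (assumed model)."""
    n = len(adj)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if labels[i] == labels[j] else p_out
            total += math.log(p) if adj[i][j] else math.log(1.0 - p)
    return total


def metropolis_hastings(adj, n_blocks, p_in, p_out, n_steps,
                        beta_schedule=None, seed=0):
    """Single-node Metropolis-Hastings moves on the block assignment.

    With beta_schedule=None the chain targets the posterior under a uniform
    prior (beta = 1). Passing an increasing schedule gives simulated annealing.
    """
    rng = random.Random(seed)
    n = len(adj)
    labels = [rng.randrange(n_blocks) for _ in range(n)]
    current = log_likelihood(adj, labels, p_in, p_out)
    best_labels, best = list(labels), current
    for step in range(n_steps):
        beta = 1.0 if beta_schedule is None else beta_schedule(step)
        node = rng.randrange(n)
        old = labels[node]
        labels[node] = rng.randrange(n_blocks)  # symmetric proposal
        # Recomputing the full likelihood is wasteful, but keeps the sketch short.
        proposed = log_likelihood(adj, labels, p_in, p_out)
        delta = proposed - current
        if delta >= 0 or rng.random() < math.exp(beta * delta):
            current = proposed  # accept the move
            if current > best:
                best_labels, best = list(labels), current
        else:
            labels[node] = old  # reject and restore the previous label
    return labels, best_labels, best
```

Calling `metropolis_hastings(adj, 2, 0.9, 0.1, 500)` on a symmetric 0/1 adjacency matrix `adj` returns the final state of the chain together with the best assignment visited and its log-likelihood. A practical sampler would update the likelihood incrementally instead of recomputing it at every step.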

Structural Preferential Attachment (SPA+)

SPA+ is a stochastic growth process that generates realistic networks with a modular structure. The first version of the model was introduced by L. Hébert-Dufresne et al. in a series of two papers, back in 2011-2012. While this initial version, dubbed SPA, accurately reproduced most mesoscopic properties of real networks, it made a strong assumption about the structure of the communities of a network, namely that real communities are fully connected cliques of nodes (a hypothesis reminiscent of the work of Palla et al. [2005], for instance). This hypothesis can of course be relaxed, but not in a straightforward and natural way. This observation prompted the development of an extended model (SPA+) that accounts for the heterogeneous nature of community density in real networks. See the associated publication for more information. The repository contains a fast and robust C++ implementation of both the simple model (SPA) and the extended model (SPA+).
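For readers unfamiliar with this family of models, the following Python sketch shows one way a growth process of this kind can be written. It is an illustration under stated assumptions, not the repository's C++ code: the growth rule (at each step a new or existing node joins a new or existing community, with existing nodes chosen proportionally to their number of memberships and existing communities proportionally to their size) is my summary of the simple SPA model, and the names `grow_spa`, `p_new_node` and `p_new_community` are hypothetical. The density extension of SPA+ is not covered.

```python
import random


def grow_spa(n_steps, p_new_node, p_new_community, seed=0):
    """Grow a set of overlapping communities by preferential attachment (sketch)."""
    rng = random.Random(seed)
    communities = [[0]]   # start from one node in one community
    node_slots = [0]      # one entry per membership: node id
    community_slots = [0]  # one entry per membership: community id
    n_nodes = 1
    for _ in range(n_steps):
        if rng.random() < p_new_node:
            node = n_nodes
            n_nodes += 1
        else:
            # Uniform draw over memberships = proportional to membership count.
            node = rng.choice(node_slots)
        if rng.random() < p_new_community:
            communities.append([])
            community = len(communities) - 1
        else:
            # Uniform draw over memberships = proportional to community size.
            community = rng.choice(community_slots)
        if node in communities[community]:
            continue  # assumed rule: ignore a node joining a community twice
        communities[community].append(node)
        node_slots.append(node)
        community_slots.append(community)
    return communities


def clique_edges(communities):
    """Edges under the simple SPA hypothesis: every community is a clique."""
    edges = set()
    for members in communities:
        for a in members:
            for b in members:
                if a < b:
                    edges.add((a, b))
    return edges
```

For example, `clique_edges(grow_spa(1000, 0.5, 0.1))` yields an edge set in which every community is fully connected, which is exactly the assumption that SPA+ relaxes.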

Hierarchical Preferential Attachment (HPA)

The Hierarchical Preferential Attachment model (HPA) is a direct extension of SPA, and contains the latter as a special case. It is described at length in our Physical Review E paper. This repository does not contain a full implementation of the model (yet?), but provides useful analytical tools to play around with the mean-field equations.

cascading_detection

This repository contains a Python implementation of the cascading detection meta-algorithm. It is a thin wrapper around existing community detection algorithms. It keeps track of nodes’ identities through multiple passes of these algorithms, and produces extensive logs containing the time spent on each pass, the edge list and clusters detected at each pass, etc. The wrapper currently handles CFinder (Palla et al.), link clustering (Ahn et al.) and GCE (Lee et al.), i.e. all the algorithms that are analyzed in the article.
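The structure of such a wrapper can be sketched in a few lines of Python. This is a hedged illustration, not the repository's interface: the function names, the log format, and the toy detector are all hypothetical. It also assumes that a pass consists of running the detector, then removing every edge internal to a detected cluster before the next pass, and that "keeping track of identities" means relabeling nodes to contiguous integers for the detector and mapping them back afterwards.

```python
import time


def largest_component(edges):
    """Toy stand-in for a real detector: returns only the largest component."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    seen, components = set(), []
    for start in neighbors:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        components.append(component)
    return [max(components, key=len)] if components else []


def cascade(edges, detect, max_passes=5):
    """Run `detect` repeatedly, logging each pass (assumed cascading rule)."""
    log = []
    remaining = list(edges)
    for pass_index in range(max_passes):
        if not remaining:
            break
        # Relabel nodes to 0..k-1 for the detector and remember the way back.
        originals = sorted({node for edge in remaining for node in edge})
        to_local = {node: i for i, node in enumerate(originals)}
        local_edges = [(to_local[u], to_local[v]) for u, v in remaining]
        start = time.perf_counter()
        local_clusters = detect(local_edges)
        elapsed = time.perf_counter() - start
        clusters = [{originals[i] for i in c} for c in local_clusters]
        log.append({"pass": pass_index, "seconds": elapsed,
                    "edges": list(remaining), "clusters": clusters})
        # Assumed rule: drop edges that lie inside any detected cluster.
        kept = [(u, v) for u, v in remaining
                if not any(u in c and v in c for c in clusters)]
        if len(kept) == len(remaining):
            break  # nothing was removed, so further passes would repeat
        remaining = kept
    return log
```

Any callable that maps an edge list to a list of node sets can be passed as `detect`, so an external tool would only need a small adapter that writes the edge list, runs the program, and parses its output.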

gists

I sometimes write small, self-contained examples to teach myself new languages and features. The most successful outcomes of these experiments are gathered on my GitHub gist page. This currently includes C/C++ examples for HDF5 and boost::MPI, as well as Python snippets for several modules, such as networkx, numpy, and PIL.

cli-stats

cli-stats is a small bundle of fast C++ applications that replace some R command-line calls.