Speeding up maximal clique search in igraph

Maximal clique finding has been part of igraph for quite a long time (the first commit of this algorithm was well back in 2007 IIRC), but out of sheer laziness I simply implemented it as a search for maximal independent vertex sets on the complementary graph. As Drew Conway pointed it out in one of his SNA talks, this is a terribly slow approach for sparse graphs as it takes ages to build the complementary graph. The implementation of the Bron-Kerbosch algorithm in last September (see this commit in Launchpad) speeded up things big time, and at that point I thought this is pretty much as good as it gets - but I was wrong. There was still plenty of space for improvement. Inside the Bron-Kerbosch algorithm, intersections between sorted sets are calculated all the time. igraph used the naïve set intersection algorithm in this case, which is O(n+m) where n and m are the sizes of the larger and smaller sets, respectively. Linear complexity is not bad, but there are better solutions, such as the set intersection algorithm of Ricardo Baeza-Yates, which attains log-log complexity at the expense of requiring random access to the set elements. This does not pose a limitation as the sets in question in igraph are randomly accessible, so I thought I'll give it a try. Basically, the set intersection algorithm of Baeza-Yates goes as follows:

  1. Pick the median element A of the smaller set.
  2. Search for the insertion point B of this element in the larger set.
  3. If A equals B, add A to the result set.
  4. Repeat the previous three steps for non-empty subsets to the left and right of A and B.

A big kudos goes to Erik Frey for drawing my attention to the above procedure. According to his measurements, the algorithm works even better if one uses interpolation search instead of binary search, but it did not yield performance improvements compared to the binary search case in igraph, so I decided to go with binary search only. Some quick measurements on geometric random graphs with an increasing number of vertices (while keeping the density constant) show that indeed the improvements in the set intersection algorithm speeded up the maximal clique search significantly (red line shows the old Bron-Kerbosch implementation with linear set intersection, blue line shows the new one):

Chart showing calculation times

This will be included in the next major release of igraph.

No Trackbacks

You can leave a trackback using this URL: http://sixdegrees.hu/2010/03/speeding-up-maximal-clique-search-in-igraph/trackback/

4 Comments

  1. Noah

    I am currently working on a project for which I use maximal clique finding in igraph on large graphs (10^4 vertices, 10^6 edges). Speedups would be very helpful! Any idea when the next major release of igraph will be? Is it already checked in to the LaunchPad repo?

    Posted July 12, 2010 at 8:55 pm | Permalink
  2. The above mentioned improvements are already checked in into Launchpad in the 0.6 branch. There are no precompiled packages for it yet (if you mean the packages in the Launchpad PPA) since things change too frequently at the moment; however, you can download nightly snapshots from Google Code at http://code.google.com/p/igraph. You have to compile these yourself, but if you need help, just drop me an email.

    As for the next major release, well, I don’t know, it’s Gábor’s decision, so it’s better to ask him on the mailing list.

    Posted July 12, 2010 at 9:00 pm | Permalink
  3. mark

    yeah, noah. they’re already checked in.

    Posted July 14, 2010 at 9:55 am | Permalink
  4. Noah

    I’ll give building it from source a try. Thanks for the help Tamas!

    Posted July 14, 2010 at 7:38 pm | Permalink

Post a Comment

Your email is never shared. Required fields are marked *

*
*

Powered by WP Hashcash