Thursday, June 25, 2015

Edge Connectivity through Boost Graph Library

After two weeks, we have managed to interface Boost and Sagemath!

However, the interface was not as simple as it seemed. The main problem we found is the genericity of Boost: almost all Boost algorithms work with several graph implementations, which differ in the data structures used to store edges and vertices. For instance, the code that implements breadth-first search works if the adjacency list of a vertex v is a vector, a list, a set, etc. This result is accomplished by using templates [1]. Unfortunately, the only way to interface Sagemath with C++ code is Cython, which is not template-friendly, yet. In particular, Cython provides genericity through fused types [2], whose support is still experimental, and which do not offer full integration with templates [3-5].

After a thorough discussion with David, Nathann, and Martin (thank you very much!), we have found a solution: for the input, we have defined a fused type "BoostGenGraph", including all Boost graph implementations, and all functions that interface Boost and Sagemath use this fused type. This way, for each algorithm, we may choose the most suitable graph implementation. For the output, whose type might be dependent on the input type, we use C++ to transform it into a "standard" type (vector, or struct).

We like this solution because it is very clean, and it allows us to exploit Boost genericity without any copy-paste. Still, there are some drawbacks:
1) Cython fused types do not allow nested calls of generic functions;
2) Boost graphs cannot be converted to Python objects: they must be defined and deleted in the same Cython function;
3) No variable can have a generic type, apart from the arguments of generic functions.

These drawbacks will be overcome as soon as Cython makes templates and generic types interact: this way, we will be able create a much stronger interface, by writing a graph backend based on Boost, so that the user might create, convert, and modify Boost graphs directly from Python. However, for the moment, we will implement all algorithms using the current interface, which already provides genericity, and which has no drawback if the only goal is to "steal" algorithms from Boost.

As a test, we have computed the edge connectivity of a graph through Boost: the code is available in ticket 18564 [6]. Since the algorithm provided by Sagemath is not optimal (it is based on linear programming), the difference in the running time is impressive, as shown by the following tests:

sage: G = graphs.RandomGNM(100,1000)
sage: %timeit G.edge_connectivity()
100 loops, best of 3: 1.42 ms per loop
sage: %timeit G.edge_connectivity(implementation="sage")
1 loops, best of 3: 11.3 s per loop

sage: G = graphs.RandomBarabasiAlbert(300,3)
sage: %timeit G.edge_connectivity(implementation="sage")
1 loops, best of 3: 9.96 s per loop
sage: %timeit G.edge_connectivity()
100 loops, best of 3: 3.33 ms per loop

Basically, on a random Erdos-Renyi graph with 100 vertices and 1000 edges, the new algorithm is 8,000 times faster, and on a random Barabasi-Albert graph with 300 nodes and average degree 3, the new algorithm is 3,000 times faster! This way, we can compute the edge connectivity of much bigger graphs, like a random Erdos-Renyi graph with 5,000 vertices and 50,000 edges:

sage: G = graphs.RandomGNM(5,000, 50,000)
sage: %timeit G.edge_connectivity()
1 loops, best of 3: 16.2 s per loop

The results obtained with this first algorithm are very promising: in the next days, we plan to interface several other algorithms, in order to improve both the number of available routines and the speed of Sagemath graph library!


No comments:

Post a Comment