Rigel: Fast and Scalable Analysis of Massive Social Graphs

SAND Lab @ UC Santa Barbara


Updates

February 01, 2014: Bug fix for C++ rigel.

November 2013: All code packages have been updated with a LICENSE file for BSD3.

Sep 30, 2013: The C++ version of Leo has been released!

August 2013: Our paper on asymmetric graph embeddings got accepted at VLDB 2014!

May 5, 2013: v0.1.0-RC2 of the JVM implementation of GCS released!

May 1, 2013: v0.1.0-RC1 of the JVM implementation of GCS released!

March 5, 2013: New version of Rigel released!

May 18, 2012: New version of Rigel released!

March 23, 2012: New website up and running + new funding courtesy of DARPA and HP Labs!

Dec 14, 2012: Example code is now available. Please visit the downloads section for the code.




graph

About

Analysis of large networks is a critical component of many of today's application environments, including online social networks, protein interactions in biological networks, and Internet traffic analysis. The arrival of massive network graphs with hundreds of millions of nodes, e.g. social graphs, presents a unique challenge to graph analysis applications. Most of these applications rely on computing distances between node pairs, which for large graphs can take minutes to compute using traditional algorithms such as breadth-first-search (BFS).

In this project, we study ways to enable scalable graph processing for today's massive networks. We explore the design space of graph coordinate systems, a new approach that accurately approximates node distances in constant time by embedding graphs into coordinate spaces. We show that a hyperbolic embedding produces relatively low distortion error, and propose Rigel, a hyperbolic graph coordinate system that lends itself to efficient parallelization across a compute cluster. Rigel produces significantly more accurate results than prior systems, and is naturally parallelizable across compute clusters, allowing it to provide accurate results for graphs up to 43 million nodes. Finally we show that Rigel's functionality can be easily extended to locate (near-) shortest paths between node pairs. After a one-time preprocessing cost, Rigel answers node-distance queries in 10's of microseconds, and also produces shortest path results up to 18 times faster than prior shortest-path systems with similar levels of accuracy.

Download

Leo (C++)

Leo is implemented in C++ - we suggest you compile using the GCC/G++ compiler. A Makefile is included that uses G++. Download links are below - documentation is included in the zip file.

Leo Source Code

Orion and Rigel (Scala/JVM)

Scala GCS is an implementation of both Orion and Rigel in the Scala programming language. It provides a Java interface using Java data structures. Since Scala is a JVM language and at the end of the day everything compiles down into a JAR, any JVM language may use this as well.

Latest version: v0.1.0-RC2
Please send all bug reports to: adelbertc at gmail dot com.

Please view changelogs on GitHub.

Scaladoc API
Download JAR
Source (GitHub)

Rigel (C++)

Rigel is implemented in C++ - we suggest you compile using the GCC/G++ compiler. A Makefile is included that uses G++. Download links are below - documentation is included in the zip file.

Rigel Source Code
Documentation

Publications