University of Massachusetts Amherst

Marco Serafini - Democratizing Graph Analytics

DS Seminar
March 30
Computer Science Building, Room 150/151


Marco Serafini
Qatar Computing Research Institute

Title:  Democratizing Graph Analytics
 

Abstract:  Graphs are a natural and increasingly popular data representation in a large number of fields, from the Web, to advertising and biology, to metadata modeling. There is a rich and growing literature on algorithms for graph analytics and graph mining. However, handling graph data, building effective graph analytics pipelines, and selecting the right analysis algorithms still require specialized expertise that is rare among practitioners. One key element in making graph analytics more accessible is better systems support. Systems should have programming abstractions that simplify the implementation of graph analytics tasks and the interpretation of their results. This talk will present two efforts in this direction. The first is Arabesque, a system for distributed graph mining. Arabesque defines a high-level filter-process computational model that simplifies the development of scalable graph mining algorithms such as finding frequent subgraphs or cliques. Implementations on top of Arabesque require only a handful of lines of code, scale to trillions of subgraphs, and in some cases represent the first available distributed solutions. The talk will then present QGraph, a system for parallel graph search that distributes sequential graph search algorithms, balances load, and minimizes coordination. QGraph supports "heavy" searches that return a very large number of results, making graph search a viable filtering step in graph analytics pipelines.

 

Bio:  Marco Serafini is a Scientist at the Qatar Computing Research Institute (QCRI), where he develops programming abstractions and systems for scalable graph search, exploration, and mining. He also works on elasticity and load balancing for real-time distributed data management systems, as well as on distributed coordination. His work has appeared in venues such as SOSP, NSDI, VLDB, ICDE, DSN, and PODC. He serves or has served as a PC member of SOSP, VLDB, EuroSys, ICDE, ICDCS, and WWW, among others, and he co-chaired the PaPoC workshop, which is co-located with EuroSys. Before joining QCRI, he was with Yahoo! Research, where he worked on the ZooKeeper coordination system. Marco received his PhD from TU Darmstadt, Germany.

A reception will be held at 3:40pm in the atrium, outside the presentation room.