Electrical Engineering and Computer Science


Software Seminar & AI Seminar

Apache SystemML: Declarative Machine Learning for Low-Latency to Large-Scale Deployments

Matthias Boehm


IBM Research - Almaden
 
Tuesday, December 05, 2017
3:00pm - 4:30pm
BBB 4901

About the Event

Declarative machine learning (ML) aims to simplify the development and usage of large-scale ML algorithms. In SystemML, data scientists specify ML algorithms in a high-level language with R-like syntax, and the system automatically generates hybrid execution plans that combine single-node, in-memory operations with distributed operations on Spark. In the first part, we motivate declarative ML and provide an up-to-date overview of SystemML, its compiler and runtime, and its APIs for different deployments. In the second part, we discuss selected research results for large-scale ML, specifically compressed linear algebra (CLA) and automatic operator fusion. CLA aims to fit larger datasets into available memory by applying lightweight database compression schemes to matrices and executing linear algebra operations directly on the compressed representations. In contrast, automatic operator fusion aims to avoid unnecessary intermediates and scans, and to exploit sparsity, by optimizing fusion plans and generating code for the fused operators. Together, CLA and automatic operator fusion achieve significant end-to-end improvements because they address orthogonal bottlenecks of large-scale ML algorithms.
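To make the CLA idea concrete, here is a minimal Python sketch of a matrix-vector product that runs directly on a compressed matrix. It is an illustration only, not SystemML's implementation: the encoding (each column stored as a map from distinct nonzero value to the rows where it occurs) and all function names are assumptions chosen for this example.

```python
from collections import defaultdict


def compress_column(column):
    """Dictionary-encode one column: distinct nonzero value -> row indices.

    Hypothetical encoding for illustration; zeros are not stored.
    """
    offsets = defaultdict(list)
    for row, value in enumerate(column):
        if value != 0:
            offsets[value].append(row)
    return dict(offsets)


def compress_matrix(rows):
    """Compress a dense row-major matrix column by column."""
    n_cols = len(rows[0])
    return [compress_column([r[j] for r in rows]) for j in range(n_cols)]


def matvec_compressed(columns, n_rows, v):
    """Compute X @ v without decompressing X."""
    result = [0.0] * n_rows
    for j, column in enumerate(columns):
        for value, row_ids in column.items():
            scaled = value * v[j]  # one multiplication per distinct value
            for row in row_ids:
                result[row] += scaled
    return result


def matvec_dense(rows, v):
    """Reference implementation on the uncompressed matrix."""
    return [sum(x * w for x, w in zip(r, v)) for r in rows]


X = [[7.0, 0.0], [7.0, 2.0], [3.0, 2.0], [7.0, 0.0]]
v = [1.0, 10.0]
compressed = compress_matrix(X)
assert matvec_compressed(compressed, len(X), v) == matvec_dense(X, v)
```

In this sketch, a column with few distinct values needs one multiplication per distinct value instead of one per row, which hints at why operating on the compressed form can save both memory and work.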

Biography

Matthias Boehm is a Research Staff Member at IBM Research - Almaden, where he has worked since 2012 on optimization and runtime techniques for declarative, large-scale machine learning in SystemML. Since Apache SystemML's open source release in 2015, he has also served as a PMC member. He received his Ph.D. from TU Dresden in 2011 with a dissertation on cost-based optimization of integration flows, under the supervision of Wolfgang Lehner. His previous research also includes systems support for time series forecasting as well as in-memory indexing and query processing. Matthias is a recipient of the 2016 VLDB Best Paper Award and a 2016 SIGMOD Research Highlight Award.

Additional Information

Sponsor(s): CSE

Faculty Sponsor: Barzan Mozafari

Open to: Public