The growth of web services, sensor networks, and high-capacity storage devices has led to an explosion in the quantity and diversity of data suitable for data mining. Statistical techniques are critical for data mining, but they are not the only important part; our work also includes novel applications, software infrastructure for large-scale analytics, privacy preservation while mining data, and new interfaces that extend human capabilities to find patterns in data. Results so far include effective prediction of cardiac and epileptic events in medical patients, large-scale data extraction from the web, large efficiency gains in the Hadoop framework, and scalable algorithms for exploring and understanding massive graphs.
