SCSFC

Robotics and AI are experiencing radical growth, fueled by innovations in data-driven learning paradigms coupled with novel device designs, in applications such as healthcare, manufacturing, and service robotics. The challenges of advanced robotics and intelligent decision-making cannot be addressed by developing new learning algorithms or designs in isolation. The aim of my research is to bridge this gap and enable generalization in robotic systems as we move towards human skill augmentation. I am interested in enabling learning from imprecise information for performing a range of tasks with independence and flexibility. I will develop new mathematical models for data-driven optimization, coupling the design of physical systems with algorithms and building new formalisms for learning-based autonomy.

In this talk, I will present three specific instances of my research in robotics for interventional healthcare, spanning three areas: algorithmic design, optimization, and learning from demonstrations. I will first describe how algorithmic design can improve personalized delivery of radiation therapy and enable task-level autonomy in minimally invasive surgery. Then I will discuss an algorithmic paradigm for enabling complex multi-step sequential tasks. Finally, I will discuss motivations and methods for transferring learned controllers across dynamics models, from simulation to real robots. While the algorithms and techniques introduced are applicable across domains in robotics, in this talk I will exemplify these ideas through my work on medical robotics.

Animesh Garg is a Postdoctoral Researcher at the Stanford University AI Lab. Animesh is interested in problems at the intersection of optimization, machine learning, and design. He studies the interaction of data-driven learning for autonomy and design for automation for human skill augmentation and decision support. Animesh received his Ph.D. from the University of California, Berkeley, where he was a part of the Berkeley AI Research center and the Automation Science Lab. His research has been recognized with the Best Applications Paper Award at IEEE CASE, Best Video at the Hamlyn Symposium on Surgical Robotics, and a Best Paper Nomination at IEEE ICRA 2015. His work has also been featured in press outlets such as the New York Times, UC Health, UC CITRIS News, and BBC Click.

Software is becoming increasingly embedded into our lives, and the potential cost of a failure is greater than ever. At the same time, as systems become more complex, our ability to predict and control their behavior seems to be lagging behind. Testing alone provides a limited level of confidence, and achieving a formal proof of correctness remains too costly to be applied to an entire system. Are there ways to design systems to be safe and secure from early on, and to reduce the cost of validation?

In this talk, I will argue that building a secure system requires a particular mindset that is fundamentally different from how we typically construct complex software. I will then present two approaches to aid developers in designing a system with a security mindset. First, I will describe a technique called multi-representational security analysis, which can be used to anticipate and address security attacks that exploit details across multiple abstraction layers of a system. Second, I will present a framework for reasoning about the overall safety of a system when some of its components may be compromised, and demonstrate its application to an analysis of an industrial water treatment plant. Finally, I will share some future directions for designing safe and secure systems that operate in an increasingly open, evolving environment.

Eunsuk Kang is a postdoctoral researcher at the University of California, Berkeley and the University of Michigan. He received his PhD in computer science from MIT. Eunsuk's research interests lie in software engineering, with applications to system safety and security. He is especially interested in leveraging software modeling, design methodologies, and automated verification to construct dependable systems. He has received two ACM Distinguished Paper Awards (FSE 2016, ICSE 2015). Previously, he completed undergraduate studies in software engineering at the University of Waterloo.

Faculty Host: David Garlan
Institute for Software Research

A popular approach in data analysis is to represent a dataset in a high-dimensional feature space, and reduce a given task to a geometric computational problem. However, most of the classic geometric algorithms scale poorly as the dimension grows and are typically not applicable to the high-dimensional regime. This necessitates the development of new algorithmic approaches that overcome this "curse of dimensionality." In this talk I will give an overview of my work in this area.

  • I will describe new algorithms for the high-dimensional Nearest Neighbor Search (NNS) problem, where the goal is to preprocess a dataset so that, given a query object, one can quickly retrieve one or several closest objects from the dataset. Our algorithms improve, for the first time, upon the popular Locality-Sensitive Hashing (LSH) approach (a minimal LSH baseline is sketched below).
  • Next, I will show how to make several algorithmic ideas underlying the above theoretical results practical. This yields an implementation which is competitive with state-of-the-art heuristics for the NNS problem. The implementation is part of FALCONN, a new open-source library for similarity search.
  • Finally, I will talk about the limits of distance-preserving sketching, i.e., compression methods that approximately preserve the distances between high-dimensional vectors. In particular, we will show that for the well-studied Earth Mover's Distance (EMD) such efficient compression is impossible.

The common theme that unifies these and other algorithms for high-dimensional data is the use of "efficient data representations": randomized hashing, sketching, dimensionality reduction, metric embeddings, and others. One goal of the talk is to describe a holistic view of all of these techniques.
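For readers unfamiliar with LSH, the following is a minimal Python sketch of the classic data-independent random-hyperplane scheme for angular similarity, with hypothetical parameters. The algorithms in the talk improve upon this family (e.g., via data-dependent hashing), which the sketch does not attempt to implement.

```python
# A minimal sketch of nearest neighbor search with random-hyperplane LSH.
# Parameters and class names are illustrative, not from the talk or FALCONN.
import numpy as np

class HyperplaneLSH:
    def __init__(self, dim, num_bits, seed=0):
        rng = np.random.default_rng(seed)
        # Each row is a random hyperplane; a point's sign pattern across
        # all hyperplanes determines its hash bucket.
        self.planes = rng.standard_normal((num_bits, dim))
        self.buckets = {}

    def _key(self, x):
        return tuple((self.planes @ x) > 0)

    def index(self, points):
        self.points = points
        for i, x in enumerate(points):
            self.buckets.setdefault(self._key(x), []).append(i)

    def query(self, q):
        # Scan only the candidates that share q's bucket, ranked by distance.
        cands = self.buckets.get(self._key(q), [])
        if not cands:
            return None
        return min(cands, key=lambda i: np.linalg.norm(self.points[i] - q))

rng = np.random.default_rng(1)
data = rng.standard_normal((1000, 64))
lsh = HyperplaneLSH(dim=64, num_bits=12)
lsh.index(data)
print(lsh.query(data[42] + 0.01 * rng.standard_normal(64)))  # likely 42
```

In practice one would use several hash tables and multi-probe queries to trade space for recall; the point here is only the hash-then-scan structure.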

Ilya Razenshteyn is a graduate student in the Theory of Computation group at MIT CSAIL, advised by Piotr Indyk. He is broadly interested in the theory of algorithms for massive data, with a bias towards algorithms that have the potential to be useful in applications. More specific interests include similarity search, sketching, metric embeddings, high-dimensional geometry, streaming algorithms, and compressed sensing. Ilya graduated with an M.Sc. in Mathematics from Moscow State University in 2012. His awards include the Akamai Presidential Fellowship and the Simons Foundation Junior Fellowship.

Faculty Host: Anupam Gupta

Building reliable, scalable, and secure software systems is essential as software pervades all aspects of our lives. Existing testing and static analysis approaches are not well positioned to detect subtle defects while retaining high accuracy and satisfactory coverage. To overcome these limitations, my current research explores the design of automated, effective, scalable, and usable dynamic analysis engines.

In this talk, I will discuss our recent work on the challenging problem of detecting concurrency bugs in multithreaded libraries. The intricacies of detecting such bugs lie in identifying the methods that need to be invoked concurrently, the inputs passed to these methods, and the thread interleavings that cause the erroneous behavior. We observe that the success of existing dynamic analyses depends critically on the availability of effective multithreaded clients. Without a priori knowledge of the defects, manually constructing defect-revealing multithreaded clients is non-trivial.

We propose approaches for the directed, automated generation of effective multithreaded clients. Our techniques analyze the execution traces obtained from running random sequential clients and produce concurrent client programs that drive shared objects, via library method calls, to states conducive to triggering concurrency bugs. Our evaluations demonstrate that our approaches are effective in generating clients that helped expose more than 300 concurrency bugs, including many previously unknown bugs in JDK classes that were fixed based on our reports.
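As a rough illustration of why defect-revealing clients matter, here is a minimal, hypothetical stress harness in Python: it spawns threads that invoke random methods on a shared object and records unexpected exceptions. Unlike this blind sampler, the actual technique targets Java libraries and uses sequential execution traces to direct generation toward bug-prone states.

```python
# A minimal sketch of a randomly generated multithreaded client harness.
# All names are illustrative; a CPython dict under the GIL is unlikely to
# fail here, so real targets would be richer library classes (e.g., in Java).
import random
import threading

def make_client(obj, methods, ops, errors):
    def client():
        for _ in range(ops):
            name, make_arg = random.choice(methods)
            try:
                getattr(obj, name)(make_arg())
            except Exception as exc:  # an unexpected exception may reveal a bug
                errors.append((name, exc))
    return client

def stress(obj_factory, methods, threads=8, ops=100, trials=50):
    for _ in range(trials):
        obj, errors = obj_factory(), []
        workers = [threading.Thread(target=make_client(obj, methods, ops, errors))
                   for _ in range(threads)]
        for w in workers: w.start()
        for w in workers: w.join()
        if errors:
            return errors  # a defect-revealing schedule was found
    return []

class Table:  # a toy shared object standing in for a library class
    def __init__(self): self.d = {}
    def put(self, k): self.d[k] = k
    def drop(self, k): self.d.pop(k, None)

print(stress(Table, [("put", lambda: random.randrange(10)),
                     ("drop", lambda: random.randrange(10))]))
```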

Murali Krishna Ramanathan has been an Assistant Professor in the Department of Computer Science and Automation at IISc, Bangalore since November 2012. Previously, he was a member of the analysis team at Coverity and helped build software bug detection tools that are widely used in the software industry. He obtained his PhD in Computer Science from Purdue University, USA. His research interests are in software engineering and programming languages.

He is a recipient of the ACM SIGSOFT ISSTA 2016 Distinguished Paper Award, a Google Faculty Research Award in 2015, a Coverity Engineering Award in 2009, and the Halstead Award for Excellence in Software Engineering Research at Purdue in 2006. He is also the winner of the Burton D. Morgan Entrepreneurship Competition in 2008.

Faculty Host: Claire Le Goues
Institute for Software Research

The proliferation of datacenters, smartphones, personal sensing and tracking devices, and home automation products is fundamentally changing the applications we interact with daily.  Today's popular user applications are no longer limited to a single desktop computer but now commonly span many mobile devices and cloud servers. As a result, existing systems often do not meet the needs of modern mobile/cloud applications.  In this talk, I will present three systems designed to tackle the challenges of mobile/cloud applications: Sapphire, Diamond and TAPIR.  These systems represent a first step towards better end-to-end support for mobile/cloud applications in runtime management, data management, and distributed transactional storage. Together, they significantly simplify the development of new mobile/cloud applications.

Irene Zhang is a fifth-year PhD student at the University of Washington. She works with Hank Levy and Arvind Krishnamurthy in the Computer Systems Lab. Her current research focuses on systems for large-scale, distributed applications, including distributed runtime systems and transactional storage systems. Before starting her PhD, Irene worked for three years at VMware in the virtual machine monitor group. Irene received her S.B. and M.Eng. in computer science from MIT, where she worked with Frans Kaashoek in the Parallel and Distributed Operating Systems Lab. Irene is supported by NSF and MSR fellowships.

Faculty Host: Mahadev Satyanarayanan
Computer Science

Natural language processing (NLP) aims to teach computers to understand human language. NLP has enabled some of the most visible applications of artificial intelligence, including Google search, IBM Watson, and Apple’s Siri. As AI is applied to increasingly complex domains such as health care, education, and government, NLP will play a crucial role in allowing computational systems to access the vast amount of human knowledge documented in the form of unstructured speech and text.

In this talk, I will discuss my work on training computers to make inferences about what is true or false based on information expressed in natural language. My approach combines machine learning with insights from formal linguistics in order to build data-driven models of semantics that are more precise and interpretable than would be possible using linguistically naive approaches. I will begin with my work on automatically adding semantic annotations to the 100 million phrase pairs in the Paraphrase Database (PPDB). These annotations provide the type of information necessary for carrying out precise inferences in natural language, transforming the database into the largest available lexical semantics resource for natural language processing. I will then turn to the problem of compositional entailment, and present an algorithm for performing inferences about long phrases which are unlikely to have been observed in data. Finally, I will discuss my current work on pragmatic reasoning: when and how humans derive meaning from a sentence beyond what is literally contained in the words. I will describe the difficulties that such "common-sense" inference poses for automatic language understanding, and present my ongoing work on models for overcoming these challenges.
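To make the notion of entailment annotations concrete, here is a minimal, hypothetical sketch in the spirit of natural logic: phrase pairs carry entailment relations (as PPDB-style annotations would), and a hypothesis follows from a premise when every word-level substitution preserves truth in an upward-monotone context. The lexicon and relation set are toy stand-ins, not the actual resource or the algorithm from the talk.

```python
# A toy sketch of substitution-based lexical entailment. Replacing a word
# with one it forward-entails (e.g., a hypernym) preserves truth in an
# upward-monotone sentence. Lexicon entries are illustrative examples.
FORWARD, REVERSE, INDEPENDENT = "=>", "<=", "#"

LEXICON = {  # (word, word) -> entailment relation
    ("dog", "animal"): FORWARD,      # dog => animal
    ("animal", "dog"): REVERSE,
    ("sprinted", "ran"): FORWARD,    # sprinted => ran
    ("ran", "sprinted"): REVERSE,
}

def entails(premise, hypothesis):
    """True if hypothesis follows from premise by word-for-word substitution."""
    p, h = premise.split(), hypothesis.split()
    if len(p) != len(h):
        return False  # this toy version only handles aligned sentences
    relations = [LEXICON.get((a, b), FORWARD if a == b else INDEPENDENT)
                 for a, b in zip(p, h)]
    # In an upward-monotone context, every substitution must be identity
    # or forward entailment for the inference to go through.
    return all(r == FORWARD for r in relations)

print(entails("the dog sprinted", "the animal ran"))  # True
print(entails("the animal ran", "the dog sprinted"))  # False
```

Downward-monotone contexts (negation, "every ...") flip which relations are admissible, which is part of what makes compositional and pragmatic inference hard.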

Ellie Pavlick is a PhD student at the University of Pennsylvania, advised by Dr. Chris Callison-Burch. Her dissertation focuses on natural language inference and entailment. Outside of her dissertation research, Ellie has published work on stylistic variation in paraphrase (e.g., how paraphrases can affect the formality or complexity of language) and on applications of crowdsourcing to natural language processing and social science problems. She has been involved in the design and instruction of Penn's first undergraduate course on Crowdsourcing and Human Computation (NETS 213). Ellie is a 2016 Facebook PhD Fellow, and has interned at Google Research, Yahoo Labs, and the Allen Institute for Artificial Intelligence.

Faculty Host: Eric Nyberg
Language Technologies

Cancer is a genetic disease, where cell division, mutation and selection produce a tumor composed of different subpopulations of cells with different complements of mutations. In the final stages of cancer progression, cancerous cells from the primary tumor migrate and seed metastases at distant anatomical sites. The cell division and mutation history of an individual tumor can be represented by a phylogenetic tree, helping guide patient-specific treatments.

In this talk, I will introduce combinatorial algorithms for constructing tumor phylogenies from bulk sequencing data, where the measurements are a mixture of thousands of cells. These algorithms are based on a combinatorial characterization of phylogenetic trees as a restricted class of spanning trees in a graph, a characterization that also demonstrates the computational complexity of the problem.
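To make the characterization concrete, here is a minimal sketch of one necessary condition of this kind, often called a sum condition: in every bulk sample, a mutation's observed frequency must be at least the summed frequencies of its children in the candidate tree. The function and data are illustrative, assuming mutation frequencies have already been estimated from the sequencing reads.

```python
# A minimal sketch of the sum condition on a candidate mutation tree.
# children: parent -> list of child mutations (a rooted tree);
# freq: mutation -> list of frequencies, one entry per bulk sample.
def satisfies_sum_condition(children, freq):
    for parent, kids in children.items():
        for s in range(len(freq[parent])):
            if sum(freq[k][s] for k in kids) > freq[parent][s] + 1e-9:
                return False  # descendant cells cannot outnumber their ancestor's
    return True

# Two samples; mutation A is clonal, B and C are subclones.
freq = {"A": [1.0, 1.0], "B": [0.6, 0.2], "C": [0.3, 0.5]}
print(satisfies_sum_condition({"A": ["B", "C"]}, freq))         # True: siblings fit under A
print(satisfies_sum_condition({"A": ["B"], "B": ["C"]}, freq))  # False: C exceeds B in sample 2
```

Enumerating the trees that pass such constraints is what reduces phylogeny inference to a search over a restricted class of spanning trees.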

In addition, I will introduce a novel theoretical framework for analyzing the history of migrations of cells between anatomical sites in metastatic cancers.

Using these methods, I analyze several cancers and identify tumor phylogenies and migration histories that are more biologically plausible than previously reported analyses.

Mohammed El-Kebir is a postdoctoral research associate in the group of Ben Raphael at Princeton University. Mohammed received his PhD in Computer Science at CWI and VU University Amsterdam under the direction of Gunnar Klau and Jaap Heringa. He received the BioSB Young Investigator Award for his PhD thesis. In his research, he applies a wide range of combinatorial optimization techniques to various questions in biology. Mohammed's current research focuses on tumor phylogenetics, where he develops algorithms for studying the progression of tumors.

Faculty Host: Russell Schwartz
Computational Biology

Agile micro aerial vehicles will have a massive societal impact over the next decades, creating novel opportunities for large-scale precision agriculture, fast delivery of medical supplies, and disaster response, and providing new perspectives on environmental monitoring and artificial pollination. This future requires the design of lightweight and robust perception algorithms, which interpret sensor data into a coherent world representation, in the face of measurement noise and outliers, and enable on-board situational awareness and decision-making.

In this talk, I present my work on lightweight and robust robot perception. I start by discussing an algorithm for fast visual-inertial navigation, which estimates the motion of the robot from visual and inertial cues, and demonstrate its use for agile flight of micro aerial vehicles. Then, I focus on robustness and show that fundamental insights from optimization and Riemannian geometry lead to the design of estimation techniques that are provably robust to large noise, providing the first certifiably correct algorithm for localization and mapping. I also discuss the challenges connected to scaling down perception to nano and pico aerial vehicles, where sensing and computation are subject to strict payload and power constraints. I argue that enabling autonomy on miniaturized platforms requires a paradigm shift in perception, sensing, and communication, and discuss how we can draw inspiration from nature in designing the next generation of flying robots.
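To give a flavor of what robustness to large noise means in estimation, below is a minimal sketch of a Huber-loss estimator solved by iteratively reweighted least squares (IRLS) on a toy one-dimensional problem, showing how outlier measurements get down-weighted. The certifiably correct localization-and-mapping results mentioned above rest on far more sophisticated machinery (e.g., convex relaxations), which this example does not attempt to reproduce.

```python
# A minimal sketch of robust estimation with a Huber loss via IRLS.
# The task: estimate a scalar location from measurements with gross outliers.
import numpy as np

def huber_irls(measurements, delta=1.0, iters=50):
    x = np.median(measurements)  # robust initialization
    for _ in range(iters):
        r = measurements - x
        # Huber weights: 1 near zero residual, delta/|r| (down-weighted) in the tails.
        w = np.minimum(1.0, delta / np.maximum(np.abs(r), 1e-12))
        x = np.sum(w * measurements) / np.sum(w)
    return x

rng = np.random.default_rng(0)
inliers = 5.0 + 0.1 * rng.standard_normal(50)
data = np.concatenate([inliers, [50.0, -40.0]])  # two gross outliers
print(np.mean(data), huber_irls(data))  # the mean is dragged off; IRLS stays near 5
```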

Luca Carlone is a research scientist in the Laboratory for Information and Decision Systems at the Massachusetts Institute of Technology. Before joining MIT, he was a postdoctoral fellow at Georgia Tech (2013-2015) and a visiting researcher at the University of California, Santa Barbara (2011). He received his Ph.D. from the Polytechnic University of Turin, Italy, in 2012. His research interests include nonlinear estimation, numerical and distributed optimization, computer vision, and probabilistic inference applied to sensing, perception, and control of single- and multi-robot systems. He has published more than 60 papers in international journals and conferences, including a best paper award finalist at RSS 2015 and a best paper award winner at WAFR 2016.

Faculty Host: Nathan Michael
Robotics

Modern programming languages safeguard developers from many typical errors, yet more subtle errors, such as violations of security policies, still plague software. Program synthesis has the potential to eliminate such errors by generating executable code from concise and intuitive high-level specifications. Traditionally, program synthesis has failed to scale to specifications that encode complex behavioral properties of software: these properties are notoriously hard to check even for a given program, so it is not surprising that finding the right program within a large space of candidates has been considered very challenging. My work tackles this challenge through the design of synthesis-friendly program verification mechanisms, which are able to check a large set of candidate programs against a complex specification at once, thereby efficiently pruning the search space.

Based on this principle, I developed Synquid, a program synthesizer that accepts specifications in the form of expressive types and uses a specialized type checker as its underlying verification mechanism. Synquid is the first synthesizer powerful enough to automatically discover provably correct implementations of complex data structure manipulations, such as insertion into red-black trees and AVL trees, and normal-form transformations on propositional formulas. Each of these programs is synthesized in under a minute. Going beyond textbook algorithms, I created a language called Lifty, which uses type-driven synthesis to automatically rewrite programs that violate information flow policies. In our case study, Lifty was able to enforce all required policies in a prototype conference management system.
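The search-and-prune structure behind such synthesizers can be sketched in a few lines. The toy Python below enumerates arithmetic expressions and returns the first one consistent with a specification, using input-output examples as a stand-in verifier; Synquid instead prunes whole regions of the candidate space with a refinement type checker against a full specification. Nothing here is Synquid's actual algorithm or syntax.

```python
# A toy enumerate-and-verify synthesizer over expressions in x.
# Candidates are built bottom-up; the "verifier" is a set of I/O examples.
import itertools

LEAVES = ["x", "1"]
OPS = [("+", lambda a, b: a + b), ("*", lambda a, b: a * b)]

def expressions(depth):
    if depth == 0:
        for leaf in LEAVES:
            yield leaf, (lambda x, l=leaf: x if l == "x" else 1)
        return
    yield from expressions(depth - 1)  # smaller programs first
    for (ls, lf), (rs, rf) in itertools.product(expressions(depth - 1), repeat=2):
        for sym, op in OPS:
            yield (f"({ls} {sym} {rs})",
                   lambda x, lf=lf, rf=rf, op=op: op(lf(x), rf(x)))

def synthesize(spec, max_depth=3):
    for text, fn in expressions(max_depth):
        if all(fn(x) == y for x, y in spec):  # reject any spec-violating candidate
            return text
    return None

# Ask for a program matching x**2 + 1 on three examples.
print(synthesize([(0, 1), (1, 2), (3, 10)]))  # e.g. (1 + (x * x))
```

The blow-up of this naive search as depth grows is exactly why verification mechanisms that rule out many candidates at once matter.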

Nadia Polikarpova is a postdoctoral researcher at the MIT Computer Science and Artificial Intelligence Lab, interested in helping programmers build secure and reliable software. She completed her PhD at ETH Zurich. For her dissertation she developed tools and techniques for automated formal verification of object-oriented libraries, and created the first fully verified general-purpose container library, receiving the Best Paper Award at the International Symposium on Formal Methods. During her doctoral studies, Nadia was an intern at MSR Redmond, where she worked on verifying real-world implementations of security protocols. At MIT, Nadia has been applying formal verification to automate various critical and error-prone programming tasks.

Faculty Host: Jonathan Aldrich
Institute for Software Research

Web applications are affected by a wide array of security vulnerabilities. Of these, logical flaws in particular are good candidates for detection via model checking. However, approaches to analyze potential attacks often rely on manually constructed models. This limits the wider practical applicability of existing formal methods tools.

In this talk, I will present a step toward bridging this gap by analyzing web application code (specifically, Java servlets) and extracting state machine models for use by model checkers that target security properties. I will discuss suitable abstractions, which are essential to limit the complexity of the generated models; these include user-specified abstractions that depend on the provided specifications. We have used this tool within the SPaCIoS EU research project to find vulnerabilities using an approach that can be combined with security testing and model inference. A promising future step is to leverage these models to detect more complex chained attacks.
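As a minimal illustration of checking a security property on an extracted state machine, the Python sketch below searches the reachable states of a toy login model for a policy violation. The model, actions, and property are hypothetical stand-ins for what would be extracted from servlet code; a real model checker handles far richer properties than this reachability query.

```python
# A toy reachability check on a state machine model of a web application.
from collections import deque

TRANSITIONS = {  # state -> {action: next state} (hypothetical extracted model)
    "anonymous": {"login": "authenticated", "view_public": "anonymous"},
    "authenticated": {"view_account": "authenticated", "logout": "anonymous"},
}
BAD = {("anonymous", "view_account")}  # policy: no account access before login

def check(initial="anonymous"):
    seen, queue = {initial}, deque([(initial, [])])
    while queue:
        state, trace = queue.popleft()
        for action, nxt in TRANSITIONS[state].items():
            if (state, action) in BAD:
                return trace + [action]  # counterexample trace to the violation
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, trace + [action]))
    return None  # no reachable state enables a forbidden action

print(check())  # None: this model never allows view_account while anonymous
```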

Marius Minea is an associate professor at the Politehnica University of Timisoara, Romania. He received his PhD from Carnegie Mellon with a thesis on model checking for timed systems, advised by Ed Clarke, and was then a postdoctoral researcher at the University of California, Berkeley. His research interests are at the intersection of software analysis, testing, and security, where he has led several research projects.

He has taught courses on verification and security, enjoys using live coding in programming classes and has designed a discrete structures course that uses functional programming.

Faculty Host: Tom Cortina
