Yuriy Brun's Research

Current Projects:

Past Projects:

Other Projects:


Crystal: Proactive Detection of Collaboration Conflicts

Whenever developers program, they are constantly faced with a barrage of decisions: should I fix a compilation error, perform a refactoring, write more tests, merge my changes with those of other developers on my team, etc. Developers typically leverage intuition to make these decisions, and are often quite good. Nevertheless, even when developers choose the “right” goal, there are typically several possible approaches to address that goal and developers must, again, rely on intuition and experience to select the approach to pursue. My belief is that IDEs can leverage unused cycles to speculate about the decisions and approaches the develops may pursue, execute the relevant operations, perform analysis on the outcomes, and deliver to the developers relevant information, pertinent specifically to their context, about the outcomes of pursuing those decisions and approaches. Crystal is a tool I built to improve developer awareness of the interaction between their work and that of their collaborators. Such developers often encounter unexpected conflicts between their work. While conflicts, in some cases, are unavoidable, letting developers know earlier about the existence of conflicts and their potential impact can reduce effort required to resolve them. Crystal speculates code merges and lets developers know the instant they create a conflict, (1) allowing them to resolve conflicts while the relevant information is fresh in their minds, and (2) increasing their confidence in their changes that will not adversely affect others.

For more information on this project, as well as to download Crystal, see the Crystal page.

The best publication to read on the project is Proactive Detection of Collaboration Conflicts.

Collaborators: Reid Holmes, Michael D. Ernst, and David Notkin.


Synoptic: Summarizing System Logs with Refinement

To help developers understand existing systems' behavior, a tool called Synoptic automatically mines formal models of system executions from logs. Synoptic aids debugging and verification by describing the executions' key properties. We showed a developer a Synoptic model of over 900,000 executions of a real-world distributed system and within five minutes, the developer: (1) identified a concurrency bug and (2) verified the presence of an existing bug, discovering it occurred in more ways than expected.

For more information on this project, as well as to download Synoptic, see the Synoptic page.

The best publication to read on the project is Leveraging Existing Instrumentation to Automatically Infer Invariant-Constrained Models.

Primary collaborators: Ivan Beschastnikh, Michael D. Ernst, Arvind Krishnamurthy, and Thomas E. Anderson


Reliability through Smart Redundancy

Today's engineered systems require a design-time analysis of possible failures and an encoding of detection mechanisms and adaptation strategies the systems can employ when failures occur. This phenomenon requires the engineer to make a number of decisions about the system at design time. However, better decisions can, in most cases, be made at runtime because strictly more information about the environment and system state is available.

Software systems often use large numbers of unreliable components to perform computations or store data. Redundancy allows for these systems to be reliable. I developed iterative redundancy, a technique that leverages information only available at runtime to increase the reliability of the system. When the reliability of a certain task is low — iterative redundancy injects extra resources; when the reliability is high — iterative redundancy withdraws resources to use elsewhere. Because of the wealth of runtime information, iterative redundancy is provably optimal with respect to the resources it uses. That is, systems employing iterative redundancy are guaranteed to either (1) achieve the maximal possible reliability for specified available resources, or (2) use an optimally minimal number of resources to achieve a specified system reliability target. Iterative redundancy is optimal and cannot be outperformed by other redundancy techniques: it is guaranteed to squeeze all the reliability redundancy techniques can get out of the resources.

The best publication to read on the project is Smart redundancy for distributed computation.

Collaborators: George Edwards, Jae young Bang, and Nenad Medvidović.


Privacy Preservation in Distributed Computation via sTile

In my Ph.D. dissertation work, I developed a nature-inspired technique for engineering software systems that distribute computation onto large untrusted networks, such as the Internet, in a trustworthy manner. sTile allows one to use a large network to solve a computational problem without telling the nodes on the network the algorithms and input to that computation. For example, one can use sTile to decode an encrypted message with the help of the Internet, without anyone on the Internet knowing the encryption and the encoded and decoded messages.

More information on this project is available on the sTile page.

The best publication to read on the project is Preserving Privacy in Distributed Systems.

Collaborator: Nenad Medvidović.


DNA Self-Assembly

For several years, I worked with Prof. Leonard Adleman on DNA self-assembly. I studied mathematical models of molecular interactions and used them to create nanoscale designs out of DNA. This work has incredible potential for bottom-up design for electronic components of the future.

The best publications to read on the project are (for theoretical work) Solving NP-Complete Problems in the Tile Assembly Model, and (for experimental work) Self-Assembly of DNA Double-Double Crossover Complexes into High-Density, Doubly Connected, Planar Structures.

Collaborators: Dustin Reishus, Manoj Gopalkrishnan, Bilal Shaw, Nickolas Chelyapov, and Leonard Adleman.


The Fault-Invariant Classifier

In completing my Masters degree at MIT, I worked on a tool to help software engineers find errors in programs.

Software engineers devote the majority of their project's time to debugging programs. Even after extensive testing, no guarantee can be provided that the software project actually follows its specification. The Fault Invariant Classifier is a technique that assists programmers in locating faults in code. It uses models of known faults to discover similar faults in other code. The novelty and power of the technique lie in the ability to base models on examples of code errors, rather than formal definitions, and extending those models to apply to different errors.

The Fault Invariant Classifier extracts properties of code with known errors, and classifies those properties as fault-revealing and non-fault-revealing. Fault-revealing properties appear in faulty code and also in correct code, whereas non-fault-revealing properties appear in all types of code. The Fault Invariant Classifier extracts features from the properties and defines each property as a multidimensional vector. Those vectors are used to train a machine learning model and later classify properties of user code.

As part of my research I have built an implementation of the Fault Invariant Classifier that uses Daikon to dynamically extract properties, or invariants, and two machine learning algorithms, C5 and SVMfu. C5 is an implementation of the decision tree algorithm and SVMfu is an implementation of the support vector machine algorithm.

The Fault Invariant Classifier is available as part of the Daikon invariant detector software.

The best publication to read on the project is Finding Latent Code Errors via Machine Learning Over Programs Executions.

Collaborator: Michael D. Ernst.