This project resulted in a prototype system for wide-area communication that is reliable, timely, and efficient. This system is described in our ICDCS 2017 paper "Timely, Reliable, and Cost-Effective Internet Transport Service using Dissemination Graphs". The goal of this system was to be able to route data across long-distances (e.g., cross-country or trans-Atlantic) in a way that is both reliable and fast. Traditional approaches are either reliable but slow or fast but unreliable. We wanted to get the best of both worlds (fast and reliable) without incurring too much extra cost. We were particularly motivated by applications requiring haptic feedback (such as remote surgery), but our system is more broadly applicable to any setting where speed, reliability, and cost are all important factors.
Our main idea was to route data not just along paths, as in traditional approaches, but rather to use more complicated routes we call "dissemination graphs". By choosing the dissemination graph appropriately, we can provide extra reliability relatively cheaply by only reinforcing parts of the routes that are prone to experiencing failure, rather than equally reinforcing across the routes. This enables a high degree of reliability without costing too much.
Our prototype system is remarkably successful at achieving our goals. For example, on a real country-scale overlay network on which we test for a month, our system was able to close the "reliability gap" (the gap between the most reliable possible system and the least reliable system) by 99.81% at a cost of only 2.098 times the least reliable system, while achieving maximum reliability would have cost 15.75 times as much. In other words: by paying twice as much as a totally unreliable system, we got over 99% of the maximum reliability possible, which would have cost 15 times as much to achieve.
In addition to our system itself, the study of dissemination graphs led us to a set of important theoretical problems in algorithm design. Namely: how can we compute the best dissemination graphs? We designed extremely efficient algorithms for many cases (so-called "fixed-parameter tractable algorithms"), but also showed that there are some cases when such algorithms do not exist. For these cases we are forced to use suboptimal dissemination graphs. We also generalized these results, giving a complete characterization of what settings allow for efficient algorithms that find optimal dissemination graphs and which settings do not. Hence in the future we will be able to look at a particular setting and immediately know whether it is possible to design an efficient algorithm for finding dissemination graphs, or whether we will be forced to find suboptimal graphs.
Last Modified: 11/20/2019
Modified by: Michael Dinitz