Adam M. Costello, advisor
Randy H. Katz
Computer Science Division
University of California at Berkeley
In the course of working on the Tangle project, the need arose to simulate end-to-end network delays. Since communication in Tangle is connectionless, intermittent, and very low-bandwidth, there was no need to simulate the network topology; all I needed was the end-to-end delays. I looked around for a model, but could not find one that people had confidence in.
I did, however, find a tool for measuring network delays: king, by Krishna P. Gummadi et al. It measures the round-trip time (RTT) between DNS servers. Since most DNS servers are willing to perform recursive queries, it is possible to measure the RTT between two remote points; there is no need to install instrumentation on machines scattered around the internet.
This suggested a solution to my problem: Choose N points at random on the internet, measure the RTT between every pair, and mimic those RTTs (“ape” them) in the simulation.
Actually, I wanted more than a single number for each pair; I wanted a probability distribution. That could be achieved by collecting many sample RTTs for each pair, at different times. These can be sorted, filtered, and interpolated into a smooth cumulative distribution function, which can be used to generate random delays. Since the communication is intermittent, we can neglect time dependence and generate each delay independently.
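As a rough sketch of that idea (all names here are hypothetical, not the project's actual code), one could build an empirical CDF from the sorted samples for a pair and draw random delays by interpolating between adjacent order statistics:

```python
import random

def make_delay_sampler(samples):
    """Build a random-delay generator from raw RTT samples for one node pair.

    The sorted samples form an empirical CDF; each draw picks a uniform
    position along that CDF and linearly interpolates between the two
    nearest sample values, yielding independent delays as in the text.
    """
    xs = sorted(samples)
    n = len(xs)

    def draw():
        u = random.random() * (n - 1)   # uniform position along the CDF
        i = int(u)
        frac = u - i
        if i + 1 >= n:                  # only reachable when n == 1
            return xs[-1]
        return xs[i] + frac * (xs[i + 1] - xs[i])

    return draw

# Hypothetical measured RTTs (ms) for one pair, taken at different times.
rtt_samples_ms = [12.1, 13.0, 12.4, 50.2, 12.8, 13.5]
draw = make_delay_sampler(rtt_samples_ms)
delay = draw()  # one simulated end-to-end delay, drawn independently
```

Any filtering or smoothing of the samples would happen before the sampler is built; this sketch interpolates the raw order statistics directly.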
But there was a scaling problem: Measuring every pair among N nodes would require O(N^2) time, and storing the data would require O(N^2) space. A simulation of N nodes, however, typically requires only O(N (log N)^k) time and space (that is true for Tangle), so this simple solution would quickly become the limiting factor of the simulation.
This problem was overcome by introducing one level of recursion. Let n = sqrt(N), and measure every pair among n points. Now the data takes O(n^2) = O(N) time to collect and O(n^2) = O(N) space to store. In the simulation, imagine that each of the n nodes is a gateway to a local region, which looks like a scaled-down copy of the whole n-node cloud. There are now (n)(n−1) = N (almost) simulated endpoints, and the total RTT between two endpoints is the sum of three segments (two local and one wide-area) unless the two endpoints happen to be in the same local region. The scale factor for region i is the ratio of the minimum to the maximum of the median RTTs between node i and all the other nodes, which is just enough to ensure that between any pair of endpoints neither local segment median is greater than the wide-area segment median.
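A minimal sketch of how those pieces might combine, assuming a symmetric matrix medians[i][j] of median RTTs among the n measured nodes (the names, and the handling of the same-region case, are my assumptions, not the project's actual generator):

```python
def scale_factor(medians, i):
    """Scale factor for region i: min over max of the median RTTs
    between gateway node i and every other measured node."""
    others = [medians[i][k] for k in range(len(medians)) if k != i]
    return min(others) / max(others)

def total_rtt(medians, i, a, j, b):
    """Median end-to-end RTT between endpoint a in region i and
    endpoint b in region j.

    Across regions it is the sum of three segments: a scaled local
    segment in region i, the wide-area segment between gateways i and j,
    and a scaled local segment in region j. Within one region the
    scaled-down copy of the cloud is used directly (an assumed reading
    of the model).
    """
    if i == j:
        return scale_factor(medians, i) * medians[a][b]
    return (scale_factor(medians, i) * medians[i][a]
            + medians[i][j]
            + scale_factor(medians, j) * medians[j][b])

# Hypothetical 3x3 matrix of median RTTs (ms) among n = 3 nodes.
medians = [[0, 10, 40],
           [10, 0, 20],
           [40, 20, 0]]
total_rtt(medians, 0, 1, 2, 1)  # → 52.5
```

Because each local segment is at most scale_factor times the largest median from its gateway, which equals the smallest median from that gateway, no local segment median can exceed the wide-area segment median, as stated above.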
Of course this model is bogus; the internet does not really look like that. But since the scale factors tend to be small, the local segments tend to be small compared to the wide-area segments, so whatever interesting phenomena were measured in the n-node cloud (like violations of the triangle inequality) will presumably still be true in the simulated N-node cloud. In any case, something had to be done to address the scalability problem; otherwise the whole approach would be useless.
The available king tool is designed to measure one RTT. I needed to measure more than a million, so DNS server failures were inevitable, and needed to be retried. But transient failures needed to be distinguished from persistent failures, because retries are slow and waste a lot of time. I ended up writing a new program, kingpairs, from scratch, using the same core technique as king. I also wrote various scripts to drive kingpairs and process the output. All of that code, and the data collected in 2003-Apr, are available via the links at the top of this page.
For the code that generates the simulated delays based on the data, see the Tangle project.
