SIGMOD Experimental Repeatability
The goal of the repeatability/workability effort is to ensure that SIGMOD papers stand as reliable, referenceable works for future research. The premise is that experimental papers will be most useful when their results have been tested and generalized by objective third parties.
The repeatability and workability process tests whether the experiments published in SIGMOD 2011 can be reproduced (repeatability) and possibly extended by modifying some aspects of the experiment design (workability). Participation is voluntary, but authors benefit as well:
- Mention on the repeatability website
- The ability to run their software on other sites
- Often, far better documentation for new members of research teams
Preparing the Experiments
The repeatability committee focuses on repeating the derivation of the presented data (a table or graph) from the initial data.
Derivation involves a system (and its environment), a workload, some metrics, and a set of experiments. System setup is the most challenging aspect of repeating an experiment: the system needs to be installed in a new environment, both the system and part of the environment must be configured, and the experiments must be run. Installation and configuration are easier to reproduce if they are automatic rather than manual. Similarly, experiments are easier to reproduce if they are automatic (e.g., a script that takes a range of values for each experiment parameter as arguments) rather than manual (e.g., a script that must be edited so that a constant takes the value of a given experiment parameter).
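As an illustration of the automatic style described above, here is a minimal sketch of a driver script that takes a range of values for each experiment parameter as command-line arguments. The parameter names (`--threads`, `--scale`), the `start:stop:step` range syntax, and the `run_experiment` placeholder are assumptions made for this example; they are not part of any SIGMOD tooling.

```python
import argparse
import itertools


def parse_range(text):
    """Parse 'start:stop:step' (stop inclusive) into a list of ints."""
    start, stop, step = (int(part) for part in text.split(":"))
    return list(range(start, stop + 1, step))


def build_parser():
    parser = argparse.ArgumentParser(
        description="Run one experiment for every combination of parameter values."
    )
    # Hypothetical experiment parameters; replace with your own.
    parser.add_argument("--threads", type=parse_range, default=[1],
                        help="thread counts as start:stop:step")
    parser.add_argument("--scale", type=parse_range, default=[1],
                        help="scale factors as start:stop:step")
    return parser


def experiment_grid(threads, scale):
    """Return every (threads, scale) combination to be measured."""
    return list(itertools.product(threads, scale))


def run_experiment(threads, scale):
    # Placeholder: a real script would start the system under test here
    # and collect the metrics reported in the paper.
    return {"threads": threads, "scale": scale}


def main(argv=None):
    args = build_parser().parse_args(argv)
    return [run_experiment(t, s)
            for t, s in experiment_grid(args.threads, args.scale)]


if __name__ == "__main__":
    for row in main():
        print(row)
```

Assuming the script is saved as `run_experiments.py` (a hypothetical name), running `python run_experiments.py --threads 1:4:1 --scale 10:20:10` would cover eight configurations without anyone editing the source.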
Here are some example best practices:
(1) Use a VM as an environment for your experiments.
If you are working in the cloud, you might already be doing this, but it can make sense even if you are working with your own server. The latest VM engines (e.g., VirtualBox 3.2) support direct I/O, their CPU overhead is reasonable for most experiments (it can be calibrated if relevant), and various types of network cards are available. In most cases, using a VM will not alter your performance. It will make it easier for you to set up and execute your experiments at different points in time (just freeze your VM and reuse it next year: it will not have changed, so you can repeat your experiments immediately) or on various hardware platforms. As a side effect, it will make it much easier for the rest of us to set up and execute your experiments.
(2) Represent separate setup and experiment tasks as executable scripts or programs with explicit pre- and post-conditions.
Clearly identifying each setup or experiment task makes it easier to define the pre/post conditions and check that they are fulfilled (e.g., What is the state of your flash device after it is formatted? Can you actually perform an experiment if a given setup task failed?)
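One possible way to make such conditions explicit is sketched below: each task bundles an action with a check that must hold before it runs and a check that must hold afterwards, so a failed setup task stops the pipeline before any experiment runs. The `Task` class, the dictionary used to track state, and the two example tasks (`format_device`, `run_benchmark`) are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, object]


class ConditionError(RuntimeError):
    """Raised when a task's pre- or post-condition does not hold."""


@dataclass
class Task:
    name: str
    precondition: Callable[[State], bool]
    action: Callable[[State], None]
    postcondition: Callable[[State], bool]

    def run(self, state: State) -> None:
        # Refuse to start if the environment is not in the expected state.
        if not self.precondition(state):
            raise ConditionError(f"{self.name}: precondition failed")
        self.action(state)
        # Verify that the task left the environment as promised.
        if not self.postcondition(state):
            raise ConditionError(f"{self.name}: postcondition failed")


def run_pipeline(tasks: List[Task], state: State) -> State:
    """Run tasks in order; the first violated condition aborts the run."""
    for task in tasks:
        task.run(state)
    return state


# Hypothetical setup task: formatting leaves the device empty.
format_device = Task(
    name="format_device",
    precondition=lambda s: s.get("device_attached") is True,
    action=lambda s: s.update(formatted=True, used_blocks=0),
    postcondition=lambda s: s.get("formatted") is True and s.get("used_blocks") == 0,
)

# Hypothetical experiment task: only valid on a freshly formatted device.
run_benchmark = Task(
    name="run_benchmark",
    precondition=lambda s: s.get("formatted") is True,
    action=lambda s: s.update(result="done"),
    postcondition=lambda s: "result" in s,
)
```

In this sketch, calling `run_pipeline([format_device, run_benchmark], {"device_attached": True})` completes both tasks, while running `run_benchmark` alone on an empty state raises `ConditionError`, which is the behavior you would want when a setup task has failed.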
(3) Use a workflow engine to automate the
We have prepared an extension of VisTrails and some examples based on database tuning experiments that illustrate the (large) benefits and the (modest) costs of this approach.
Here, we present an extended case study of preparing an experiment on a distributed database system for repeatability and workability.
Repeatability Committee (February 14 to June 10)
Philippe Bonnet, chair (IT University of Copenhagen)
Stefan Manegold, advisor (CWI Amsterdam)
Stratos Idreos (CWI Amsterdam)
Milena Ivanova (CWI Amsterdam)
Ryan Johnson (EPF Lausanne)
Dan Olteanu (University of Oxford)
Paolo Papotti (Universita Roma Tre)
Nancy Hall (University of Wisconsin)
Rene Mueller (IBM Almaden)
Christine Reilly (U.Texas Pan Am)
Tim Kraska (UC Berkeley)
Cong Yu (Yahoo)
Dimitris Tsirogiannis (Microsoft)
David Koop (University of Utah)
Wei Cao (Renmin University)
Wolfgang Gatterbauer (University of Washington)
Jesse Lingeman (NYU)