Education Class C3
Title: Research Reproducibility in Embedded Learning
Instructor: Romain Jacob, ETH Zurich
Abstract: When designing their performance evaluations, researchers often encounter questions such as: How long should a run be? How many runs to perform? How to account for the variability across multiple runs? What statistical methods should be used to analyze the data? Despite the best intentions, researchers often answer these questions differently, thus impairing the replicability of evaluations and the confidence in the results.
Improving the standards of replicability in embedded systems has recently gained traction within the community. As an important piece of the puzzle, we have developed a systematic methodology that streamlines the design and analysis of performance evaluations, and we have implemented this methodology in a framework called TriScale.
This lecture introduces the main concepts of the methodology and lets you experiment with TriScale. By the end of the lecture, you will be able to:
– Understand the difference between replicability and reproducibility, and why these notions matter;
– Understand why performance evaluation experiments must be replicable to be meaningful;
– Understand the basics of statistics required to assess replicability;
– Answer questions such as “How many times should I repeat my experiment?” rationally;
– Use the TriScale framework to help you design your next experiments, analyze your data, and report your results in a (more) replicable fashion.
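To give a flavor of how a question like "How many times should I repeat my experiment?" can be answered rationally, here is a minimal sketch based on a standard non-parametric order-statistics argument. It is not TriScale's API: the function name `min_runs` and its parameters are hypothetical, and the reasoning assumes that the runs are independent and identically distributed. Under that assumption, the largest of N runs exceeds the p-th quantile with probability 1 - p^N, so we look for the smallest N that makes this probability reach the desired confidence.

```python
def min_runs(p: float, confidence: float) -> int:
    """Smallest N such that the maximum of N i.i.d. runs is an upper bound
    on the p-quantile with at least the given confidence.

    Hypothetical helper for illustration; p and confidence are in (0, 1).
    """
    if not (0.0 < p < 1.0) or not (0.0 < confidence < 1.0):
        raise ValueError("p and confidence must be strictly between 0 and 1")

    n = 1
    # P(max of n samples >= p-quantile) = 1 - p**n for i.i.d. samples.
    while 1.0 - p**n < confidence:
        n += 1
    return n


if __name__ == "__main__":
    # Bounding the median with 95% confidence.
    print(min_runs(0.5, 0.95))
    # Bounding the 95th percentile with 95% confidence.
    print(min_runs(0.95, 0.95))
```

Under these assumptions, bounding the median needs far fewer runs than bounding a high percentile, which is why the choice of metric and confidence level should be made before the experiments are run.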
The methods and principles underlying TriScale are broadly applicable to performance evaluations in (embedded) systems and networking, including simulations, emulations, and experiments on physical hardware platforms. Throughout the lecture, we draw on networked embedded systems use cases to illustrate the main concepts. Finally, we conclude with a note on the replicability challenges for machine learning, and how TriScale may help to address those as well.
Bio: Romain Jacob is a postdoctoral researcher at ETH Zurich in the group of Prof. Laurent Vanbever. His current research interests are focused on computer networks, communication protocols, (real-time) scheduling theory, and statistics applied to experimental design. He started working on TriScale out of his own need to design sound performance evaluations for low-power wireless communication protocols, which was the topic of his doctoral dissertation, supervised by Prof. Lothar Thiele. At the time, he was heavily involved in the IoTBench initiative, which aims at designing benchmarks for low-power wireless; and as he learned then, there can be no proper benchmarking without replicability! So here he is.