TRACTUS: Understanding and Supporting Source Code Experimentation in Hypothesis-Driven Data Science

Paper at ACM CHI '20

by Krishna Subramanian, Johannes Maas, and Jan Borchers


Get the addin at http://hci.rwth-aachen.de/tractus.

Abstract

Data scientists experiment heavily with their code, compromising code quality to obtain insights faster. We observed ten data scientists perform hypothesis-driven data science tasks, and analyzed their coding, commenting, and analysis practice. We found that they have difficulty keeping track of their code experiments. When revisiting exploratory code to write production code later, they struggle to retrace their steps and capture the decisions made and insights obtained, and have to rerun code frequently. To address these issues, we designed TRACTUS, a system extending the popular RStudio IDE, that detects, tracks, and visualizes code experiments in hypothesis-driven data science tasks. TRACTUS helps recall decisions and insights by grouping code experiments into hypotheses, and structuring information like code execution output and documentation. Our user studies show how TRACTUS improves data scientists' workflows, and suggest additional opportunities for improvement. TRACTUS is available as an open source RStudio IDE addin at http://hci.rwth-aachen.de/tractus.

Authors

Krishna Subramanian
Johannes Maas
Jan Borchers

Publications

    2020

  • Krishna Subramanian, Johannes Maas and Jan Borchers. TRACTUS: Understanding and Supporting Source Code Experimentation in Hypothesis-Driven Data Science. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, CHI '20, pages 10, ACM, New York, NY, USA, April 2020.