Figure 3: Example provenance graph of a multistep workflow showing the interactions among the analyses of three researchers.

The provenance record consists of two types of nodes: activities performed by a researcher (shown as red boxes above) and the input and output files of those activities (shown as file and folder icons and identified by their name and Synapse ID). In addition, every activity carries metadata that further describes the details of the actions performed. This specific graph shows the workflow used to perform a comparative analysis of two mutation-calling algorithms, MuSiC and MutSig. For MuSiC, the provenance of the analysis is displayed from the input data to the derivation of mutation calls. Provenance records may be further expanded (ellipses) to trace input files back to their original data source in Firehose, the DCC, or personal communications with AWG members. For brevity, the MutSig graph is not expanded. This graph was produced from version 2 of the data in doi:10.7303/syn1750331.
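
The two node types described above can be illustrated with a minimal Python sketch. It is not the Synapse implementation or its client API: the class names, the `trace_origin` helper, the researcher labels, the file names, and the `syn000000x` identifiers are all hypothetical placeholders chosen for illustration. The sketch only shows how a record of activities and files can be walked backwards from a result to its sources.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FileNode:
    """A file or folder node, identified by name and an ID."""
    name: str
    synapse_id: str  # placeholder IDs below are NOT real Synapse entities


@dataclass
class Activity:
    """An action performed by a researcher, linking inputs to outputs."""
    name: str
    performed_by: str
    inputs: list
    outputs: list
    metadata: dict = field(default_factory=dict)  # free-form details


def trace_origin(target, activities):
    """Return every file that `target` was derived from, directly or not."""
    # Map each output file to the activity that produced it.
    produced_by = {out: act for act in activities for out in act.outputs}
    ancestors, stack = set(), [target]
    while stack:
        node = stack.pop()
        act = produced_by.get(node)
        if act is None:
            continue  # no recorded activity: an original data source
        for src in act.inputs:
            if src not in ancestors:
                ancestors.add(src)
                stack.append(src)
    return ancestors


# Hypothetical two-step workflow shared by two researchers.
raw = FileNode("raw_mutations.maf", "syn0000001")
filtered = FileNode("filtered_mutations.maf", "syn0000002")
calls = FileNode("music_calls.tsv", "syn0000003")

activities = [
    Activity("filter variants", "researcher_a", [raw], [filtered]),
    Activity("run MuSiC", "researcher_b", [filtered], [calls],
             metadata={"note": "illustrative only"}),
]

for f in sorted(trace_origin(calls, activities), key=lambda n: n.synapse_id):
    print(f.synapse_id, f.name)
```

In this sketch a file with no producing activity plays the role of an original data source, which corresponds to the point where an expanded provenance record would stop.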