Most scientific instruments currently discard rich streams of commands, data and metadata from which AI systems could learn to conduct experiments with expert-level decision-making and troubleshooting skills. Recording and using these streams at scale requires rethinking what to store, incentivizing large-scale cooperation, and determining how to quantify the reliability of the resulting autonomous systems.