Fig. 1: A neuro-symbolic regression workflow to systematically guide model discovery in social science.
From: AI-assisted discovery of quantitative and formal models in social science

The workflow mirrors the inductive-deductive reasoning process. A dataset of interest (1), which may be time-series, cross-sectional, or longitudinal, is supplied to OccamNet. The user can provide inductive priors (2), such as the choice of key variables, known constants, or specific functional forms, to constrain the search space. OccamNet finds interpretable and compact solutions that model the input data by sampling functions from an internal probability distribution represented using P-nodes (Costa et al., 2021). In this example, OccamNet recovers the governing equation of the Solow-Swan model of economic growth (Solow, 1956) from a synthetic dataset. Each formal model is characterized by its error distribution on the training set (3), allowing the user to identify outliers and interrogate the model's internal validity. The symbolic model is then used to generate predictions (4) that support deductive tests on unseen data, either by partitioning the data into a held-out test set or by informing experimental designs (5).
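
The five numbered steps can be illustrated with a small, self-contained sketch. This is not OccamNet and does not use its API: ordinary least squares over a user-supplied functional form stands in for the neural search, purely to show how the steps connect. The sketch assumes one common textbook form of the Solow-Swan law of motion for capital per worker, dk/dt = s·k^α − d·k, which may differ from the exact equation shown in the figure. The parameter values and the names `make_dataset`, `fit`, and `predict` are hypothetical.

```python
import math
import random

# Assumed ground truth (hypothetical values): dk/dt = S_TRUE * k**ALPHA - D_TRUE * k
ALPHA, S_TRUE, D_TRUE = 0.3, 0.25, 0.08


def make_dataset(n, seed=0, noise=1e-3):
    """(1) Synthetic cross-sectional data: capital per worker k and its change dk."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        k = rng.uniform(0.5, 10.0)
        dk = S_TRUE * k**ALPHA - D_TRUE * k + rng.gauss(0.0, noise)
        rows.append((k, dk))
    return rows


def fit(rows):
    """(2) Inductive prior: dk = s * k**ALPHA - d * k, with ALPHA a known constant.

    Least squares on the two features (k**ALPHA, k), solved through the
    2x2 normal equations. This replaces OccamNet's function sampling.
    """
    sxx = sxz = szz = sxy = szy = 0.0
    for k, dk in rows:
        x, z = k**ALPHA, k
        sxx += x * x
        sxz += x * z
        szz += z * z
        sxy += x * dk
        szy += z * dk
    det = sxx * szz - sxz * sxz
    coef_x = (sxy * szz - sxz * szy) / det
    coef_z = (sxx * szy - sxz * sxy) / det
    return coef_x, -coef_z  # (s, d); the sign flip matches the "- d * k" term


def predict(model, k):
    """(4) Prediction from the fitted symbolic model."""
    s, d = model
    return s * k**ALPHA - d * k


data = make_dataset(200)
train, test = data[:140], data[140:]  # (5) hold out unseen data

model = fit(train)
s_hat, d_hat = model

# (3) Error distribution on the training set, useful for spotting outliers
train_residuals = [dk - predict(model, k) for k, dk in train]

# (4)-(5) Deductive test: prediction error on the held-out partition
test_rmse = math.sqrt(sum((dk - predict(model, k)) ** 2 for k, dk in test) / len(test))

print(f"s = {s_hat:.4f}, d = {d_hat:.4f}, test RMSE = {test_rmse:.5f}")
```

Fixing the exponent in advance plays the role of a "known constant" prior: it reduces the search to two coefficients, which is why a closed-form fit suffices here. A symbolic regression method such as OccamNet would instead search over the functional form itself.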