Fig. 3: Towards defining endogenous and synthetic reporters’ phenotypic potential via TFBS enrichment ranking.

a Scatter plot showing the TFBS affinity ratio of mesenchymal sLCRs for on-target, off-target and scrambled sLCRs. The y-axis indicates the observed/expected ratio (i.e., MGT1-2 observed/input TFBS). The x-axis denotes the number of input TFs. First-generation sLCRs and LSD-sLCRs are indicated. Scrambled sLCRs were designed using LSD with input from random sampling of TFs from the general pool of annotated human TFs (random TF) or from random selection of genes from the human genome (random Sign-TF). Fitted lines indicate LOESS regression with 95% confidence intervals. b Scatter plot showing the TFBS affinity ratio as a function of increasing numbers of CREs. Values are calculated for each functional sLCR assessed experimentally (Fig. 2). Logarithmic regression was used to fit the curve. The gray dashed line indicates that the CRE ratio is >50% of TFBS with R² = 0.96, and the blue solid line marks MGT4. c Scatter plot showing the signature score (x-axis) and affinity score (y-axis; see Methods) of the indicated reporters for the mesenchymal phenotype. Note the antagonistic phenotypic scoring of glioblastoma (reds) and neural retina amacrine cell reporters (blues). d Phenotypic scoring of the same reporters as in c for a retina amacrine cell phenotype. sLCR synthetic locus control region, LSD logical synthetic cis-regulatory DNA, TF transcription factor, TFBS transcription factor binding site, CRE cis-regulatory element. Source data are provided in the Source Data file.