Extended Data Fig. 1: Microexon sensitivity to SRRM4 expression is largely conserved from shark to human.

(a) Microexon groups were defined according to their response to SRRM4 titration. Supervised heatmap showing the ΔPSI of each event (rows) under different conditions (columns) relative to cells expressing GFP. Conditions: HEK 293 Flp-In T-REx cells expressing SRRM4 in a doxycycline-dependent manner; CTR are cells not treated with doxycycline (leaky expression only), while LOW, MID and HIGH are cells treated with increasing concentrations of doxycycline (Methods). The assignment of each event to the LS, HS, CR, CS or NR microexon group is indicated by the colored squares on the left side of the heatmap. The number of events in each category is shown in brackets. (b) Heatmap showing the splicing pattern of the selected events in both the endogenous and the MaPSy context (represented as the median PSI across the libraries in which each event was quantified). White squares correspond to missing values due to insufficient read coverage. (c) Lines represent an estimate of the central tendency and the corresponding 95% confidence interval of the ΔPSI (VAR-Hsa) of orthologous WT sequences from all tested species with respect to human (Hsa) in the GFP condition and under LOW, MID and HIGH expression of SRRM4, for HS (green) and LS (blue) events, according to their sensitivity as defined in human (Fig. 1). The species, together with the number of HS and LS events represented per species, are indicated at the bottom. Teleosts, which showed more dissimilar inclusion patterns relative to their corresponding human sequences even though their dose response was also conserved, are highlighted with a yellow background as in Fig. 1e. Red lines separate the species belonging to each phylogenetic node. The numbers at the bottom show the results of two-sided Mann-Whitney tests between the corresponding node and all other nodes combined for the condition HIGH (VAR-WT).
(d) Top: distribution of PSIs in various tissues for orthologs of LS (blue) and HS (green) events from Mus musculus, Rattus norvegicus, Bos taurus, Gallus gallus and Danio rerio. Bottom: mRNA expression levels (cRPKMs) of Srrm3 (yellow) and Srrm4 (pink) in each species. (e) Schematics of the sequences involved in each swapping experiment. Intronic sequences from either a human event (grey) or another species (black) were swapped to generate the chimeric constructs depicted at the top of each subpanel, where the exonic part is either from human or from another species, respectively. Lines represent an estimate of the central tendency and the corresponding 95% confidence interval of the ΔPSI (VAR-WT) for variants in which the microexon sequence is from human and both flanking introns are from its ortholog in the indicated species (left), or in which the microexon sequence is from a given species and both flanking introns are from its human ortholog (right). Species involved in the swapping are listed along the x-axis together with the number of HS and LS events represented per species. (f) Correlation of PSI among the 6 biological replicates under HIGH expression of SRRM4. PSI* values correspond to the final output of the quantification pipeline, while PSIreg* values are recovered from an intermediate file (see Methods). ‘Rep’ indicates each of the 6 replicates. (g) Correlation of the 268 sequences present in both T1|T2 and T3|T4 in the four experimental conditions (GFP and LOW, MID, HIGH expression of SRRM4).
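
The two summary quantities used throughout this legend, ΔPSI relative to a reference (panels a, c, e) and the median PSI across libraries with missing values left blank (panel b), can be illustrated with a minimal Python sketch. This is not the quantification pipeline described in Methods; the function names, event identifiers and PSI values below are hypothetical and chosen only for illustration.

```python
from statistics import median


def delta_psi(psi_condition, psi_reference):
    """Per-event PSI(condition) - PSI(reference); None where either value is missing."""
    out = {}
    for event, ref in psi_reference.items():
        cond = psi_condition.get(event)
        out[event] = None if cond is None or ref is None else cond - ref
    return out


def median_psi(psi_by_library):
    """Median PSI across libraries, skipping missing values (None).

    Returns None when the event was not quantified in any library,
    which would correspond to a white square in a heatmap.
    """
    values = [v for v in psi_by_library if v is not None]
    return median(values) if values else None


# Hypothetical PSI values on a 0-100 scale (assumption, not data from the study).
psi_gfp = {"event_A": 5.0, "event_B": 10.0, "event_C": None}
psi_high = {"event_A": 85.0, "event_B": 40.0, "event_C": 70.0}

# ΔPSI of the HIGH condition with respect to GFP-expressing cells.
dpsi_high = delta_psi(psi_high, psi_gfp)
```

Events lacking a value in either condition are kept as missing rather than set to zero, so that insufficient coverage is not mistaken for an unchanged inclusion level.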