Introduction

Shiga toxin-producing Escherichia coli (STEC) causes diarrhoea, haemorrhagic colitis (HC), and life-threatening complications, including haemolytic uraemic syndrome (HUS). STEC infections are prevalent globally, particularly in Southeast Asia, Europe, and North America. In Japan, there are approximately 3000 cases annually1. The principal virulence factor of STEC strains is Shiga toxin (Stx), an AB5-type toxin that disrupts protein synthesis by cleaving a specific adenine residue in the 28S rRNA within host cells, ultimately leading to apoptosis2. There are two antigenically distinct Stx types, Stx1 and Stx2, which are further categorized into at least three (Stx1a, -c, and -d) and fifteen (Stx2a–o) subtypes, respectively3,4,5,6,7. Stxs are encoded by phages and are transmitted between E. coli and related species8. Epidemiological studies indicate that Stx2a-producing strains are more strongly associated with severe cases, including HUS, than other strains are2,9.

O157 is the predominant STEC serogroup in many countries, and most O157 strains produce Stx2a with or without Stx1a (refs. 1,10). In 2011, a large-scale outbreak caused by Stx2a-producing O104 occurred in Europe, primarily in Germany11. Various other STEC serogroups are also prevalent globally. Among these non-O157 serogroups, O26, O45, O103, O111, O121, and O145 constitute the majority of clinical infections and are commonly referred to as the “Big 6”12. Many strains within the Big 6 produce Stx1a, Stx2a, or both.

In addition to Shiga toxins, certain STEC strains possess a type III secretion system (T3SS) encoded by the locus of enterocyte effacement (LEE). Effectors secreted by the T3SS inhibit or modify various host functions and enhance adhesion through the formation of attaching-effacing (A/E) lesions in intestinal epithelial cells, significantly contributing to virulence13. Indeed, LEE-positive STEC strains have been shown to be closely associated with the onset of HUS9. Most strains of O157 and the Big 6 non-O157 serotypes are LEE-positive12. In addition to seven effectors in the LEE region, there are non-LEE effectors on phages and other mobile elements14. Additionally, numerous accessory virulence genes, such as those encoding haemolysin and proteases, are also believed to play a role in virulence15.

Ruminant animals, particularly cattle, constitute the primary reservoir for STEC strains, with contaminated food and water serving as the principal sources of STEC infection16. Owing to its high infectivity, secondary person-to-person transmission of STEC strains within households is also frequent in both sporadic cases and outbreaks1,17. Additionally, undiagnosed asymptomatic STEC carriers may facilitate secondary infections. It is estimated that 15% of STEC infections can be prevented by mitigating person-to-person transmission18. Infection prevention strategies, such as the sanitary isolation of STEC patients with typical symptoms, are crucial and indispensable for curbing person-to-person spread. Moreover, the increased use of multipathogen detection systems in the diagnosis of diarrhoeal patients has significantly increased the prevalence of sporadic identification of STEC strains, independent of clinically apparent disease or outbreaks19. In Japan, following a major STEC outbreak in Sakai city in 1996, food safety control measures and a comprehensive STEC surveillance system were established, making STEC infections notifiable in all cases, including asymptomatic carriers20. Furthermore, to prevent the transmission of STEC through food, individuals with STEC infections, regardless of their symptoms, are legally prohibited from working as food handlers in manufacturing, preparation facilities, and restaurants. Consequently, the Ministry of Health, Labour and Welfare requires food handlers to undergo routine stool testing for various infectious pathogens, including STEC21. Additionally, employees in social welfare facilities, such as nursery schools and elderly care facilities, are required to undergo regular faecal testing. Notably, previous research on asymptomatic STEC carriers identified several minor serotypes distinct from those causing severe infections, suggesting they pose a lower risk22,23,24. However, genomic studies on these strains remain limited. 
We thus analysed nearly 500 strains derived from healthy human adults and elucidated their genomic characteristics and virulence potential.

Results

Classification of serotypes associated with asymptomatic carriers and patients

In stool tests conducted by the Japan Microbiological Laboratory on food handlers and social welfare facility workers across Japan in 2021, the STEC detection rate was 0.035%. A total of 521 strains were isolated from the positive samples, excluding duplicates, and the draft genome sequences of 495 strains were successfully determined (Supplementary Data 1). The predominant serotype of the STEC strains isolated from the asymptomatic carriers was O156:H25 (13.5%), followed by O174:H21 (7.7%) and O105:H7 (6.1%) (Table 1). In contrast, the Infectious Agents Surveillance Report by the National Institute of Infectious Diseases indicates that the most prevalent serotype among STEC strains isolated from symptomatic patients in Japan between 2011 and 2021 was O157:H7(H-) (64.5%), followed by O26:H11(H-) (18.5%) and O111:H8(H-) (3.9%). Based on statistical analyses of serotype distributions between patient-derived and healthy-carrier-derived isolates, nineteen serotypes were statistically identified as being significantly associated with asymptomatic carriers (SAAC) and 60.6% of the strains isolated from asymptomatic carriers were classified as SAAC strains (Table 1). Five serotypes were identified as being significantly associated with patients (SAPA), and 92.4% of the strains isolated from symptomatic patients were classified as SAPA strains. The genome sequences of 250 strains, with 50 representatives from each SAPA, were randomly selected and extracted from the public database for subsequent analysis (Supplementary Data 1).

Table 1 Summary of the serotypes of strains isolated from the asymptomatic carriers and patients

Risk assessment of strains belonging to SAAC and SAPA

Core gene-based phylogenetic analysis revealed that the SAAC strains were phylogenetically indistinguishable from the SAPA strains (Fig. 1A). All the SAPA strains were LEE-positive, whereas 38.3% of the SAAC strains were LEE-positive. The prevalence of stx1 and stx2 exhibited a similar pattern in both SAAC and SAPA strains, with more than half of the strains in each group harbouring stx2 (Fig. 1B). Classification by stx1 subtype revealed that all the stx1 genes in the SAPA strains were stx1a, whereas 80.7% of the stx1 genes in the SAAC strains were stx1a; the remaining genes were stx1c (Fig. 1C). Notably, classification by stx2 subtype revealed that 88.8% of the stx2 genes in the SAPA strains were stx2a, whereas only 3.7% of those in the SAAC strains were stx2a. stx2 in the SAAC strains was classified into nine subtypes, with stx2f being the most prevalent (31.2%), followed by stx2e (22.4%). These stx2 subtypes were disseminated among STEC strains from the asymptomatic carriers in a largely lineage-independent manner (Fig. 1A). More importantly, none of the SAAC strains were simultaneously positive for both LEE and stx2a.

Fig. 1: The phylogenetic relationships and stx subtype distributions in the strains classified as SAAC and SAPA.

A A maximum likelihood (ML) tree based on the core genes depicts the phylogeny of STEC strains classified as SAAC (n = 300) and SAPA (n = 250), highlighting the distribution of LEE and stx subtypes. The ML tree is based on 91,555 SNP sites in 1,818 core genes. The tree was rooted by the cryptic Escherichia clade I strain TW15838. B Toxin types and (C) stx subtypes are summarized. Source values for Fig. 1C can be found in Supplementary Data 1.

We then analysed the distribution of genes encoding non-LEE effectors and accessory virulence factors in the LEE-positive SAAC and SAPA strains. The eae subtypes were consistent with the phylogenetic distribution, whereas the distributions of non-LEE effectors and accessory virulence factors exhibited various patterns (Fig. 2A). Notably, compared with the SAPA strains, the SAAC strains presented a significantly lower prevalence of both non-LEE effectors and accessory virulence factors (Fig. 2B and C). Seventeen effectors and eight accessory virulence factors were more prevalent in the SAPA strains than in the SAAC strains, whereas only one non-LEE effector and five accessory virulence factors were more prevalent in the SAAC strains than in the SAPA strains (Fig. 2D). Specifically, among the non-LEE effectors, espN, espW, espX, and nleL were present in approximately 80% or more of the SAPA strains but in less than 20% of the SAAC strains. Among the accessory virulence factors, efa1, espP, ihaA, and katP were apparently less prevalent in the SAAC strains than in the SAPA strains.

Fig. 2: Distribution of virulence factors among the LEE-positive STEC strains classified as SAAC and SAPA.

A Core-gene-based ML tree of LEE-positive STEC strains classified as SAAC (n = 115) and SAPA (n = 250) and the distribution of genes encoding non-LEE effectors and other E. coli virulence genes in these strains. The ML tree is based on 96,481 SNPs located in 2,099 core genes. B The total counts of non-LEE effector genes and (C) other virulence genes are summarized. D Summary of the prevalence of each non-LEE effector gene and other virulence genes. The numbers of LEE-positive SAAC and SAPA strains were 115 and 250, respectively. Statistical analyses were performed via the Wilcoxon rank sum test (B and C) and Fisher’s exact test (D). ***P < 0.0001; **P < 0.01. Source values for B, C, and D can be found in Supplementary Data 1.

Risk group classification of STEC strains from asymptomatic carriers

We classified STEC strains isolated from asymptomatic carriers into risk groups. Strains belonging to SAPA serotypes (O157:H7(H-), O26:H11(H-), O111:H8(H-), O121:H19(H-), O145:H28(H-)) and other serotypes harbouring both LEE and stx2a were classified as high-risk (Table 2). Strains of other serotypes harbouring either LEE or stx2a were classified as moderate-risk, and those harbouring neither LEE nor stx2a were classified as low-risk. Out of 495 STEC isolates from the asymptomatic carriers, 10, 16, 3, 1, and 1 strains were identified as belonging to the SAPA serotypes O157:H7, O26:H11, O111:H8, O121:H19, and O145:H-, respectively, constituting 31 strains in total (Table 3). All of these strains were LEE-positive, and four O157:H7 strains along with one O111:H8 strain harboured stx2a. The SAPA strains from asymptomatic carriers also exhibited high conservation of various effectors and other virulence factors, including espN (96.8%), espW (90.3%), espX (96.8%), efa1 (74.2%), espP (74.2%), ihaA (71.0%), and katP (77.4%), although nleL was absent in all strains (Supplementary Data 1). We considered these 31 strains as high-risk STEC based on their serotypic profiles (Table 3).
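
The classification rule described above can be written as a short decision function. The following Python sketch is illustrative only and is not the authors' code: the function names are hypothetical, and the handling of non-motile "H-" variants (treated as matching the listed H type of a SAPA serogroup) is an assumption based on the "(H-)" notation in the serotype list.

```python
# Illustrative sketch of the three-tier risk rule; names are hypothetical.
# Assumption: a SAPA serotype such as O145:H28(H-) matches both "O145:H28"
# and the non-motile variant "O145:H-".
SAPA_H_TYPES = {
    "O157": "H7",
    "O26": "H11",
    "O111": "H8",
    "O121": "H19",
    "O145": "H28",
}


def is_sapa(serotype: str) -> bool:
    """Return True if the serotype belongs to one of the five SAPA serotypes."""
    o_group, _, h_type = serotype.partition(":")
    expected_h = SAPA_H_TYPES.get(o_group)
    return expected_h is not None and h_type in (expected_h, "H-")


def risk_group(serotype: str, lee_positive: bool, stx2a_positive: bool) -> str:
    """Assign high, moderate, or low risk from serotype, LEE and stx2a status."""
    if is_sapa(serotype) or (lee_positive and stx2a_positive):
        return "high"
    if lee_positive or stx2a_positive:
        return "moderate"
    return "low"


# Marker profiles below are example inputs, not records from the dataset.
print(risk_group("O157:H7", lee_positive=True, stx2a_positive=False))   # high
print(risk_group("O103:H11", lee_positive=True, stx2a_positive=True))   # high
print(risk_group("O156:H25", lee_positive=True, stx2a_positive=False))  # moderate
print(risk_group("O174:H21", lee_positive=False, stx2a_positive=False)) # low
```

Checking the serotype first mirrors the text: a SAPA-serotype strain is high-risk regardless of its stx subtype.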

Table 2 Risk group classification of STEC strains from asymptomatic carriers
Table 3 Prevalence of LEE and stx subtypes in STEC from the asymptomatic carriers classified as SAPA

Among the STEC isolates from the asymptomatic carriers, 164 were neither classified under SAAC nor SAPA serotypes and were instead categorized as Not Significant (NS) or Minor Serotypes (MS) with five or fewer isolates per serotype (Table 1). These isolates were distributed across 152 distinct serotypes, with 12 strains remaining untypeable by in silico O-serotyping. O103:H2 was the most frequently identified serotype (20 strains). Of the NS and MS strains, 36 were LEE-positive and 28 carried stx2a (Table 4). Four isolates contained both LEE and stx2a, corresponding to the serotypes O103:H11, O108:H25, O150:H2, and O165:H25. These four strains possessed a range of non-LEE effectors, including espH, espJ, espL, espO, ibe, nleA, nleB, nleE, and nleH, as well as virulence factors such as pchA, ehxA, and paa (Supplementary Data 1). Additional effectors and virulence factors were also relatively well-conserved across these strains. These four strains were genetically considered as high-risk STEC (Table 2).

Table 4 Prevalence of LEE and stx2a in STEC strains from the asymptomatic carriers classified as NS or MS

In addition to the 35 high-risk STEC strains, among the 495 STEC isolates from asymptomatic carriers, 147 were LEE-positive and 31 were stx2a-positive. These 178 strains (36.0%) were classified as moderate-risk STEC, while the remaining 282 strains (57.0%) were designated as low-risk STEC (Table 2).
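
As a quick arithmetic check, the group sizes and percentages reported above can be recomputed from the counts given in the text. The sketch below uses only those counts; the variable names are illustrative.

```python
# Recompute the risk-group tallies from the counts stated in the text.
TOTAL_ISOLATES = 495

high = 31 + 4        # SAPA-serotype strains + other strains with both LEE and stx2a
moderate = 147 + 31  # LEE-positive only + stx2a-positive only
low = TOTAL_ISOLATES - high - moderate


def percent(count: int, total: int = TOTAL_ISOLATES) -> float:
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * count / total, 1)


summary = {
    "high": (high, percent(high)),
    "moderate": (moderate, percent(moderate)),
    "low": (low, percent(low)),
}
print(summary)
```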

Genetic and phenotypic virulence potential of high-risk STEC strains from asymptomatic carriers

Further evaluation was conducted to assess the virulence potential of STEC isolates belonging to SAPA isolated from healthy carriers. O157:H7 strains have been classified into at least nine clades, among which clades 6 and 8 are significantly associated with HUS, whereas clade 7 is predominantly associated with healthy carriers25,26. Of the ten O157:H7 strains isolated from healthy carriers in this study, six belonged to clade 7 and four to clade 2 (Supplementary Table 1).

Subsequently, we performed long-read sequencing on one isolate each of the serotypes O157:H7, O111:H8, O121:H19, O145:H-, and O26:H11 among the STEC strains isolated from healthy carriers, in order to compare the structures of their Stx phages with those of clinical isolates obtained from publicly available genome data (Supplementary Fig. 1). Except for the Stx1a phage of the O111:H8 strain STEC-AC330, the Stx phages of the healthy carrier-derived strains exhibited no major deletions and showed high structural similarity to those of the clinical isolates. Although the Stx1a phage of the O111:H8 strain STEC-AC330 also displayed high overall sequence similarity to the Stx1a phage of a clinical isolate, it appeared to be a defective phage.

Finally, the stx2 expression levels were compared among high-risk, moderate-risk, and low-risk STEC strains. The results showed that stx2 expression was significantly higher in high-risk STEC than in the other groups (Fig. 3).

Fig. 3: Comparison of stx2 expression levels among risk groups of isolates from asymptomatic carriers.

stx2 expression was measured following induction with mitomycin C in high-risk STEC (strain STEC_AC400, O157:H7, stx1a- and stx2a-positive), moderate-risk STEC (strain STEC_AC089, O105:H7, stx2e- and stx2f-positive) and low-risk STEC (strain STEC_AC142, O174:H21, stx2c-positive). Expression levels were normalized to the housekeeping gene gapA. Data represent the means of three independent experiments, and error bars indicate standard deviations. Statistically significant differences are indicated by bars, with corresponding P values shown. Source values for Fig. 3 can be found in Supplementary Data 1.

Discussion

In this study, 0.035% of healthy Japanese adults were positive for STEC. Asymptomatic STEC carriers have been reported at a prevalence of 1% among healthy children in France, 0.5% among healthy children in Germany, and 0.08% among healthy adults in Japan23,27,28. Many countries impose strict regulations on asymptomatic STEC carriers, even though their role in outbreaks remains debated19. However, restricting asymptomatic carriers from work can lead to economic hardship, psychosocial stress, and social stigma. In Japan, individuals identified as STEC carriers are legally prohibited from working as food handlers or in social welfare facilities21. Consequently, food handlers and social welfare facility workers who shed STEC for extended periods are often decolonized with antibiotics. However, the use of antibiotics in clinically asymptomatic individuals requires proper justification19. Moreover, there are concerns that certain antibiotics may promote Stx production; thus, antibiotic therapy for STEC infection remains controversial29.

The 495 STEC strains from asymptomatic carriers were classified into more than 100 serotypes, exhibiting extensive phylogenetic and genetic diversity. STEC represents a genetically diverse group that has evolved through horizontal gene transfer of virulence factors such as Stx, resulting in wide variation in virulence potential30,31,32,33. A previous study revealed that many STEC isolates from healthy adults belong to O serogroups which are rarely found in STEC isolates from symptomatic patients23. However, only two small-scale genomic studies of STEC from asymptomatic carriers have been conducted to date, comprising 27 and 10 strains, respectively22,24. In these studies, in addition to stx2a, various other stx subtypes, including stx2d and stx2e but not stx2f, were detected in STEC from healthy carriers, with a high prevalence of LEE-negative strains. Our extensive genomic analysis of STEC strains from the healthy carriers revealed 19 serotypes that were significantly associated with asymptomatic carriers, designated SAAC (Table 1). Among these serotypes, the prevalence rates for LEE and stx2a, which are risk factors for severe disease, were 38.3% and 3.7%, respectively, with no strains positive for both factors. Conversely, among strains of the five serotypes identified as being significantly associated with symptomatic patients, designated SAPA, 88.8% were positive for both LEE and stx2a. Among the stx2 subtypes in the SAAC strains, stx2f and stx2e predominated, accounting for 31.2% and 22.4%, respectively (Fig. 1). The pathogenicity of Stx2e to humans remains unclear, and no outbreaks with severe infection caused by Stx2f-producing strains have been reported. Additionally, stx2 expression levels are significantly lower in stx2c-, stx2d-, and stx2e-positive strains than in stx2a-positive strains24. These results indicate that a significant proportion of SAAC strains pose a low risk for the development of severe disease.

In the analysis of LEE-positive strains, the SAAC strains presented a lower prevalence of certain non-LEE effectors and accessory virulence factors than did the SAPA strains (Fig. 2). Specifically, the prevalence of espN, espW, espX, and nleL was markedly lower in the SAAC strains. While the function of EspN remains undefined, EspX and NleL are recognized as ubiquitin ligases, and EspW is involved in cytoskeletal remodelling (Supplementary Table 2). In terms of accessory virulence factors, the frequencies of espP, ihaA, and katP were significantly lower in the SAAC strains than in the SAPA strains. These genes are believed to play a role in colonization and survival within the host (Supplementary Table 2). The limited presence of non-LEE effectors and accessory virulence factors in SAAC strains may diminish their virulence potential.

In this study, among the strains isolated from food handlers and social welfare facility workers, 31 (6.2%) were identified as belonging to the five serotypes categorized as SAPA, including O157:H7, while four strains (0.9%) were not classified as SAPA but harboured both LEE and stx2a (Tables 2 and 3). None of the ten O157:H7 strains isolated from asymptomatic carriers belonged to clades 6 or 8 (Supplementary Table 1), which are known to be strongly associated with severe infections and outbreaks25,26. However, although only a limited number of strains were examined, Stx prophages in these high-risk STEC were mostly intact, and their stx2 expression levels were higher than those of moderate- and low-risk STEC (Fig. 3 and Supplementary Fig. 1). Although the carriers were asymptomatic, these high-risk STEC strains likely retain considerable virulence potential and may act as reservoirs of virulence genes through horizontal transfer, emphasizing the need for rigorous monitoring and control of their carriers. At the same time, carriers of high-risk STEC and carriers of other strains should not be managed by uniform criteria; individualized strategies based on risk stratification for each strain are needed. Future studies should include larger-scale analyses to identify low-risk serotypes typical of asymptomatic carriers and improve risk assessment using STEC detection, serotyping, and stx subtyping. Among the 19 serotypes classified in the SAAC strains in this study, nine were untypable by routine serotyping with antisera. Establishing an antiserum set or PCR system is imperative for the accurate identification of SAAC strains.

A limitation of this study is that we could not assess the pathogenicity of SAAC strains using cell culture or animal models. Although infection models for STEC are not fully established, there are various animal models that mimic parts of the disease process34,35,36. Furthermore, future research should include not only larger-scale analyses of strains derived from healthy carriers in Japan but also STEC strains from healthy carriers globally.

In conclusion, this study identified serotypes characteristic of STEC from healthy carriers. Strains of these serotypes appear to have a low prevalence of risk factors for severe infection and of other virulence factors, indicating a low virulence potential. Of the 495 STEC strains from asymptomatic carriers, 7.1%, 36.0%, and 57.0% were classified into the high-, moderate-, and low-risk groups, respectively. In asymptomatic STEC carriers, an individualized, risk-adapted approach is essential and would necessitate stringent hygiene measures exclusively for high- and moderate-risk strains while limiting unnecessary precautions against low-risk STEC. Return-to-work policies tailored to the virulence of STEC strains may mitigate the personal and socioeconomic burden in cases of asymptomatic prolonged shedding of low-risk STEC.

Methods

Strains used in this study

At the Japan Institute of Microbiology, we have a bank of STEC strains isolated from STEC-positive samples via routine stool tests of food handlers and social welfare facility workers. In 2021, a total of 521 STEC strains were isolated from 1,494,662 stool samples. Stool samples were initially screened by PCR to identify stx-positive specimens. The PCR-positive samples were subsequently plated directly onto selective media including CHROMagar STEC medium (Kanto Chemical Co., Inc., Tokyo, Japan), and colonies obtained were further analysed by PCR to confirm and isolate STEC strains. The samples were collected from healthy individuals without any symptoms of intestinal infection, and multiple isolates obtained from the same individual were excluded from the genomic analysis. Additionally, in Japan, STEC infections are classified as notifiable diseases, requiring comprehensive reporting, including information on serotypes, toxin types, and patient symptoms, which is collected by the National Institute of Infectious Diseases and published annually. For the period from 2011 to 2021, we gathered data on the serotypes and toxin types of strains from patients exhibiting symptoms such as diarrhoea, abdominal pain, bloody stool, or haemolytic uraemic syndrome (HUS). We obtained genomic data for 250 strains from public databases, comprising 50 strains for each of the five SAPA serotypes, and incorporated them into the analysis (Supplementary Data 1).

In Japan, routine stool screening for STEC is legally mandated for food handlers and workers in social welfare facilities to prevent secondary transmission. The present study used only bacterial isolates obtained and stored from these routine surveillance tests. No personal identifiers or sensitive data of the individuals were included. Therefore, separate ethical approval was not required.

Genome sequencing

Genomic DNA was extracted from 1 ml of overnight culture of each STEC strain via the DNeasy Blood and Tissue Kit (Qiagen). Genomic DNA libraries were constructed via the xGen DNA Library Prep EZ Kit (Integrated DNA Technologies) in combination with NEBNext Multiplex Oligos for Illumina (96 Unique Dual Index Primer Pairs) (New England BioLabs). Sequencing was performed on the Illumina HiSeq X Ten platform, generating paired-end reads of 151 base pairs.

Bioinformatical analysis

Both the raw sequencing data generated in this study and those retrieved from public databases were assembled using Platanus_B v1.3.2 (ref. 37). In silico serotyping was conducted using ECTyper or SRST2 (refs. 38,39). Subtyping of stx and eae was performed using SRST2 and ABRicate (https://github.com/tseemann/abricate), respectively, with a threshold of ≥99% sequence identity. The presence or absence of non-LEE effectors was assessed by TBLASTN analysis, which uses the amino acid sequence of the effector as the query and the assembled sequence as the database, with criteria of 50% or greater homology to the query sequence and 50% or greater coverage. The presence or absence of other virulence factors was determined using ABRicate under the default settings.
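
The effector presence/absence criteria can be illustrated with a small filter over tabular TBLASTN output. This is a sketch under stated assumptions rather than the authors' pipeline: it assumes the hits were written in a custom tabular layout (qseqid, sseqid, pident, length, qstart, qend, qlen), reads "homology" as percent identity, and computes query coverage per hit. The function name and the example rows are hypothetical.

```python
import csv
import io

# Assumed column order of the tabular TBLASTN output (an assumption, e.g.
# -outfmt "6 qseqid sseqid pident length qstart qend qlen").
COLUMNS = ["qseqid", "sseqid", "pident", "length", "qstart", "qend", "qlen"]


def effectors_present(blast_tsv: str, min_identity: float = 50.0,
                      min_coverage: float = 50.0) -> set:
    """Return the query (effector) names with at least one hit passing both cut-offs."""
    present = set()
    reader = csv.DictReader(io.StringIO(blast_tsv), fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        identity = float(row["pident"])
        # Query coverage of this hit, as a percentage of the query length.
        aligned = int(row["qend"]) - int(row["qstart"]) + 1
        coverage = 100.0 * aligned / int(row["qlen"])
        if identity >= min_identity and coverage >= min_coverage:
            present.add(row["qseqid"])
    return present


# Hypothetical hits: one passes, one fails on identity, one fails on coverage.
sample = "\n".join([
    "espN\tcontig_1\t98.5\t200\t1\t200\t210",
    "espW\tcontig_2\t45.0\t300\t1\t300\t350",
    "nleL\tcontig_3\t90.0\t100\t1\t100\t780",
])
found = effectors_present(sample)
print(found)
```

A per-hit coverage check is the simplest reading of the criteria; if an effector gene is split across contigs, summing coverage over several hits would be an alternative design.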

To construct a core gene-based phylogenetic tree, pangenomic analysis was carried out using Roary v3.13.3 with a 90% sequence identity threshold (ref. 40). SNP sites were extracted from the core gene alignment using SNP-sites v2.5.1 with the -c option (ref. 41). A maximum likelihood phylogenomic tree was subsequently generated using RAxML-NG ver. 1.0.1 with the following parameters: --all, --bs-trees 100, and --model GTR+G4 (ref. 42). The ML tree was visualized and annotated using iTOL v6.6 (ref. 43). Clade classification of O157:H7 was performed based on the SNP typing method described previously25,26,44.
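
The three command-line steps above can be sketched as command lists assembled in Python. The parameters named in the text (90% identity for Roary, -c for SNP-sites, and --all, --bs-trees 100 and --model GTR+G4 for RAxML-NG) are taken from the Methods; everything else (file names, output directory, thread count, and the use of Roary's -e flag to emit a core gene alignment) is an assumption for illustration. The commands are only built and printed, not executed.

```python
import shlex


def phylogeny_commands(gff_files, out_dir="roary_out", threads=8):
    """Build (but do not run) the Roary, SNP-sites and RAxML-NG commands."""
    core_alignment = f"{out_dir}/core_gene_alignment.aln"  # Roary's core alignment output
    snp_alignment = "core_snps.aln"                        # hypothetical file name

    roary = ["roary", "-e", "-i", "90", "-p", str(threads), "-f", out_dir, *gff_files]
    snp_sites = ["snp-sites", "-c", "-o", snp_alignment, core_alignment]
    raxml_ng = [
        "raxml-ng", "--all",
        "--msa", snp_alignment,
        "--model", "GTR+G4",
        "--bs-trees", "100",
        "--prefix", "stec_core",  # hypothetical output prefix
    ]
    return [roary, snp_sites, raxml_ng]


commands = phylogeny_commands(["strain_A.gff", "strain_B.gff"])
for command in commands:
    print(shlex.join(command))
```

Keeping each command as a list of arguments makes it straightforward to pass to `subprocess.run` later without shell quoting issues.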

Quantitative PCR

Overnight cultures of each STEC strain were diluted in LB medium to an optical density at 600 nm (OD₆₀₀) of 0.1 and incubated at 37 °C with shaking at 200 rpm for 110 min. Mitomycin C was then added to a final concentration of 0.5 µg/mL, and the cultures were further incubated for an additional 3 h under the same conditions. Total RNA was extracted using the RNeasy Protect Bacteria Mini Kit (Qiagen), and complementary DNA (cDNA) was synthesized with the PrimeScript RT Reagent Kit (Takara Bio). qPCR was performed using the THUNDERBIRD SYBR qPCR Mix (TOYOBO) with the synthesized cDNA as a template. The primer sequences used were as follows: stx1a forward 5′-GTGGCATTAATACTGAATTGTCATCA-3′ and reverse 5′-GCGTAATCCCACGGACTCTTC-3′ (ref. 45); stx2a, stx2c, and stx2e forward 5′-TCCATGACAACGGACAGCAG-3′ and reverse 5′-ACGCCAGATATGATGAAACCAG-3′ (ref. 24); stx2f forward 5′-AGAGGAGAGGAAGGGGTAAG-3′ and reverse 5′-TCACGGAACGAACTGAATAAC-3′ (ref. 46); and gapA forward 5′-TATGACTGGTCCGTCTAAAGACAA-3′ and reverse 5′-GGTTTTCTGAGTAGCGGTAGTAGC-3′ (ref. 24). Gene expression levels were normalized to the gapA gene and are presented as relative expression values.
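
The text does not state which formula was used for normalization. One common choice, shown here purely as an assumption, is the 2^(-ΔCt) method, where ΔCt is the threshold cycle of the target gene minus that of the reference gene (gapA) and amplification efficiency is taken to be close to 100%. The Ct values below are hypothetical and are not measurements from this study.

```python
from statistics import mean, stdev


def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Relative expression by the 2^(-dCt) method (assumes ~100% PCR efficiency)."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct pairs (stx2, gapA) from three independent experiments.
replicates = [(18.0, 20.0), (18.5, 20.0), (17.5, 20.0)]
values = [relative_expression(stx2_ct, gapa_ct) for stx2_ct, gapa_ct in replicates]

print(f"mean = {mean(values):.2f}, SD = {stdev(values):.2f}")
```

A lower Ct for stx2 than for gapA gives a value above 1, meaning more stx2 transcript than reference transcript under this model.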

Statistics and reproducibility

To identify serotypes significantly associated with asymptomatic carriers (SAAC) or symptomatic patients (SAPA), the frequencies of each serotype among strains from asymptomatic carriers and patients were compared using Fisher’s exact test (two-sided) in JMP Pro version 17 (SAS Institute, Cary, NC, USA). Serotypes that were significantly more prevalent (p < 0.05) in strains from asymptomatic carriers were classified as SAAC, whereas those significantly more prevalent in strains from patients were classified as SAPA. Serotypes without significant differences were categorized as Not Significant (NS). Serotypes with five or fewer strains in either asymptomatic carriers or patients were not analysed and were classified as minor serotypes (MS).
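
The serotype comparison can be sketched with a small standard-library implementation of the two-sided Fisher's exact test. The analysis in this study was run in JMP Pro; the code below is an independent illustration, not a reproduction of that software. It assumes that minor serotypes (MS) have already been filtered out, that "more prevalent" means a higher proportion within the group, and it uses hypothetical counts.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more likely than the observed one.
    total = sum(p for p in (prob(x) for x in range(lowest, highest + 1))
                if p <= p_observed * (1 + 1e-7))
    return min(1.0, total)


def classify_serotype(carrier_hits: int, carrier_total: int,
                      patient_hits: int, patient_total: int,
                      alpha: float = 0.05) -> str:
    """Label a serotype SAAC, SAPA or NS from its counts in the two groups."""
    p_value = fisher_exact_two_sided(
        carrier_hits, carrier_total - carrier_hits,
        patient_hits, patient_total - patient_hits,
    )
    if p_value >= alpha:
        return "NS"
    carrier_rate = carrier_hits / carrier_total
    patient_rate = patient_hits / patient_total
    return "SAAC" if carrier_rate > patient_rate else "SAPA"


# Hypothetical counts: 8 of 10 carrier strains vs 1 of 6 patient strains.
print(round(fisher_exact_two_sided(8, 2, 1, 5), 5))
print(classify_serotype(8, 10, 1, 6))
```

Because many serotypes are tested, a multiple-testing correction could change which serotypes pass the threshold; the text describes an uncorrected p < 0.05 cut-off, which is what the sketch follows.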

The significance of differences in the presence or absence of each non-LEE effector and other virulence genes between the SAAC and SAPA groups was also assessed using Fisher’s exact test (two-sided) in JMP Pro version 17. In addition, the total numbers of non-LEE effector genes and other virulence genes per strain were compared between SAAC and SAPA strains using the Wilcoxon rank-sum test in JMP Pro version 17. Differences in stx2 expression among strains were analysed using one-way ANOVA, followed by the Tukey–Kramer HSD test for multiple comparisons in JMP Pro version 18.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.