Introduction

Watermelon is an important specialty crop in the United States, primarily grown in the Southeastern region. Production has decreased over the last five decades, resulting in increased imports necessary to meet consumer demand1. Imports peaked in 2019, representing one-third of U.S. consumption. Powdery mildew was rated as the second most prevalent and consistent foliar disease in watermelon2. Powdery mildew outbreaks cause yield losses through decreased fruit weight and can also reduce fruit quality through sun-scalding from reduced leaf canopy3.

Powdery mildew is a fungal disease caused by several species in the genus Podosphaera, affecting a wide range of crops across the globe. Different races of Podosphaera xanthii and Golovinomyces cichoracearum are responsible for causing powdery mildew on cucurbits, especially melons, and races from one cucurbit can readily infect other cucurbit species. Lack of variation in host tolerance and high pathogen variability complicate race classification in watermelon4. Races of P. xanthii have primarily been classified using muskmelon (Cucumis melo) differentials. Twenty-eight races of P. xanthii, along with eight variants of race 1 and six variants of race 2, were identified based on 22 melon cultigens5. Two races of P. xanthii classified based on melon differentials that can also infect watermelon have been reported in the U.S. and designated as races 1 W and 2 W6,7. However, detailed classification of powdery mildew races based on watermelon differentials is not available. Evidence for the presence of two races based on watermelon differentials has been reported in the U.S.8, and the potential for the presence of numerous races based on watermelon differentials has been reported in Europe9. However, further studies are needed to define these races across the U.S. and Europe. Numerous fungicides are available for control of powdery mildew, but they are generally more preventative than curative of an active infection10. The risk of evolution of fungicide tolerance is high, and tolerance has already been reported for some fungicides in South Carolina and other states11,12, making host tolerance an essential component of disease management. Field evaluations of commercial watermelon cultivars found a single edible cultivar with tolerance to locally prevailing strains of powdery mildew, which are predominantly race 1 (1 W) based on melon differentials13, highlighting the need for new cultivars with improved resistance. More recent field evaluations have identified a few seedless (triploid) cultivars with varying levels of tolerance to powdery mildew (race not reported)14. However, because multiple races are present, information on additional genetic loci potentially controlling resistance will be useful.

Disease screenings of the entire USDA Citrullus germplasm collection for response to artificial inoculation with P. xanthii races 1 W6 and 2 W7 have been reported, resulting in the identification of several accessions with strong resistance. Inheritance studies have found a complex genetic basis of resistance (e.g., partial dominance, incomplete dominance, multigenic inheritance and epistasis) that varies between populations and tissue types for both races15,16. Genome-based analyses of powdery mildew resistance in watermelon have been limited. QTL mapping of a biparental watermelon population segregating for resistance to P. xanthii race 1 W identified a single major QTL on chromosome 2, pmr2.1, that explained 80% of the variation in disease response17. The authors of that study also developed CAPS markers associated with resistance. Mandal et al. (2020) used comparative whole genome resequencing analysis of resistant inbred lines and their parents, using a different resistance source, and found a single major QTL for P. xanthii race 1 W resistance, collocated with pmr2.118. A genome-wide association study of P. xanthii race 2 W resistance using historical data for the USDA Citrullus collection identified 43 SNPs across nine chromosomes, including a strong signal of several significant SNPs collocated with pmr2.119. The GWAS employed low-density genotyping-by-sequencing (384-plex), which can result in false negatives in genomic regions of low coverage where there are no markers in linkage disequilibrium with the causal variant20. However, whole-genome resequencing of thousands of accessions is prohibitively expensive. A cost-effective alternative to traditional GWAS is bulked segregant analysis of extreme phenotypic pools from a diversity panel, termed an extreme-phenotype genome-wide association study (XP-GWAS)21. Along with reducing the number of accessions to be genotyped, XP-GWAS can enrich the frequency of rare alleles. XP-GWAS has been effective in identifying QTL in multiple crops, including quality characteristics in coffee22 and apples23, and alkyl cannabinoid in cannabis24. We performed an XP-GWAS of disease response of the USDA Citrullus collection to P. xanthii race 2 W using historical data. Our objectives were to compare the results of traditional GWAS19 to XP-GWAS using the same historical dataset and to design KASP markers for marker-assisted breeding of powdery mildew resistance in watermelon.

Materials and methods

Bulked segregant analysis

Historical data for disease response to artificial inoculation with P. xanthii race 2 W7 for the cultivated species, Citrullus lanatus (N = 1,095), and its sister species, Citrullus mucosospermus (N = 52), were obtained from the USDA Germplasm Resource Information Network at https://www.ars-grin.gov/. Disease severity (DS) ratings for both stem and leaf tissues were used to select 45–46 accessions for each of three bulks. In addition to a bulk from each extreme of the phenotypic distribution, the XP-GWAS method requires a random bulk with individuals chosen at random with respect to phenotype21. The random bulk is used as a control to account for background population allele frequencies. Accessions with a disease severity rating of less than 4 for stems and 5 for leaves did not have visible mycelium growth on these tissues7. We chose individuals for the resistant bulk that had an average disease severity score of less than 5 across tissues (Fig. 1; Supplementary table S1). Accessions with a mean disease severity score of greater than 8.3 across tissues were chosen for the susceptible bulk (N = 46) to obtain approximately the same number of accessions as in the resistant bulk (N = 45) (Fig. 1; Supplementary table S1). A disease severity rating of 8 was given to plants with 50–70% mycelium growth on that tissue (leaf or stem) with large necrotic areas, and a rating of 9 was given to tissues fully covered with mycelium or to a dead plant7.

Fig. 1
Histogram of disease severity on stems and leaves after inoculation of the USDA Citrullus collection with Podosphaera xanthii race 2 W7. Shaded boxes indicate the bulked accessions. The accession mean for disease severity across the collection is indicated by a dashed vertical line. Figure generated in R25.

Accessions for the random bulk were chosen using the sample function in R25. The tolerant and random bulks comprised 22 and 40 C. lanatus accessions, and 23 and 6 C. mucosospermus accessions, respectively (see Supplementary table S1 online). The susceptible bulk was composed entirely of C. lanatus accessions. All the tolerant accessions originated in Africa and most of the susceptible accessions were from Europe or Asia. Seeds for the bulked accessions were obtained from the USDA National Plant Germplasm System.
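The bulk selection described above can be illustrated with a minimal Python sketch. It is not the procedure used in the study (the random bulk was drawn with the sample function in R); the function name select_bulks, the toy accession names, and the seed are hypothetical, and the sketch assumes that the resistant bulk takes accessions with a mean DS below 5 and the susceptible bulk takes those with a mean DS above 8.3.

```python
import random
from statistics import mean


def select_bulks(scores, resistant_max=5.0, susceptible_min=8.3,
                 random_size=46, seed=1):
    """Pick resistant, susceptible and random bulks from disease severity data.

    scores maps a (hypothetical) accession name to its (stem DS, leaf DS) pair.
    Thresholds follow the text; the seed is an arbitrary assumption.
    """
    # Mean disease severity across the two tissues for every accession
    means = {acc: mean(ds) for acc, ds in scores.items()}
    resistant = sorted(a for a, m in means.items() if m < resistant_max)
    susceptible = sorted(a for a, m in means.items() if m > susceptible_min)
    # Random bulk: drawn without regard to phenotype, without replacement
    rng = random.Random(seed)
    size = min(random_size, len(means))
    random_bulk = sorted(rng.sample(sorted(means), size))
    return resistant, susceptible, random_bulk


# Toy data only; these are not accessions or scores from the study
toy_scores = {
    "acc01": (1.0, 2.0),
    "acc02": (9.0, 9.0),
    "acc03": (6.0, 7.0),
    "acc04": (8.0, 9.0),
    "acc05": (4.0, 5.0),
}
resistant, susceptible, random_bulk = select_bulks(toy_scores, random_size=3)
```

Because the random bulk is drawn from the whole panel, it may overlap with either extreme bulk, which is consistent with its role as a control for background allele frequencies.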

Whole-genome resequencing

DNA was extracted from young leaf tissue of a single plant per accession with a ChargeSwitch DNA Kit (Invitrogen, Waltham, MA). DNA was quantified with a Qubit Fluorometer (Invitrogen, Waltham, MA) and 10 ng of genomic DNA per accession was combined to form the three bulks. Bulked DNA was sent to the W.M. Keck Center at the University of Illinois (Urbana, IL) for library preparation and whole genome resequencing. Genomic libraries were prepared with a Hyper Library Construction Kit (Kapa Biosystems, Roche, Basel, Switzerland), quantified with qPCR and sequenced on a single lane of an Illumina Novaseq 6000 (Illumina, San Diego, CA) with a NovaSeq S2 Reagent Kit (Illumina, San Diego, CA). Paired-end reads (150 bp) were generated with 151 cycles from each end of the fragment. Reads were demultiplexed and adaptors trimmed with the bcl2fastq v2.20 Software (Illumina, San Diego, CA). Raw sequence files have been deposited in the NCBI Sequence Read Archive and are available under BioProject PRJNA1197509.

Variant calling and XP-GWAS

Duplicate reads were removed with a custom Perl script obtained from https://github.com/Sunhh/NGS_data_processing/blob/master/drop_dup_both_end.pl. Trimmomatic v0.3826 was used to remove low-quality reads with the following filtering criteria: SLIDINGWINDOW:4:20 LEADING:3 TRAILING:3 HEADCROP:10 MINLEN:40. The remaining reads were aligned to the ‘Charleston Gray’ watermelon reference genome v219 obtained from the CuGenDBv227,28 using BWA v0.5.929. Reads originating from a single fragment were tagged and assigned to read groups using Picard Toolkit v2.18.7 (http://broadinstitute.github.io/picard/). A sequence dictionary and an index for the reference genome were created with Picard and Samtools v0.1.830, respectively. Variant calling followed the GATK Best Practices Workflow31 in GATK v3.632. Vcftools v0.1.1533 was used to retain bi-allelic SNPs with no missing data, a GQ score greater than 30 and a maximum read depth within one standard deviation of the mean read depth. Variant counts of the reference and alternative alleles for each pool were calculated with the CollectAllelicCounts function of GATK and used as input for XP-GWAS.
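The SNP filtering criteria can be made concrete with a short Python sketch. The study applied these filters with Vcftools; the code below only restates the logic on an in-memory list of records. The record layout (keys id, alts, gq, dp) and the function name filter_snps are hypothetical, and the depth rule is read here as a cap of mean depth plus one standard deviation computed across all sites, which is an assumption.

```python
from statistics import mean, stdev


def filter_snps(records, min_gq=30):
    """Keep bi-allelic SNPs with complete data, GQ > min_gq and moderate depth.

    Each record is a dict (hypothetical layout):
      'id'   site name
      'alts' list of alternative alleles
      'gq'   per-bulk genotype quality, None where the call is missing
      'dp'   total read depth at the site
    """
    depths = [r["dp"] for r in records]
    # Assumed interpretation: maximum depth = mean depth + one standard deviation
    max_depth = mean(depths) + stdev(depths)
    kept = []
    for r in records:
        biallelic = len(r["alts"]) == 1
        complete = all(g is not None for g in r["gq"])
        # 'complete' is checked first so min() never sees a missing value
        if biallelic and complete and min(r["gq"]) > min_gq and r["dp"] <= max_depth:
            kept.append(r)
    return kept


# Toy records only; not data from the study
toy_records = [
    {"id": "A", "alts": ["T"], "gq": [40, 50, 60], "dp": 100},
    {"id": "B", "alts": ["T", "G"], "gq": [40, 50, 60], "dp": 100},  # multi-allelic
    {"id": "C", "alts": ["C"], "gq": [40, None, 60], "dp": 100},     # missing call
    {"id": "D", "alts": ["G"], "gq": [25, 50, 60], "dp": 100},       # low GQ
    {"id": "E", "alts": ["A"], "gq": [40, 50, 60], "dp": 400},       # excess depth
]
kept_ids = [r["id"] for r in filter_snps(toy_records)]
```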

The R package, XP-GWAS21, was used to identify SNPs associated with tolerance to P. xanthii race 2 W with the depth filter set to 50. The package computes a likelihood ratio test statistic on the output of a generalized linear model for each SNP. The statistic was divided by the inflation factor λ34 to control for population structure. A false discovery rate (FDR) of 5% was applied for multiple testing correction35.
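The two corrections described above, genomic control and FDR adjustment, can be sketched in Python as follows. This is an illustration of the general techniques, not the internals of the XP-GWAS package. It assumes that the likelihood ratio statistic is referred to a chi-square distribution with one degree of freedom, that λ is estimated as the median observed statistic divided by the median of that distribution, and that the FDR is controlled with the Benjamini-Hochberg step-up procedure; all function names are hypothetical.

```python
import math
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(stats):
    """Return (lambda, corrected statistics) using the median-based estimate."""
    lam = median(stats) / CHI2_1DF_MEDIAN
    return lam, [s / lam for s in stats]


def chi2_sf_1df(x):
    """Upper-tail probability of a 1-df chi-square statistic."""
    return math.erfc(math.sqrt(x / 2.0))


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy statistics only; not values from the study
toy_stats = [CHI2_1DF_MEDIAN, 2 * CHI2_1DF_MEDIAN, 4 * CHI2_1DF_MEDIAN]
lam, corrected = genomic_control(toy_stats)
toy_fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A SNP would then be declared significant when its adjusted value falls below 0.05, matching the 5% FDR threshold stated in the text.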

KASP marker development

Significant SNPs from XP-GWAS were submitted to LGC Genomics (Teddington, Middlesex, UK) for KASP by Design Services. Alleles and flanking regions were used to design an assay mix able to distinguish alleles with allele-specific, fluorophore-labelled primers (See Supplementary table S2 online). PCR cocktails consisted of 15 ng of sample DNA, 2.5 µL of 2× Master Mix (LGC Genomics, Teddington, Middlesex, UK), 0.07 µL of Primer Mix (LGC Genomics, Teddington, Middlesex, UK) and molecular grade water to bring the total reaction volume up to 5 µL. A touchdown PCR protocol was followed with an initial 15 min denaturing step at 94 °C, followed by ten touchdown cycles with a 20 s denaturing step at 94 °C and annealing temperatures starting at 61 °C and dropping by 0.6 °C each cycle. The protocol finished with twenty-six additional cycles of 94 °C for 20 s and 55 °C for 60 s. Allele discrimination was determined through quantification of fluorescence using a Stratagene Mx3005P qPCR Machine (Agilent Technologies, Santa Clara, CA) with MxPro v4.10 Software (Agilent Technologies, Santa Clara, CA). Association of the KASP markers with P. xanthii race 2 W tolerance was verified with an expanded set of accessions (N = 186; See Supplementary table S3 online) from the extremes of the distribution of the previously phenotyped USDA Citrullus collection7. Seeds of the accessions were provided by the USDA National Plant Germplasm System. Association between the markers and disease response was tested using analysis of variance (ANOVA) with the aov function36 in R.
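The annealing temperatures implied by the touchdown protocol can be written out with a small Python sketch. It assumes that the first touchdown cycle anneals at 61 °C and that each subsequent touchdown cycle is 0.6 °C cooler, so the tenth cycle anneals at 55.6 °C before the 26 cycles at 55 °C. The function name touchdown_schedule is hypothetical.

```python
def touchdown_schedule(start_c=61.0, step_c=0.6, touchdown_cycles=10,
                       final_c=55.0, final_cycles=26):
    """Return the annealing temperature (deg C) for every PCR cycle in order."""
    # Touchdown phase: temperature drops by step_c after each cycle
    temps = [round(start_c - step_c * i, 1) for i in range(touchdown_cycles)]
    # Remaining cycles are run at a constant annealing temperature
    temps += [final_c] * final_cycles
    return temps


schedule = touchdown_schedule()
for cycle, temp in enumerate(schedule, start=1):
    print(f"cycle {cycle:2d}: anneal at {temp:.1f} C")
```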

Candidate genes

QTL intervals were defined to extend from each significant association to the base of the QTL peak in each direction. Functional annotation of the SNPs and genes in these regions was used to determine the most promising candidate genes for powdery mildew tolerance. SNPs were prioritized by annotation to include those located in promoter regions, as they may affect transcription of a causal gene, and missense/nonsense mutations, as they would change protein structure. Genes encoding proteins previously implicated in disease resistance were considered the most promising candidate genes for this study. Genome annotation for the ‘Charleston Gray’ v2 genome19 was obtained from the CuGenDB27. ANNOVAR was used for functional annotation of the SNPs37.

Results

Extreme-phenotype genome-wide association study

Sequencing of the genomic libraries generated 92.7 to 99.1 million reads per bulk for a minimum of 33.2x genomic coverage. The variant calling pipeline identified 1,157,440 SNPs between the bulks. Filtering of the raw SNP calls resulted in 301,059 high-quality biallelic SNPs for analysis. Marker density varied across the genome, with a range of 19,548 to 59,943 SNPs per chromosome. Two SNPs were significantly (FDR < 0.05, with genomic control) associated with disease response to artificial inoculation with P. xanthii race 2 W (Fig. 2). The SNPs were adjacent (4,865,001 bp and 4,865,003 bp) on chromosome 7. There were two additional SNPs on chromosomes 2 and 4 that approached significance (FDR = 0.11 and 0.14, respectively) and were therefore included for marker design (Fig. 2).

Fig. 2
Manhattan plot of the XP-GWAS results for disease response in the USDA Citrullus collection after inoculation with Podosphaera xanthii race 2 W7. The shaded box indicates the previously identified QTL associated with resistance to P. xanthii races 1 W17,19 and 2 W19. The horizontal dashed line indicates an FDR significance threshold of 0.05. Figure generated with the qqman package38 in R25.

Marker validation

KASP markers were designed for the four SNPs (S2_31304458, S4_1119489, S7_4865001 and S7_4865003) with the strongest signal from XP-GWAS and two adjacent SNPs up and downstream of each for a total of 16 markers (see Supplementary table S2 online). An expanded set of accessions (N = 186) from the same phenotyped collection was genotyped with the KASP markers to validate their association with powdery mildew tolerance (see Supplementary table S3 online). Two markers were monomorphic and removed from downstream analyses. Thirteen of the fourteen remaining markers were significantly (P < 0.05) associated with tolerance (Fig. 3; see Supplementary table S3 online).
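The marker-trait test can be illustrated with a one-way ANOVA computed by hand in Python. The study used the aov function in R; the sketch below only shows how the F statistic is formed when disease severity scores are grouped by marker genotype. The function name and the toy scores are hypothetical, and the P value (which requires the F distribution) is not computed here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    # Variation among genotype class means
    ss_between = sum(len(g) * (gm - grand_mean) ** 2
                     for g, gm in zip(groups, group_means))
    # Variation among accessions within a genotype class
    ss_within = sum((v - gm) ** 2
                    for g, gm in zip(groups, group_means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy leaf DS scores for the two homozygous classes; not data from the study
genotype_a = [2.0, 3.0, 4.0]  # homozygous for the allele labelled 'A'
genotype_b = [7.0, 8.0, 9.0]  # homozygous for the allele labelled 'B'
f_stat, df_between, df_within = one_way_anova_f([genotype_a, genotype_b])
```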

Fig. 3
Boxplots of genotypic effects of the most significant KASP markers developed for each QTL identified through XP-GWAS of disease response of the USDA Citrullus collection after inoculation with Podosphaera xanthii race 2 W7. Association between the markers and disease response was tested using analysis of variance (ANOVA) with the aov function36 in R25. Homozygous resistant is labelled ‘A’ and homozygous susceptible is ‘B’. The numbers of accessions with each allele at the SNPs were: SNP CS2_31304458 A = 42 and B = 130; SNP CS4_1119489 A = 26 and B = 158; and SNP CS7_4873496 A = 23 and B = 162. Figure generated in R25.

The SNPs with the strongest association on each chromosome were CS2_31304458 (rho = 0.26), CS4_1119489 (rho = 0.21), and CS7_4855905 (rho = 0.31) (Fig. 3). The pattern of marker association with disease severity was the same for both tissue types so allele effect plots are only shown for leaf ratings (Figs. 3 and 4). Two accessions of C. mucosospermus represent the most promising sources of donor material to use with these three markers for introgression of P. xanthii race 2 W resistance into elite C. lanatus lines. Accessions PI560020 and PI560005 had homozygous resistant genotypes at all three markers and disease severity scores indicating less than 20% chlorotic lesions on leaves (DS of 2.1 and 2.4, respectively) and less than three necrotic spots on their stems (DS of 1 and 1.4) (Supplementary table S3).

Fig. 4
Boxplots of haplotype effects of the most significant KASP markers (as determined by ANOVA) developed for each QTL identified through XP-GWAS of disease response of the USDA Citrullus collection after inoculation with Podosphaera xanthii race 2 W7. Marker order in the genotype labels is S2_31304458, S4_1119489, and S7_4873496. Homozygous resistant is labelled ‘A’ and homozygous susceptible is ‘B’. The number of accessions for each of the genotypes is: AAA = 9, AAB = 5, ABB = 27, BAA = 6, BAB = 2, BBA = 4, and BBB = 116. Figure generated in R25.

The additive effect of the tolerance allele for these markers ranged from 1.6 to 1.9 for the stem ratings and 1.4 to 1.6 for the leaf ratings. There was no evidence of enhanced tolerance in accessions with tolerance alleles at multiple loci (Fig. 4).

Functional annotation and candidate genes

The QTL intervals on chromosomes 2 (31,287,589 to 31,388,599 bp), 4 (1,014,968 to 1,123,501 bp), and 7 (4,799,532 to 4,880,782 bp) encompassed 9, 4, and 4 genes, respectively (See Supplementary table S4 online). Functional annotation of the 246 SNPs within the QTL intervals was: 1 exonic, 7 upstream, 5 downstream, 15 intronic, and 218 intergenic (See Supplementary table S5 online). The exonic SNP was a nonsynonymous change from an A to a G at 1,075,967 bp on chromosome 4 causing an amino acid change from phenylalanine to serine in gene ClCG04G000310.

Discussion

Genetic mapping studies of resistance to powdery mildew in watermelon have been limited to three previous studies: a traditional GWAS using historical data for P. xanthii race 2 W19, a bi-parental QTL mapping study for race 1 W17, and a comparative genomic analysis for race 1 W18. Traditional GWAS of the same historical powdery mildew screening data7 utilized here identified 43 significant SNPs across 9 of the 11 chromosomes (not chromosomes 5 or 11) of watermelon19. Two QTL identified here, on chromosomes 2 and 4, collocate with those identified through traditional GWAS. The QTL on chromosome 7 was not significantly associated with powdery mildew resistance in the GWAS. Insufficient marker density may have contributed to this false negative, as the accessions in the GWAS were genotyped with low-coverage GBS (384-plex)19, while in the present study the bulks were genotyped with whole-genome resequencing. The flanking SNPs from the GBS dataset were 48 kb and 264 kb away from the significant SNPs (S7_4865001 and S7_4865003) identified in the present study. The additional GWAS associations that were not significant with XP-GWAS may be due to the reduced number of accessions, from 1,147 to 91. Fewer accessions can decrease the power to detect associations, particularly for rare alleles; alternatively, the resistance alleles may not have been present in the bulked accessions21.

QTL mapping of P. xanthii race 1 W resistance in a bi-parental mapping population17 and comparative genomic analysis of a different resistance source for race 1 W18 identified a single major QTL on chromosome 2. The XP-GWAS signal on chromosome 2 for P. xanthii race 2 W resistance collocated with this QTL, indicating that this region may provide broad-spectrum tolerance to P. xanthii in watermelon. The source of resistance derived from PI 494531 used in studies by Mandal et al. (2020)18 was also shown to be resistant to 11 isolates of powdery mildew collected from across the United States8.

There were no obvious candidate genes based upon SNP annotation or predicted gene function in the QTL interval on chromosome 2. The most significant SNPs were in an intergenic region between ClCG02G016820 and ClCG02G016830, which are predicted to encode a flowering locus K homology domain and an unknown protein, respectively. Overexpression of the receptor-like kinase (RLK) gene RLK-V in a susceptible wheat line improved resistance to powdery mildew (caused by Blumeria graminis f. sp. tritici) and silencing of this gene in a resistant line reduced the response to infection39. An RLK was the closest R gene (41 kb) to the resistance QTL on chromosome 2; however, none of the SNPs within this gene were significantly associated with resistance in the XP-GWAS. There were two promising candidate genes for the QTL on chromosome 4 based upon gene functional annotation: one that encodes a P-loop containing nucleoside triphosphate hydrolase superfamily protein (ClCG04G000310) and one that encodes a polygalacturonase inhibitor-like protein (PGIP; ClCG04G000300). Mutational analysis of several plant R-genes found that a functional P-loop domain is necessary for plant defense signaling40. The non-synonymous mutation in ClCG04G000310 may reduce the ability to initiate the plant defense signaling cascade, causing higher disease severity. PGIPs are cell wall proteins that recognize the cell-wall degrading enzymes, polygalacturonases, critical for fungal invasion into host cells41. Exogenous application of methyl jasmonate in a grape vineyard caused a dramatic increase in the transcription of pathogenesis-related proteins, including a PGIP, leading to a 73% reduction in powdery mildew-infected leaf surface area in treated plants42. Four SNPs were located in the promoter region of ClCG04G000300, which could decrease expression of this gene, allowing more effective pathogen invasion.

One promising candidate gene was found in the QTL interval on chromosome 7. Peptidyl-prolyl cis-trans isomerases (PPIs) function in protein folding, allowing them to regulate protein structure, activity and stability43. Some PPIs activate a structural change in pathogen effector proteins, triggering the plant immune response44. One SNP was located in the promoter region of ClCG07G004160, which encodes a PPI. Decreased expression of this gene could cause a decreased immune response in powdery mildew susceptible accessions. These candidate genes represent the most promising targets for future gene editing studies to test their effects on powdery mildew disease response.

Public germplasm repositories have a wealth of historical phenotypic data associated with freely available accessions, much of which was collected prior to the genomics era. Whole-genome resequencing of bulked extremes of a diversity panel (XP-GWAS) offers a fast, effective, and relatively inexpensive mapping method. The ability to use historical data with this method will allow researchers with limited space and resources to pursue a variety of projects previously unattainable. We used XP-GWAS to successfully develop markers associated with tolerance to powdery mildew (P. xanthii race 2 W) using publicly available historical data for the USDA Citrullus collection. The KASP markers released here may be used, after validation, to incorporate powdery mildew tolerance into elite watermelon cultivars through marker-assisted selection.