Introduction

Pediatric sepsis is a life-threatening condition associated with organ failure in children due to a dysregulated host immune response to infection. It is a recognized global public health problem that affects 20.3 million children and causes 2.9 million deaths in those under five years old every year.1 Despite global efforts to improve clinical outcomes for pediatric sepsis, its phenotypic heterogeneity remains a significant barrier to therapeutic advancement.2 Several recent studies have advanced efforts to characterize pediatric sepsis phenotypes using data-driven approaches, including genome-wide expression profiling,3,4 dynamic modeling of organ dysfunction trajectories,5 and supervised classification of inflammation-based subtypes.6 While these contributions are important, key limitations remain. Studies, such as Wong et al.3 and Sweeney et al.4 primarily rely on transcriptomic data, which, although informative at the molecular level, may lack direct applicability at the bedside. While Sanchez-Pinto et al.5 utilized subscores of the pediatric Sequential Organ Failure Assessment (pSOFA) score to define multiple organ dysfunction syndrome (MODS) phenotypes, these phenotypes may not fully reflect the broader complexity and heterogeneity of pediatric sepsis, particularly features not directly manifested as organ failure. Although Carcillo et al.6 included a broader range of clinical features, their supervised approach introduces the potential for bias due to dependence on predefined categories or outcomes. In contrast, our approach aims to address these limitations by leveraging a broader and more granular set of clinical data in an unsupervised framework that emphasizes both statistical robustness and bedside applicability. Recently, by applying machine learning approaches to 25 first-day bedside clinical variables of 404 pediatric sepsis patients with organ dysfunction enrolled as part of a multicenter cohort, PHENOtyping sepsis-induced Multiple organ failure Study (PHENOMS), between 2015 to 2017, we derived four computable phenotypes PedSep-A, B, C, and D with differences in infection source, cytokine profiles, organ failure, outcomes, and treatment responses.7

Several studies suggested that host genetic factors contribute to the heterogeneity of pediatric sepsis. While early family studies8 and targeted candidate gene analyses9,10,11 have supported this notion, discovery efforts have been limited to discover novel functional sepsis-related genes. Alternatively, researchers have conducted several genome-wide association studies (GWAS) on adult and pediatric populations to identify common variants underlying sepsis susceptibility and outcomes.12,13,14 However, although common variants have been used to understand the genetic basis of clinical outcomes (e.g., survival from sepsis or hospital admissions), they are limited in elucidating the genetic architecture for computable pediatric sepsis phenotypes with poor outcomes. Since common variants usually have small effects on complex traits, their systematic study requires a prohibitively large sample size and their small effects are of limited clinical benefit in a large fraction of the population, which is not feasible to study in pediatric sepsis.15,16 Additionally, to the best of our knowledge, studies investigating rare variants (low penetrance, low allele frequency (AF)) in pediatric critical illness are limited,17 with no prior rare variant analyses specifically focusing on sepsis subtypes. This gap underscores the need for more targeted genomic investigations to identify genetic contributors to pediatric sepsis severity and heterogeneity. These findings underscore the need for more comprehensive and functionally focused genomic studies, especially in understudied pediatric populations. Without involving further functional validation, the large fraction of findings in non-coding regions challenges the interpretation of the sepsis GWAS results.

To address these limitations, we performed a gene-based exome-wide rare variant analysis using data from the PHENOMS study of severe pediatric sepsis. Using whole-exome sequencing data from 319 children (Fig. 1) with sepsis and organ dysfunction in the PHENOMS study, we focused our analysis on the highest-risk phenotype, PedSep-D, with the aim of identifying biologically impactful variants and informing future therapeutic strategies. Altogether, we present the first rare variant burden test in the computable phenotype with the worst outcomes in pediatric sepsis.

Fig. 1
figure 1

CONSORT (Consolidated Standards of Reporting Trials) diagram for the study.

Methods and Materials

Consent statement

The study was approved by the Institutional Review Board at University of Utah Central IRB # 70976.

Cohort and phenotyping

The phenotyping data and blood samples were obtained from pediatric sepsis patients of a multicenter cohort, PHENOMS.6 The cohort enrolled pediatric patients from 2015 to 2017 with written informed consent from at least one of the guardians. Children were qualified for enrollment if they met all four criteria: (1) at the ages of 44 weeks to 18 years old; (2) were suspected of infection meeting two or more SIRS (systemic inflammatory response) criteria;18 (3) presented one or more organ failures; and (4) had an indwelling arterial or central venous catheter. Patients without a commitment to aggressive care or lack of blood samples were further excluded from the enrollment.

The data-driven phenotyping approach and results were described in previous work.7 Briefly, four phenotypes named PedSep-A, B, C, and D were identified by applying consensus k-means clustering on 25 day-one bedside variables. As WES data was available for only a subset of the complete cohort, we confirmed the phenotyping analysis using this subset of patients with available genetic data to ensure that phenotype assignment was robust in this reduced sample size. In the following genetic analysis, we performed a case-control study between children in the computable phenotype subgroup with the highest severity of illness and mortality (PedSep-D) and the others (PedSepA,B, and C) to identify the genetic factors exclusively associated with increased sepsis severity susceptibility.

DNA extraction and genotyping

Out of 404 pediatric patients enrolled in the cohort, a total of 381 parents of the children provided WES consent, and 2 mL of whole blood was collected for DNA extraction using standard methods. Whole-exome sequencing was successfully completed on 332 patients from 2018 to 2020 by the University of Pittsburgh Genomics Research Core performed on the Ion Torrent platform. Libraries were constructed by the Ampliseq Exome RDY (Thermo Fisher Scientific) with 100 × target coverage. FASTQ files were aligned to Homo sapiens reference sequence GRCh37/hg19 to generate VCF files. Variant calling was performed by GATK (Genome Analysis Toolkit).19

Quality control

Two levels of quality control were conducted on 332 samples with completed whole-exome sequencing data, patient-level, and variant-level. At the patient level, we excluded nine individuals without phenotype information. Four pairs of individuals were identified as relatives based on IBD (identity by descent). In each IBD pair, the individual with the higher missingness was removed from the analysis. In terms of variant-level quality control, we filtered sites with SOR (Strand Odds Ratio) > 3, MQ (root mean square Mapping Quality) < 40, QD (variant confidence normalized by depth) < 2.0, average GQ (Genotyping confidence) < 20, average DP (Depth) < 10, missingness > 0.05, HWE (Hardy-Weinberg equilibrium p) < 1e-06, and those located in sex chromosomes. No imputation of missing genotypes was performed due to concerns for potentially low imputation quality of rare variants in datasets with small sample sizes. Quality control was performed by software bcftools (v1.9),20 VCFtools (v0.1.16),21 and PLINK (v1.9).22 Then variant function was annotated by ANNOVAR.23

Principal Component (PC) derivation

To account for potential population stratification and other confounders in the statistical model, we derived 10 principal components (PCs) based on common SNPs following linkage disequilibrium (LD) pruning. LD pruning was performed using PLINK (version 1.07) with the argument “–indep-pairwise 50 5 0.2”. This procedure involves considering a sliding window of 50 SNPs, calculating LD between each pair of SNPs within this window, and removing one SNP from any pair exhibiting an LD greater than 0.2. After pruning within a window, the window is shifted forward by 5 SNPs, and the pruning process is repeated until the entire dataset is processed.

Gene-based analysis

Variants that passed quality control were included if they were in hg19 annotated exon regions and had a MAF (minor AF) lower than 1%. Genes with less than three qualified variants were excluded from the analysis. The final number of genes tested was 3846. Therefore, the p-value threshold for declaring whole-exome level significance was 0.05/3846 = 1.3e-05.

Then, we aggregately examined the relationships between the rare variants and the binary indicator of phenotype membership by gene-based association test SKAT (Sequence Kernel Association Test).24 SKAT is a widely-employed method to test the association between a group of variants and the trait, which increases the power to detect rare variant associations by pooling rare variants across a given region of interest, such as chromosome region or gene. In running the SKAT test, a single null model was fitted containing only the covariates to be adjusted (i.e., age, sex, and the first four ancestry PCs constructed from common LD-pruned SNPs). Then the effect of SNPs from each gene was tested by variance-component score tests in a mixed model, and their statistics were aggregated with weights through a kernel matrix to form a gene-level statistic. Compared to other gene-based tests, such as the Burden Test and SKAT-O, one advantage of applying SKAT in our analysis is that it makes few assumptions about rare-variant effects and retains statistical power when variants within a gene have different directions and magnitude of effects.25 This property aligns with the study design that contrasts one phenotype with others and allows us to better account for potential heterogeneity in phenotypes.

Genes showing whole-exome level significance and suggestive significance were further investigated to query the gene function (GeneCards),26 common variant evidence from previous GWAS analysis (GWAS Catalog),27 gene enrichment in GO biological process (FUMA, Enrichr),28,29 and gene expression level in the GTEx database.30 Rare variants that contributed to gene significance were annotated with four different types of score (CADD,31 GERP,32 SIFT,33 Polyphen234) to indicate the effect of each variant.

Comparison of cytokine profiles between rare variant carriers and non-carriers

To further investigate the effect of variations on inflammation, levels of the 33 pre-collected biomarkers of the rare variant carriers were further visualized and compared with non-carriers. The cytokine heatmap was used to present the log ratio of the median biomarker values of the host response. The red color represents a greater value for the group compared to the entire cohort, while the blue color represents a lower value for the group compared to the entire cohort. Hierarchical clustering was used to visualize the similarity of cytokine patterns between rare variant carriers. Additionally, we calculated p-values from a pairwise t-test comparing cytokine values of rare variant carriers and non-carriers.

Mediation Analysis

To further explore potential biological mechanisms, we conducted a causal mediation analysis to evaluate whether biomarkers mediate the relationship between rare variant gene burden and PedSep-D membership. Mediation analysis requires three conditions to be met: 1) A significant association between gene burden and PedSep-D membership (which was established by SKAT results); 2) A significant association between gene burden and the candidate biomarker; and 3) A significant association between the biomarker and PedSep-D membership.

When all three conditions were satisfied, we used the mediation package in R,35 applying 1000 bootstrap iterations, to estimate the indirect effect (via the biomarker), direct effect, total effect, and proportion mediated for each gene–biomarker pair.

Sensitivity Analysis

In order to further validate the top genes identified from the gene-based analysis, we investigated the influence of the ancestry information on the genes with a sensitivity analysis that was performed as follows. Briefly, we randomly swap phenotype labels between pairs of individuals with the same reported ancestry information to keep the ancestry makeup of the groups the same and run 100,000 iterations of SKAT gene-based analysis to calculate how many times the test statistic value is greater than the test statistic value from observed data. Thus, we generated an empirical p-value for each gene.

Pathway-based analysis

In addition to identifying association signals between individual genes and the sepsis phenotype of interest, we also sought to identify associations between genetic variation at the pathway level, using Gene set analysis Association Using Sparse Signals (GAUSS).36 GAUSS is constructed with gene-based test results and calculates a gene set level p-value by identifying a subset of genes (i.e., core genes) to maximize the association signal. Using the GAUSS method, we aggregated the SKAT test statistics of individual genes into groups based on GO biological process (GOBP) pathway annotations and examined the associations between 7482 GOBP pathways and phenotype of interest. This was performed for the PedSep-D phenotype (Table 1). We also detected the active genes driving the pathway-trait associations to facilitate the interpretation of test results.

Table 1 GAUSS pathway-based association test results.

Results

PedSep-D phenotype has the highest mortality with unique clinical presentation and immune system profile

Using the 319 pediatric patients from the parent cohort who passed quality control both in bedside features and whole exome sequencing (WES) data (Fig. 1), we assigned them to one of four established phenotypes (PedSep-A, B, C, D) as previously determined by the consensus k-means clustering of 25 first-day bedside features.7 Among these 319 patients, the sample sizes of PedSep-A, B, C, D are 116 (36%), 86 (27%), 77 (24%), and 40 (13%), respectively (Table 2, Supplemental Tables 13). The proportions of patients in each phenotype are close to our original study, in which PedSep-A, B, C, D contained 34, 25, 27, and 14 percent of the 404 patients, respectively. To estimate the homogeneity of each phenotype, we projected them on the t-SNE plot of all 25 features and observed good separation between each phenotype and the other phenotypes (average Euclidean distance=6.4, Fig. 2). Specifically, the PedSep-D phenotype was distinct from PedSep-A, B, and C patients (Euclidean distance = 8.9).

Fig. 2: t-SNE plot of four phenotypes (N = 319).
figure 2

Samples in two-dimensional space were labeled by PedSep phenotype membership. Euclidean distance was measured between centroids of two groups to quantitatively compare the difference between one phenotype versus the remaining phenotypes.

Table 2 Demographic and day one clinical characteristics of PedSep-D and Non-PedSep-D patients.

Our data, which is sampled from the original PHENOMS cohort, confirmed distinct clinical (Table 2; Supplemental Tables 13) and biomarker (Supplemental Tables 47) profiles across phenotypes. PedSep-A showed the mildest presentation (Supplemental Tables 1 and 4) and PedSep-D had the most severe profile with multi-organ failure and highest mortality (Tables 24), where PedSep-B and -C are in bet-ween (Supplemental Tables 2 and 6). These patterns recapitulate the importance of analyzing PedSep-D as the highest-risk group (Supplemental Table 8).

Table 3 SKAT gene-based association test result for PedSep-D.
Table 4 Characteristics of rare variant carriers and non-carriers.

Gene-based test associates LTBP4, PLA2G4E, and CCDC157 with PedSep-D

To detect genetic factors associated with the sepsis phenotypes, we performed a whole exome-wide rare variant analysis. To increase power in detecting associations, we aggregated the rare variant association signals by gene and estimated the significance in the following steps (see Methods). First, we performed quality control and selected a total of 3864 genes that had more than three variants for the association between rare variants and the phenotype of interest. Then, we ran SKAT on the WES data separately for each of the four PedSep phenotypes (PedSep-A, B, C, D) versus any of the other three phenotypes while adjusting for age, sex, and ancestry based upon the first four PCs constructed based on common variants. (Fig. 3, Supplemental Figs. S13, Tables 2, 3, Supplemental Table 9).

Fig. 3: Manhattan plot for PedSep-D (No. of genes = 3846).
figure 3

Red line: whole-exome wide significant –log10(P) value; Blue line: suggested significant –log10(P) value.

While our primary analyses centered on PedSep-D, we included results from the other phenotype comparisons to provide broader context and to highlight the distinctiveness of PedSep-D-specific genetic associations. For PedSep-A or C versus the remaining phenotypes, no significantly associated genes were detected. For PedSep-B, the PLXNA2 gene presented a suggestive association (Supplemental Fig. S2). However, the QQ plot (S. Fig. 4B) showed genomic inflation, suggesting that the potential association of PLXNA2 has a high risk of being false positive. In contrast, for PedSep-D, variation in LTBP4 was significantly associated with phenotype development at the exome-wide level (p-value = 1.069E-05, 6 [15%] carriers in case group, 2 [0.7%] carriers in control group), while variations in PLA2G4E (4 [10%] carriers in case group, 2 [0.7%] carriers in control group) and CCDC157 (7 [17.5%] carriers in case group, 10 [3.6%] carriers in control group) were suggestively associated with phenotype development. Four, 4, and 8 rare variants contributed to the significance of LTBP4, PLA2G4E, and CCDC157, respectively (Table 5). All variants encode missense variants except one in the LTBP4 gene. However, this silent variant, rs370696272, replaces a common leucine code (CTG, 0.361) with a less common codon (TTG, 0.134) based on the CoCoPUT database,37 explaining its high CADD score (17.55). Most variants in three genes were predicted to be deleterious based on their CADD score (12 out of 16 with CADD > 10), among which SNP rs573310430 in LTBP4 had the highest CADD score of 34, ranked over the top 0.1% in terms of deleteriousness among variants across the whole genome. This variant creates an unpaired cysteine in the 14th calcium-binding epidermal growth factor-like (cbEGF) domain of LTBP4, a domain stabilized by three pairs of cysteines forming intradomain disulfide bonds particularly sensitive to removal or addition of cysteine residues.38

Table 5 Single variant association and functional prediction for variants contributing to the gene-level significance.

We observed well-calibrated test statistics and little evidence of inflation (Supplemental Fig. S4, lambda = 0.98) for PedSep-D, suggesting that these associations are true signals. To explore the AF of variants contributing to significant and suggestive genes, we compared the AF of all variants across three populations (Black, White, and Asian) in the gnomAD database (Supplemental Table 9). No large difference was observed between the AFs across populations, indicating that the top signals are less likely related to ancestry distinctions. To further investigate if there is an ancestry difference driving the top signals, we conducted a sensitivity analysis by randomly swapping the phenotype labels between pairs of individuals with similar ancestry information to keep the ancestry makeup of the groups the same while generating a meaningful empirical p-value. With 100,000 iterations of permutation for each of the three genes, we observed 0 times that permutated statistics were larger than the previously estimated statistic. This indicates the significance of the three genes is not likely driven by ancestry differences.

LTBP4, PLA2G4E, and CCDC157 underlie distinct cytokine patterns in patients

To explore the genes’ association with inflammation status, we grouped patients based on whether they carried rare variants in one of the three genes of interest, generated a heatmap showing the normalized levels of 33 cytokines (Fig. 4), and statistically tested the group-wise differences (Supplemental Tables 10, 11). Comparison between the rare variants carriers and non-carriers indicated some similarity shared by carriers groups. For example, a higher level of IL-6 is significantly related to both LTBP4 and CCDC157 rare variant carriers (p-value = 0.032 and 0.043, respectively), and higher level of M-CSF is significantly related to both PLA2G4E and CCDC157 rare variant carriers (p-value = 0.013 and 0.011 separately). Simultaneously, several cytokines important in regulating inflammation showed distinct patterns when comparing carriers to non-carriers with rare variants in LTBP4, PLA2G4E, and CCDC157 (Supplemental Table 13). For instance, IL-4 is significantly higher in LTBP4 rare variant carriers (p-value = 0.025) but is significantly lower in CCDC157 rare variant carriers (p-value = 0.035). Compared to non-carriers, PLA2G4E rare variant carriers presented significantly higher levels of IL-16, and SCF, but significantly lower levels of CRPH (p-value = 0.034, 0.005, and 0.022, separately), while LTBP4 and CCDC157 rare variant carrier groups showed no significant difference with non-carriers for these biomarkers. The ferritin level is uniquely higher in LTBP4 rare variant carriers (p-value = 0.023). These results imply that the three genes might be involved in different pathological mechanisms driving the phenotype.

Fig. 4: Biomarker heatmap of three genes’ carriers and non-carriers.
figure 4

The log ratio of the median values of 33 inflammatory biomarkers by rare variants carriers and non-carriers. Red represents a greater median biomarker value for that group compared with the median for the entire study cohort, whereas blue represents a lower median biomarker value compared with the median for the entire study cohort.

Mediation Analysis was further performed to evaluate whether biomarkers mediate the relationship between rare variant gene burden and PedSep-D membership. The results (Supplemental Table 12) highlight three gene–biomarker pairs that showed significant mediation effects. For example, the effect of LTBP4 burden on PedSep-D membership was partially mediated by ADAMTS13 levels, with 16% of the total effect explained by this biomarker. Similarly, IL-16 and IL-8 were significant mediators for PLA2G4E and CCDC157, respectively, with mediation proportions ranging from 16 to 41%. These findings suggest that biomarkers may serve as functional intermediates linking rare variant burden to sepsis subtype PedSep-D and highlight potential targets for mechanistic validation.

In addition, we accessed tissue-specific gene expression data from the GTEx database. Given that GTEx compiles transcriptomic profiles across a wide range of human tissues, we observed that all three genes are expressed in multiple tissue types (Fig. 5a). LTBP4 is highly expressed in multiple tissues, including the lung, kidney, stomach, skin, and others. PLA2G4E is specifically expressed in the skin. CCDC157 is specifically expressed in testis. In terms of cell-type specific expression of three genes, all of them displayed expression in immune cells of the cardiovascular and pulmonary systems, both of which are highly affected by severe sepsis or septic shock (Fig. 5b).

Fig. 5: Expression of three genes in different tissues and cell types.
figure 5

a Heat map of tissue-specific log2 transformed average expression level for three genes based on GTEx v8 RNA data. Red color indicates a higher expression within a tissue compared to other tissues, whereas blue color indicates a lower expression within a tissue compared to other tissues. b Dot plot of cell-type-specific expression level for three genes in two sepsis-related tissues (GTEx Single Cell data). The dot color reports the mean expression value with each cell type. The dot size reports the fraction of cells in which a gene is detected. Black dot size reports the total detected level of all three genes.

Discussion

In this study, rare variants in LTBP4 were significantly associated with the development of the previously reported high-mortality PedSep-D phenotype, with additional suggestions of associations with rare variations in PLA2G4E and CCDC157. To our knowledge, this is the first time a rare variant burden test has been applied to pediatric sepsis with deep phenotyping.

The top signal found in our study, LTBP4, a member of the latent transforming growth factor β binding protein family, shares structural homology with fibrillin and is moderately expressed in plasma cells and immune cells.39 Mutations in LTBP4 have been associated with autosomal recessive cutis laxa type 1C,40,41,42 Duchenne Muscular Dystrophy (DMD),43 fibrosis-related disorders,44 cancer,45 pulmonary disorders, and cardiovascular disorders.46 PedSep-D patients had the most severe kidney involvement, whereas LTBP4 was found to protect against tubular interstitial fibrosis by strengthening angiogenesis, downregulating inflammatory gene expression, and facilitating the maintenance of mitochondrial structure in tubular epithelial cells.47 Common variants in LTBP4 have previously been reported in GWASs to be associated with several traits, including lung function (FEV1/FVC),48 peak expiratory flow,49 hematocrit and hemoglobin,50 eosinophil counts,51 carotid intima-media thickness,52 and diastolic blood pressure.53 The precise molecular mechanism by which deleterious LTBP4 alleles may contribute to the PedSep-D phenotype awaits future functional studies. It remains to be determined whether known activities of LTBP4, elastic fiber organization and regulation transforming growth factor β (TGF-β) signaling54 or as yet undiscovered functions play a role in sepsis pathogenesis. TGF-β remains an attractive candidate given its potent anti-inflammatory effects.55 However, our finding that ADAMTS13 levels mediate part of the LTBP4 effect provide evidence for previously unknown molecular interactions between these proteins in sepsis. Because another ADAMTS protease family member, ADAMTS7 is known to cleave LTBP4,56 LTBP4 may serve either as a substrate or a competitive inhibitor of ADAMTS13. In the field of sepsis and trauma, Bergmann et al. have postulated the role of TGF-β and connected it, as well as other immunosuppressive cytokines, with the high mortality rate of patients discharged from ICU.55

As one of the two genes with suggestive significance, the PLA2G4E gene encodes a member of the cytosolic phospholipase A2 group IV family involved in membrane tubule-mediated transport regulation. It plays an important role in trafficking through the clathrin-independent endocytic pathway.57 PLA2G4E was also up-regulated in Alzheimer’s disease APP-PS1 transgenic mice lacking CD8 T cells compared to the control group.58 Common variants in PLA2G4E have been reported in previous GWASs to be associated with several sepsis clinical prognostic factors, such as neutrophil count,59,60 white blood cell count,61 and mean platelet volume.51

As another gene displaying suggestive significance, CCDC157 encodes a protein coiled-coil domain containing 157. Common variants in CCDC157 have been reported in previous GWASs to be associated with sepsis risk factors, such as hematocrit,50 pulse pressure,62 and calcium levels.59 No clear function of immune dysregulation has been reported for CCDC157 to date.

Although none of the pathways are significantly associated with PedSep-D phenotype, top-ranked pathways involve important biological processes related to sepsis development. For example, the high rank of N-acylphosphatidylethanolamine metabolic process supported previous investigations revealing evaluated fatty acids as candidate biomarkers of sepsis.63 The high rank of growth hormone (GH) secretion was supported by previous studies suggesting GH level is higher in septic shock patients compared to sepsis patients and is also higher in sepsis non-survivors.64 Other top pathways involve endocytic recycling that plays a role in infection-host interaction65 and regulation of cilium beat frequency which is related to respiratory disease and airway infection.66

There are several limitations in this study. First, the tested sample size is small, limiting the statistical power to detect associations. As such, larger independent cohorts are needed for validation and meta-analysis. Second, the signals from rare variants may be caused by local ancestry differences, in which situation the number of alleles derived from distinct ancestral populations at a given locus is different. Therefore, although we account for global ancestry by adjusting for top PCs in the association test and conducting sensitivity analysis, it is crucial to further perform local ancestry inference and examine the results in diverse populations separately to validate our findings. Third, we acknowledge the potential impact of residual LD on our association findings. Although we implemented LD pruning, it is challenging to fully resolve LD structure, particularly in regions with complex haplotypes or in diverse populations where reference panels may be limited. As such, some observed associations may reflect correlated signals rather than direct causal effects. Fourth, while our current analysis focused on individual variant and gene-level associations, future studies leveraging polygenic risk scores or burden scores that integrate both common and rare variants may offer a more continuous and potentially powerful framework for assessing genetic contributions to disease severity and subtype classification in sepsis. For example, Rautanen et al. identified genetic variants associated with 28-day survival in sepsis, providing a foundation for polygenic modeling of sepsis outcomes.12 The application of PRS in this context is an emerging area and will benefit from larger datasets and phenotype-specific GWAS to enable accurate score derivation. As a secondary analysis, we examined GTEx-based tissue-specific expression profiles to provide biological context for the potential regulatory function of the identified variants, particularly in tissues implicated in sepsis pathophysiology. However, we recognize that GTEx may not fully reflect gene expression dynamics in children with sepsis. To address this limitation and strengthen the biological relevance of our findings, future studies incorporating transcriptomic or proteomic data from pediatric sepsis patients and age-matched healthy controls will be essential for validation. Finally, given the nonsignificant findings from the GAUSS method, interpretation of the pathway analysis results should be approached with caution, and the results require further validation in a larger cohort. In summary, our study identified rare variants that, if found to have functional effects in future studies, might play a role in pediatric sepsis outcomes, providing evidence for a genetic contribution to disease heterogeneity.