Introduction

Mass spectrometry (MS) is an analytical tool in modern research that has advanced greatly since it was first invented by J.J. Thomson more than a century ago1,2. The technology itself has improved from the first-generation mass spectrometers that were used to separate the isotopes of uranium, through to liquid chromatography coupled tandem mass spectrometry (LC-MS/MS) now being used in day-to-day research activities3,4,5. Moreover, its application has evolved from fundamental chemistry to routine application in all aspects of the life sciences6,7,8,9,10. MS-based proteomics offers several modalities, including accurate quantitation of specific protein levels, the detection of post-translational modifications, and the detection of differentially expressed proteins. MS-based metabolomics and lipidomics have gained significant attention in the fields of newborn bloodspot screening (NBS) and biomarker discovery. Of note, we have seen the uptake of MS in rare disease research at unprecedented levels over the past decade.

Contrary to common perceptions, rare diseases are collectively not rare. Genetic conditions are a leading cause of death in children worldwide, with ~400 million individuals affected by rare diseases globally11. Most of these conditions have their onset in childhood, and one-third of those affected children will not survive until 5 years of age (https://globalgenes.org). The application of next-generation sequencing (NGS), whether exome or genome sequencing, has transformed the approach to genetic diagnosis for individuals with rare, presumed monogenic diseases. However, one of the major limitations is the difficulty in assessing the pathogenicity of the selected candidate variants. The situation is compounded by the growing integration of whole genome sequencing (WGS) into standard clinical practice, which has enabled the identification of additional variants detectable exclusively through WGS12. In many cases, these variants require further functional characterisation. MS, like many other laboratory-based methods (e.g. RNA sequencing or RNAseq), has received growing attention in the last two decades. This is due to its untargeted and high-throughput nature and its ability to accurately identify the differences in the proteome of a patient compared to unaffected individuals.

MS has been used extensively in the rare disease research setting, including for biomarker discovery and protein profiling during functional genomics studies. More recently, it has started taking centre stage in providing additional evidence for the validation of variant pathogenicity in rare disease diagnosis. A recent study demonstrated an increased diagnostic yield using a multi-omics approach focused on mitochondrial diseases13, while another showed how proteomics can replace traditional clinical pathology testing for the detection of mitochondrial disorders14. However, a systematic validation of the utility of MS in the broader rare disease context is yet to be undertaken. This review of MS-based omics techniques focuses primarily on proteomics but also discusses some metabolomics and lipidomics studies. About three quarters of the reviewed articles were focused on proteomics, for four reasons: (1) proteins are the direct effectors of biological processes that are crucial for understanding disease mechanisms, and provide the most straightforward actionable insights; (2) proteins are directly related to the structure and function of cells and tissues, enabling the understanding of the complex molecular pathways disrupted in diseases; (3) the ability to detect low-abundance proteins allows the capture of a greater breadth of diagnostic information compared to metabolomics/lipidomics, which usually identify a narrower range of analytes; (4) proteomics is a more well-established approach than metabolomics and lipidomics in terms of the conceptual and technical aspects of its delivery. We have also excluded epitope-based assays coupled with NGS, such as the proximity extension assay (PEA) known as Olink®15,16. This is because it is a targeted rather than untargeted approach and a relatively new technology with limited published use cases in the context of diagnostic genomics at the time of writing17,18.
While Olink technology is powerful for profiling large numbers of samples with a dynamic range exceeding that of MS-based proteomics19, its targeted nature relies on the presence of epitopes for each protein. This may impact its suitability for the identification of proteins with missense or protein-truncating variants. Moreover, these assays use a panel of bead-antibody complexes rather than MS, restricting their use to specific sample types and the identification of known and usually well-characterised proteins.

Here, we have surveyed the published literature where MS was used in rare disease functional analyses to examine practical applications of the technology across common variant types and different modes of inheritance. Publications have been broadly categorised according to the reasons for using MS. Combinations of search terms, including 'rare disease', 'rare condition', 'diagnosis', 'mutation', 'proteomics', and 'mass spectrometry', were used to search PubMed and MEDLINE databases. A three-step workflow was adopted in this review (Fig. 1). A total of 267 papers were identified using these predefined search terms. These papers were then reviewed for their eligibility and relevance to the scope of this article. The final selection process led to the inclusion of relevant studies but excluded general reviews and monogenic germline cancer studies to produce this review paper.

Fig. 1: Flow chart of the review process.

We searched two databases, PubMed and MEDLINE, to generate the list of articles for review. A combination of different search terms, including 'rare disease', 'rare condition', 'diagnosis', 'mutation', 'mass spectrometry' and 'proteomics', was used. A total of 267 publications from both databases were selected for further analysis. Thirteen publications were duplicates and were therefore removed from the list. This left us with 254 publications for the next filtering step. We first screened the titles of all 254 publications and excluded 60 that were either general review articles or publications that related to monogenic germline cancer studies. The resultant 194 publications were all research articles focusing on monogenic inherited diseases. They were all retrieved and assessed for their eligibility to be included in this manuscript. We then excluded 25 publications that were not MS-based omics studies. This left a total of 169 publications that were included in this final manuscript.
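The counts in this screening workflow can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, written for illustration only; the numbers are those stated in the legend, while the step labels and the function name apply_exclusions are our own shorthand rather than part of the original workflow.

```python
# Counts are taken from the figure legend; labels are illustrative shorthand.
INITIAL_RECORDS = 267
EXCLUSION_STEPS = [
    ("duplicates", 13),
    ("general reviews or monogenic germline cancer studies", 60),
    ("studies that were not MS-based omics studies", 25),
]


def apply_exclusions(start, exclusions):
    """Return (label, remaining) pairs after each exclusion step is applied."""
    remaining = start
    totals = []
    for label, removed in exclusions:
        remaining -= removed
        totals.append((label, remaining))
    return totals


totals = apply_exclusions(INITIAL_RECORDS, EXCLUSION_STEPS)
for label, remaining in totals:
    print(f"after excluding {label}: {remaining} publications remain")
```

The running totals reproduce the 254, 194 and 169 publications reported at each stage.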

Orthogonal testing

Although it is a considerable success that NGS can achieve diagnostic rates of up to 50%, nearly half the individuals with a suspected rare disease remain undiagnosed20,21. Receiving a definitive genetic answer is crucial for effective patient care and management and can potentially avoid further testing and lead to savings in healthcare costs22. Cases remain undiagnosed for two major reasons: the true causative variants are either not selected during genomic data analysis or are recognised as potentially relevant but remain variants of uncertain significance (VUS). The latter typically require functional evidence to support VUS pathogenicity. There are two barriers to generating functional evidence in a timely and efficient manner. These are the broad range of (mostly low throughput) potential functional assays and the large number of potential causative genes associated with monogenic conditions. This time-consuming and costly process could be alleviated by the availability of validated high-throughput, untargeted functional assays. An example of an emerging approach is RNAseq, which offers the advantage of characterising and identifying splicing defects on a transcriptome-wide scale in patients with suspected rare diseases23,24.

Similarly, in recent years MS-based proteomics has also contributed to the diagnosis of many mitochondrial-related conditions25,26,27,28,29,30. In most instances, filtered variants of interest were selected from the initial genomic NGS analysis, and MS was used as an orthogonal validation tool to assist with the interpretation of variant pathogenicity. Proteomic data can enable the examination of (1) the abundance of candidate proteins encoded by candidate genes, (2) complexes associated with the candidate protein, and (3) interactors and binding partners of the candidate protein. A recent publication looking at relative changes to interactors of mitochondrial proteins has benchmarked MS-based proteomics against the traditional targeted analysis of respiratory chain enzyme assays for mitochondrial diseases14. Importantly, this study also demonstrated that MS-based proteomics can be deployed for real-time use in a diagnostic context (discussed in a later section of this review).

The MS approach has been exemplified by several independent studies on Leigh syndrome, a mitochondrial condition known to be associated with defects in 16 mitochondrial DNA genes and approximately 100 nuclear genes31. For example, potential causative variants were found in genes such as MRPS34, NDUFC2 and NDUFAF8. The negative impact of the variants on the abundance of both the candidate protein and interacting proteins was demonstrated in patient cell lines32,33,34. In one of these studies, MS-based proteomics also identified a complex I assembly defect through complexome profiling. This approach involves separating mitochondrial protein complexes by native polyacrylamide gel electrophoresis and analysing gel fractions to identify their individual protein components33. Combined with other lines of evidence, the results confirmed the pathogenicity of the selected variants.

The application of MS extends beyond mitochondrial disorders. Examples of its utility have been demonstrated in FHL1, EXOC2 and COL3A1 disorders, which impact processes such as signal transduction, vesicle trafficking and collagen synthesis35,36,37. MS-based proteomics identified FHL1 as a major component of intracytoplasmic inclusions, a pathogenic feature of FHL1-related myopathies. Subsequent genomic NGS identified the disease-causing variants. MS-based proteomics was also used to demonstrate the loss of the candidate EXOC2 protein carrying a pathogenic variant as well as other protein components of the exocyst complex in patient fibroblasts. Similarly, a study of the dysregulated extracellular matrix proteins associated with pathogenic variants in the COL3A1 gene also highlights the utility of MS-based proteomics in rare disease diagnosis. Moreover, a recent study also applied MS following ultra-rapid genomic sequencing of a patient with encephalopathic episodes found to have biallelic NUP214 variants38. One of the missense variants was already classified as pathogenic, while the other was a missense VUS. Quantitative MS-based proteomics confirmed the reduced level of the NUP214 protein. It also showed a reduction in NUP88, its physical interactor within the human nuclear pore complex, along with several other nuclear pore complex components. This allowed the reclassification of the VUS variant to likely pathogenic, thereby establishing the molecular diagnosis.

Disease gene discovery

Up to half of individuals with a rare disease remain undiagnosed despite exhaustive examination of clinically relevant gene panels restricted to known monogenic disorder genes. In these cases, unknown pathogenic variants may lie in a novel gene and go unnoticed during variant curation. MS-based proteomics has proven its worth in tackling this challenge. For example, a recent study presented two undiagnosed individuals with Leigh syndrome39, despite extensive genomic analysis that included clinical exome sequencing. One of them endured a diagnostic odyssey for more than two decades. Proteomics was performed on skin fibroblasts from both individuals and identified a significant reduction in the abundance of several proteins of the large mitoribosomal subunit, with MRPL39 protein showing the most pronounced alteration. Functional defects of mitochondrial complexes I, III and IV were also identified through relative complex abundance (RCA) analysis. This is a metric derived from proteomics data that quantifies co-dependence of subunit abundances in mitochondrial complexes, in both individuals. These proteomic findings instigated subsequent focused genomic and transcriptomic studies that confirmed the presence of pathogenic compound heterozygous variants in the novel mitoribosomal gene MRPL39. Other novel genes have also been linked to disease through the application of proteomic analysis, such as MRPL5040, MRPS3941, TEFM42 and HMGCS143. These examples highlight some of the most promising outcomes that MS-based proteomic studies can offer in the discovery of new disease genes.
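To make the idea of a complex-level abundance metric more concrete, the following Python sketch averages patient-to-control abundance ratios across the subunits of one complex. It is a simplified illustration under our own assumptions and is not the published RCA implementation, which may differ in normalisation and statistical testing. The function name relative_complex_abundance and all abundance values are hypothetical; only the protein names are real complex I subunits.

```python
from statistics import mean


def relative_complex_abundance(patient, controls, subunits):
    """Mean patient/control abundance ratio across the subunits of one complex.

    patient:  dict mapping protein name to linear-scale abundance
    controls: list of dicts with the same layout, one per control sample
    subunits: protein names assigned to the complex
    Subunits not quantified in the patient and every control are skipped.
    """
    ratios = []
    for protein in subunits:
        if protein not in patient or any(protein not in c for c in controls):
            continue
        control_mean = mean(c[protein] for c in controls)
        if control_mean > 0:
            ratios.append(patient[protein] / control_mean)
    if not ratios:
        raise ValueError("no quantified subunits for this complex")
    return mean(ratios)


# Hypothetical abundances, for illustration only.
complex_i_subunits = ["NDUFS1", "NDUFS8", "NDUFA9"]
controls = [
    {"NDUFS1": 100.0, "NDUFS8": 80.0, "NDUFA9": 60.0},
    {"NDUFS1": 100.0, "NDUFS8": 80.0, "NDUFA9": 60.0},
]
patient = {"NDUFS1": 50.0, "NDUFS8": 20.0, "NDUFA9": 30.0}

rca = relative_complex_abundance(patient, controls, complex_i_subunits)
print(f"Complex I abundance relative to controls: {rca:.0%}")
```

Averaging over many subunits, rather than relying on a single protein, is what allows a complex-level defect to be seen even when individual measurements are noisy. In practice, such a ratio would be accompanied by a statistical test across subunits.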

Proteomics-focused studies on disease models have also led to disease gene discovery. In one example, a comprehensive catalogue of human mitochondrial protein function was developed through MS-based multi-omics analysis of more than 200 knockout cell lines44. Among millions of distinct biomolecules measured, the authors discovered novel associations of the PYURF protein with both complex I assembly and coenzyme Q biosynthesis. This led to the identification of a homozygous pathogenic variant in a previously undiagnosed individual affected by a multisystemic mitochondrial disorder. In the same study, RAB5IF was also identified as a novel disease gene through the same approach, where a causal relationship was identified between the gene and an individual with craniofacial dysmorphism, skeletal anomalies, and impaired intellectual development syndrome 2.

Alternative analysis techniques, for example, genomics, transcriptomics and traditional targeted assays have also been used widely for novel disease gene discovery. Both genomic and transcriptomic approaches have become well-established technologies in clinical settings45,46,47. They offer significant value for identifying genetic variants and elucidating their impact on gene expression. However, these approaches often fall short of providing the functional insights on protein stability and abundance provided by MS. On the other hand, traditional targeted assays like western blotting and ELISAs may be more practical for validation but lack the comprehensive depth, broad scope and sensitivity provided by MS-based proteomics.

Disease modelling

Understanding the interactors, binding partners, basic biological functions, and pathways of a protein can often provide valuable insights into the disease mechanisms associated with that protein. There are approximately 20,000 protein-coding genes according to the Human Genome Project48, implying that there should be a similar number of proteins in the human proteome49. However, this number could be up to several million if we consider various splicing events, tissue-specific isoforms and post-translational modifications50. Traditional methods for protein characterisation include western blotting, yeast two-hybrid and immunoprecipitation studies, with each having specific strengths and limitations. For example, SDS-PAGE and western blotting rely on the use of one antibody, which generally detects one protein. Therefore, the cost of performing such analyses rises steadily as the number of antibodies used increases. Moreover, the specificity of antibodies is also often not clear, with many commercial antibodies identifying multiple protein products51. Untargeted MS-based proteomics is a powerful tool in the field of biomedical research that has transformed the way we can study protein function and biology. The ability to identify thousands of proteins is a great advantage of MS-based proteomics over targeted and low-throughput protein-based approaches such as western blotting. In a report published in 2016, LC-MS/MS was successfully used to show cystinosin interacted with multiple components of the mTORC1 signalling pathway, for which there was a lack of reliable antibodies52. Another study highlighted the potential of MS-based proteomics in elucidating large multi-subunit complexes, thereby enabling the investigation of complex dysfunction at a cellular level using human knockout cell lines53.

The mechanisms underlying many rare diseases remain unclear to this day, requiring extensive research to unveil the disease aetiology. Over the past few decades, MS has been increasingly applied in functional studies to understand the molecular function of newly discovered genes53. This is often accomplished through model systems instead of patient samples, particularly when traditional diagnostic approaches have often been inadequate. A study using affinity purification of ectopically expressed POLR3B coupled to MS-based proteomics characterised the pathogenic impact of POLR3B variants on the protein interaction with other members of the RNA polymerase III complex54. Combined with knock-in mouse models, the researchers showed that the variants cause severe interruption of RNA polymerase III complex assembly, thus advancing our understanding of the disease mechanisms of this leukodystrophy.

Newborn bloodspot screening—first line of testing

Newborn bloodspot screening (NBS) was introduced in the 1960s, initially to detect phenylketonuria. Later developments allowed for the introduction of tests for conditions such as congenital hypothyroidism and cystic fibrosis. However, the paradigm for NBS was very much 'one test for one condition' until the late 1990s. Developments in MS, particularly the introduction of electrospray ionisation, which equally benefited MS-based proteomics, allowed the development of a single MS-based metabolomic test for multiple inborn errors of metabolism55. Since the 2000s, most NBS laboratories have introduced this technology56,57 and babies are now typically screened for 20–30 inborn errors of metabolism using a targeted panel of up to 50 metabolites, mainly amino acids and acylcarnitines. Previously, many of these disorders were diagnosed clinically, often when infants were extremely unwell and sometimes post-mortem. The early detection of these conditions has greatly improved outcomes. For example, medium-chain acyl-CoA dehydrogenase deficiency occurs at an overall birth prevalence of 5.3 per 100,000, with a higher prevalence for individuals with European ancestry58 and can be rapidly fatal in undiagnosed infants59. This condition is now readily detected as part of the NBS metabolomic panel resulting in a significant reduction in infant mortality60,61.

NBS testing is rather different to clinical testing: there is no phenotype information to guide result interpretation, and unlike clinical testing, the starting premise is that babies are assumed to be normal until there is convincing screening evidence that a baby has an NBS condition. False positive results cause considerable parental anxiety as well as unnecessary work for medical and pathology staff involved with confirmatory testing. Biomarkers, therefore, require high diagnostic sensitivity and specificity and should be limited to those specifically targeting the corresponding NBS conditions. Furthermore, the requirement to screen hundreds of babies per day with fast turn-around times and low assay failure rates places considerable demands on testing technology and laboratory organisation. Flow injection tandem mass spectrometry (FIA-MSMS) has been successful in meeting these demands, and metabolomic panels continue to be expanded to include new conditions such as X-linked adrenoleukodystrophy (X-ALD). Suboptimal biomarker diagnostic performance can be improved by using second-tier metabolomic tests that also use MS. For example, FIA-MSMS analysis of C26:0 lysophosphatidylcholine, the main biomarker for X-ALD, suffers from interferences which can cause false positives. Second-tier LC-MSMS analysis of these samples is effective at minimising interference and greatly improves NBS diagnostic performance62.
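The interplay between diagnostic sensitivity, specificity and false positive burden can be illustrated with the standard definitions. The Python sketch below uses entirely hypothetical counts for a screened cohort of 100,000 babies; the function name screening_performance is our own, and the figures are not drawn from any NBS programme.

```python
def screening_performance(tp, fp, tn, fn):
    """Return (sensitivity, specificity, positive predictive value).

    tp/fn: affected babies with a positive/negative screen
    fp/tn: unaffected babies with a positive/negative screen
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    return sensitivity, specificity, ppv


# Hypothetical counts for 100,000 screened babies, for illustration only.
sensitivity, specificity, ppv = screening_performance(tp=5, fp=45, tn=99_950, fn=0)
print(f"sensitivity: {sensitivity:.2%}")
print(f"specificity: {specificity:.3%}")
print(f"positive predictive value: {ppv:.0%}")
```

Because the screened conditions are rare, even a specificity above 99.9% leaves most positive screens as false positives in this example. This is why second-tier testing that removes interferences can markedly improve overall screening performance.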

Biomarker screening

The ability to detect multiple proteins or metabolites concurrently, together with the high-throughput nature of MS, opens up another major application in the field of biomarker discovery. Friedreich ataxia (FRDA) is a rare disease caused by pathogenic variants in the FXN gene in almost all cases63. The rarity of the disorder and its relatively slow progression mean that clinical trials face the challenge of not meeting recruitment targets. MS-led biomarker studies can assist with these issues by including proteome analysis in a smaller cohort over a shorter timeframe to assess treatment efficacy. A study designed to identify changes in cerebrospinal fluid (CSF) proteins occurring in FRDA patients exemplifies this approach64. This group identified 172 differentially expressed proteins (DEPs) in FRDA individuals compared to controls. Some top candidate proteins, such as SORCS3, SCG5, CNDP1 and NRXN2, can potentially be used as biomarkers to help with disease diagnosis or as outcome measures of treatment response.

Identified biomarkers can be utilised to screen individuals with suspected conditions and confirm a diagnosis. Acid sphingomyelinase deficiency (ASMD) and Niemann-Pick type C disease (NPC) share some clinical features and are caused by pathogenic variants in the SMPD1 gene and the NPC1/NPC2 genes, respectively. Two research groups identified the differential biochemical profiles found in controls and individuals with ASMD or NPC using MS-based lipidomics65,66. They measured various lipid biomarkers and were able to show variable elevations of those markers in individuals affected by one or other of the two conditions. These results using MS showcase its potential in disease screening and its ability to help with making a precise diagnosis.

Numerous reports have been published on the use of MS in biomarker discovery in rare diseases using various types of patient samples67,68,69,70,71,72,73,74,75,76,77,78,79,80,81. One group used MS-based proteomics to examine the DEPs found in conjunctival impression cytology samples from individuals with congenital aniridia in order to discover potential biomarkers82. In another study, MS-based proteomics was used to identify potential drug targets by characterising downstream sequelae of CAPN5 hyperactivity in individuals affected with neovascular inflammatory vitreoretinopathy83. A recent study used targeted MS-based proteomics to identify protein signatures in a small cohort of individuals with systemic sclerosis78. Another study analysed differential protein profiles in serum and urine of individuals with mucopolysaccharidosis type I80. These studies highlight the growing attention and progress in this field.

Disease monitoring

Early diagnosis of rare diseases and monitoring of disease progression are likely the most effective forms of clinical management that can be offered to patients, and MS plays an important role in this area. Tay-Sachs disease (TSD) is a rare lysosomal storage disease caused by pathogenic variants in the HEXA gene, which trigger the accumulation of GM2 ganglioside in lysosomes. There are no effective treatments for TSD, and the ability to diagnose the condition early and monitor its progression is particularly crucial to patient management. A group of researchers was able to demonstrate a significant accumulation of the GM2 sphingolipid in induced pluripotent stem cells from TSD individuals compared to controls using MS-based lipidomics84. The detection of GM2 build-up using MS enabled the early identification of TSD individuals.

A 6-month-old infant with complex I deficiency and biallelic pathogenic ACAD9 variants presented with severe cardiomyopathy and lactic acidosis, unresponsive to conventional riboflavin treatment85. In this case, alternative treatments of bezafibrate and nicotinamide riboside were administered. Disease progression was monitored through MS-based proteomics. This was achieved by measuring the level of mitochondrial complex I abundance in peripheral blood mononuclear cells (PBMC) before and during treatment. The study demonstrated the utility of MS-based proteomics as a tool for tracking disease progression and evaluating treatment efficacy of therapeutic interventions.

Finally, another group used MS-based lipidomics to measure phosphatidylcholine containing very long chain fatty acid signals in cerebrospinal fluid in a small cohort of patients with X-linked adrenoleukodystrophy86. The traditional method of measuring plasma levels has been reported to be insufficient for prediction of disease severity. This group found that MS could detect some signals in plasma that were typically detected only in the CSF of patients with the condition. Another example comes from a research group that studied Hutchinson-Gilford progeria syndrome87. Endogenous levels of lamin A and progerin were quantified using MS-based proteomics. It offered an alternative solution to traditional techniques that are often limited by their specificity and quantitative accuracy.

Efficacy of mass spectrometry in rare disease functional genomics

To examine the efficacy, specificity, and usage of MS using samples from patients with a range of rare diseases, a comprehensive review of the literature was conducted25,26,27,28,29,30,32,33,35,36,37,38,39,44,52,53,54,64,65,66,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215. Several aspects were investigated in this review, including (1) the number of cases reported from 1993 to Jan 2024, (2) the distribution of inheritance patterns observed, (3) the types of variant zygosity reported, (4) the application of MS in the context of the studies, and (5) the variety of sample types used in MS analyses.

It is not surprising to see the uptake of MS in rare disease studies has significantly increased over the last decade (Fig. 2A). Over half of the literature reviewed featured disorders with an autosomal recessive inheritance, which is the most studied inheritance pattern within this review (Fig. 2B). This is not unexpected, given that loss of function is a common disease mechanism for most autosomal recessive conditions, and a reduction in protein abundance can potentially be measured by MS-based proteomics. The zygosity distribution of studied cases also fits with the inheritance pattern. Homozygous and compound heterozygous are two of the three most common types of zygosity reported (Fig. 2C).

Fig. 2: Inheritance patterns and zygosities of rare diseases studied using MS from the published literature.

A Inheritance patterns of different rare diseases that have been studied using MS over time. B The overall contribution of each inheritance pattern in MS studies of rare diseases. C The contribution of each variant zygosity reported in all surveyed publications. AD autosomal dominant, AR autosomal recessive, XL X-linked, XLD X-linked dominant, XLR X-linked recessive, mito mitochondrial.

The variant type appears to be another important factor to consider when determining the utility of MS in rare disease studies (Table 1). Nearly 40% of missense variants are expected to cause reduced expression of the protein due to protein instability and subsequent degradation216. Missense variants are by far the most common type of pathogenic variant in many rare diseases, and one of the most challenging disease mechanisms for predicting the associated functional consequences140. Functional assays, such as RT-PCR or RNAseq, measure transcript abundance. However, they are of limited value for determining the functional impact of missense variants because RNA abundance, splicing or stability are typically not affected. In contrast, MS-based proteomics is a technique that has been shown to be able to detect changes in an individual’s proteome that may be caused by missense variants135,214. Almost half of the examined reports featured rare diseases caused by missense variants either in autosomal dominant or autosomal recessive conditions. However, missense variants can also lead to changes in metabolites or lipids without changes in protein abundance. In these scenarios, MS-based metabolomics or lipidomics focused on specific analytes would be more suitable to detect the consequences of the mutated protein.

Table 1 Variant type combinations reported in MS studies of rare diseases

The utility of MS in rare diseases included in this review is shown in Fig. 3A. The most common use of MS in rare disease research is for orthogonal validation of variants through proteome profiling. Almost one-third of the reviewed papers reported MS being used as an orthogonal tool to validate the pathogenicity of an identified variant in suspected rare diseases. Proteome profiling can facilitate the identification of potential drug targets or biomarkers, which may subsequently be leveraged for future disease screening and therapeutic development201. The majority of the reviewed articles are MS-based proteomic studies which cover various rare diseases, and almost all NBS articles are MS-based metabolomic or lipidomic studies (Fig. 3B).

Fig. 3: Utility of proteomics and sample types used for studying rare diseases using MS from the published literature.

A Uses of MS reported in rare disease research and diagnosis. B Different types of MS-based omic technologies are included in this review. C Different sample types used in rare disease studies. PBMC peripheral blood mononuclear cells, CSF cerebrospinal fluid, FFPE formalin-fixed paraffin-embedded.

Disease-appropriate biological samples are important for establishing genotype-phenotype correlations, particularly given the tissue-specific expression of many proteins. Primary samples such as skin fibroblasts, muscle, brain, and CSF often involve invasive sample collection procedures. Historically, the most common type of material used for orthogonal testing has been patient-derived fibroblasts (Fig. 3C)217. However, one disadvantage of using fibroblasts is the time required to establish the cell line from a skin biopsy before functional tests can be performed. This makes fibroblasts unsuitable for critically ill patients requiring a rapid diagnosis. Opting for other sample types, such as PBMC, can address this concern. This is supported by a recent report where ultra-rapid proteomics was carried out in under 54 hours for a critically ill infant with a suspected mitochondrial disorder. Proteomics provided functional evidence to support the pathogenicity of a homozygous intronic variant in NDUFS8 identified by ultra-rapid whole genome sequencing.

Clinical translation of MS-based proteomics

Despite numerous reports highlighting the utility of MS-based proteomics in the context of rare disease, this technique is currently restricted to research investigations. The translation of proteomics into clinical pathology laboratories would facilitate its incorporation into rare disease diagnosis and management. A recent study benchmarked MS-based proteomics for use in routine clinical practice for rare mitochondrial disorders14. The study demonstrated that a single untargeted proteomics assay outperformed traditional targeted respiratory chain enzymology, offering superior diagnostic and clinical insights. Further studies are needed to evaluate the feasibility of translating this technique into the broader spectrum of rare disease diagnosis. A recent micro-costing study highlighted the importance of prioritisation and reimbursement decisions for genomic technologies218. In addition, a micro-costing study focusing on MS-based proteomics indicated that proteomics can be performed at a similar cost to traditional functional assays in clinical laboratories219. Further studies of the health and economic implications of MS-based proteomics are now needed to support its introduction to routine clinical care.

While MS-based proteomics is a powerful tool for rare disease diagnosis, its translation into clinical settings faces several challenges. First, the assay requires an initial setup with costly instruments, which in turn need ongoing maintenance and dedicated staffing. Second, each experiment generates vast amounts of data that are highly complex to interpret. Bioinformatic techniques are usually employed to deconvolute the data into a more transparent and standardised format for scientists to interpret. Moreover, the inherent complexity of biological systems and the diversity of sample types used complicate the standardisation of acquired data across laboratories. Third, although MS-based proteomics has high sensitivity for protein detection, it remains challenging to detect small proteins and low-abundance proteins. Lastly, and not unique to proteomics, tissue-specific expression of many genes means that some proteins can be identified only in certain sample types, making it challenging to investigate proteins in less accessible tissues.

There are currently no regulations, accreditation systems (such as NATA, CLIA and ASQ), or standards that govern the translation of untargeted MS-based proteomics into the clinical setting. This means that no consistent methods have been established for the acquisition, quantification and interpretation of the data collected, making it difficult to compare results between laboratories. It also creates difficulties for clinicians aiming to use the results to guide clinical decision-making. The Human Proteome Organisation (HUPO) is a global scientific organisation dedicated to advancing proteomics through international collaboration. It promotes the development of new technologies, techniques, and training to advance the field. In addition, international groups such as RD-Connect and the Undiagnosed Diseases Network (UDN) have brought together clinicians and scientists to exchange data and explore possible causes of rare and undiagnosed disease using multi-omics. It is hoped that collaborative efforts from organisations like these will develop standardised approaches to the analysis of proteomic data, much as the ACMG has done with its guidelines for interpreting genomic sequencing. Standardised protocols will reduce discrepancies between laboratories, alleviate difficulties in interpretation, and make this technology more accessible for delivery in the clinical setting.

Conclusions and future directions

In this review, we have examined the literature providing evidence for the utility of MS in rare diseases. It is evident that MS is a powerful tool in rare disease research and diagnosis. The application of MS in the rare disease space has increased significantly, including in biomarker discovery, protein profiling in functional genomics studies, and various approaches to screening newborn and adult cohorts. Importantly, there has been a marked uptake in recent years of MS to provide functional insights into the pathogenicity of a given variant during the diagnostic workup of rare diseases. There is no doubt that the importance of MS in rare disease research and diagnosis will continue to grow in the years ahead.

MS-based proteomics, though less integrated into clinical practice than targeted metabolomics, can help with the interpretation of VUSs in patients with existing genomic and clinical data. We hope to see this time-efficient and potentially cost-efficient approach increase the diagnostic yield and improve the management of individuals affected by rare diseases. A single clinically validated high-throughput proteomics assay could revolutionise how many rare diseases are diagnosed, end the often lengthy diagnostic odyssey endured by many families, restore their reproductive confidence and, for some affected individuals, potentially allow access to the growing range of targeted advanced genetic therapies.