Introduction

Cystinuria (OMIM 220100) is the most common inherited urinary stone disease and accounts for 1–2% of all kidney stones. In the general population, a prevalence of 1/7000 to 1/50000 has been reported1,2,3. Although the prevalence of cystinuria is low, affected patients suffer from recurrent urinary stones from childhood and may undergo repeated surgeries, which impair renal function and quality of life. It is therefore essential to establish the diagnosis at an early age and at an early stage of stone disease.

Cystinuria is caused by pathogenic variants in two genes: SLC3A1 (OMIM 104614), which encodes rBAT (a single-pass transmembrane protein), and SLC7A9 (OMIM 604144), which encodes b0,+AT (a protein with twelve transmembrane domains)4,5,6,7,8. rBAT and b0,+AT are linked by a disulfide bond to form the heterodimeric transporter. b0,+AT forms the channel that transports cystine and dibasic amino acids (lysine, arginine, and ornithine) into the cell in exchange for neutral amino acids. rBAT contains multiple glycosylation sites in its extracellular domain. Interaction with b0,+AT is required for rBAT to be fully glycosylated, properly folded, and stably expressed at the apical membrane of the proximal tubule4,5,6,7,8. We have reported that the carboxyl terminus of b0,+AT controls the trafficking of the rBAT-b0,+AT complex from the endoplasmic reticulum (ER) to the Golgi complex and thereby controls the full glycosylation of rBAT9.

Strologo et al. proposed a genotype-based classification of cystinuria10: type A, caused by two variants in SLC3A1; type B, caused by two variants in SLC7A9; and type AB, with one variant in each of the two genes. However, Rhodes et al. reported that over thirty percent of cystinuria patients in the United Kingdom carried only a single variant or no variant at all, and therefore fit neither the proposed genotypes nor autosomal recessive inheritance11,12. A genomic classification that also covers these unexplained cases is therefore needed.

We have studied Japanese patients with cystinuria since the 1980s, and in 2006 we identified a unique variant, p.(Pro482Leu), located at the carboxyl-terminal end of b0,+AT, among Japanese cystinuria patients1,13,14,15,16,17. The p.(Pro482Leu) variant caused a severe functional defect of the cystine transporter and was found in over 80% of Japanese cystinuria patients13. Such a high prevalence of the p.(Pro482Leu) variant has not been reported elsewhere: among cystinuria patients, it was reported in 1/8 in South Korea18 and 1/73 in the United Kingdom12, and it has not been reported in China19. Because of this distinct genotype, distinct clinical features were also expected in Japanese cystinuria patients.

Our previous genetic studies of cystinuria were based on Sanger sequencing of exons. In recent years, however, a number of effects of variants at exon–intron boundaries on mRNA splicing have been reported20. In addition, some variants may be missed by Sanger sequencing. To characterize the genomic features of Japanese cystinuria precisely, we studied 101 Japanese cystinuria patients by next-generation sequencing covering exons and exon–intron boundaries.

Results

Clinical features

In total, 101 patients diagnosed with cystinuria were identified, comprising 36 (35.6%) women and 65 (64.4%) men. All patients were Japanese. A positive family history of cystinuria was documented in 29 (28.7%) patients; 19 (18.8%) patients had only one generation affected (siblings), and 10 (9.9%) patients had two generations affected. The median age at first presentation of stone symptoms was 17 years (range 0–58 years), with a median of 17.0 years in men and 16.5 years in women. The proportions of patients with onset at younger than ten years, in the teens, in the 20s and 30s, and over were 38%, 24%, 25%, and 13%, respectively (Table 1).

Table 1 Demographic and clinical data.

Genetic analysis

Genetic analysis was performed in all 101 patients. At least two variant alleles were detected in 93 patients (53 homozygous and 40 compound heterozygous), while 8 patients had only a single variant. All patients had at least one variant. Variants in SLC3A1 were identified in 18 patients and variants in SLC7A9 in 88 patients; 5 of these patients carried variants in both SLC3A1 and SLC7A9.

Overall, 51 distinct variants were identified in SLC3A1 and SLC7A9 (Tables 2 and 3). Of the 22 variants identified in SLC3A1, 13 were previously unreported (novel) (Table 2). Of the 29 variants identified in SLC7A9, 12 were previously unreported (novel) (Table 3). In SLC3A1, 13 missense variants were identified, followed by four frameshift, two splice-site, and one nonsense variant (Fig. 1A). The most frequent variant in SLC3A1 was p.(Val183Ala) (c.5487T>C), found in 3 patients (3.0%), followed by the exon 10 deletion and p.Asn442fs (c.1323dupT), each found in 2 patients (2.0%) (Fig. 1A, Table 2). In SLC7A9, 24 missense variants were identified, followed by three splice-site and frameshift variants. One nonsense variant and one initiation codon variant were also found (Fig. 1B).

Table 2 List of variants in SLC3A1.
Table 3 List of variants in SLC7A9.
Fig. 1 Distribution of variants in SLC3A1 and SLC7A9. In SLC3A1, 13 missense variants were identified, followed by 4 frameshift, 2 splice-site, and 1 nonsense variant. The most frequent variant in SLC3A1 was p.(Val183Ala) (c.5487T>C), followed by the exon 10 deletion and p.Asn442fs (c.1323dupT). In SLC7A9, 20 missense variants were identified, followed by 4 splice-site and 2 frameshift variants. One nonsense variant and 1 initiation codon variant were also found. The most frequent variant in SLC7A9 was p.(Pro482Leu) (c.1445C>T), which was found in 73 patients (43 homozygous and 30 heterozygous). Red columns indicate novel variants.

The most frequent variant in SLC7A9 was p.(Pro482Leu) (c.1445C>T), found in 73 patients (72.3%; 43 homozygous and 30 heterozygous), followed by p.Val340fs (c.1017delA), found in 9 patients (8.9%; 1 homozygous and 8 heterozygous), and p.(Asn227Asp) (c.679A>G), found in 4 patients (4.0%; all heterozygous) (Fig. 1B, Table 3).

Location of variants

All variants in SLC3A1 were located in the extracellular domain except for p.Ala95Thr (c.313A>G) and p.Ile105Val (c.313A>G), which were located in the transmembrane domain of rBAT (Fig. 2A). In SLC7A9, the most common variant, p.(Pro482Leu), was located at the carboxyl terminus, while p.(Met1Thr) (c.2T>C) was located at the N-terminus of b0,+AT. The other variants were located in transmembrane domains (14 variants), cytoplasmic loops (8 variants), or extracellular loops (5 variants) (Fig. 2B).

Fig. 2 Location of variants in rBAT (SLC3A1) and b0,+AT (SLC7A9) by exon and protein domain. Schematic diagrams of rBAT and b0,+AT. rBAT has a single transmembrane domain (blue), a long extracellular domain, and a cytoplasmic domain, and is encoded by ten exons (A). b0,+AT has 12 transmembrane domains with cytoplasmic N- and C-termini. The p.(Pro482Leu) variant is located at the C-terminal end (B).

Genome-phenotype association

The amounts of urine cystine in patients with variants in SLC3A1, SLC7A9, and both SLC3A1 and SLC7A9 were 1357.15, 1815.45, and 1434 µmol/day, respectively (Table 4). Regarding genotype, 12 patients (11.9%) were type AA and 76 patients (75.2%) were type BB, while 1 patient (1.0%) was type AAB, 4 patients (4.0%) were type ABB, 1 patient (1.0%) was type A, and 7 patients (6.9%) were type B (Fig. 3A).
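
The genotype labels used here can be read as one letter per variant allele: A for each variant allele in SLC3A1 and B for each in SLC7A9. The following minimal Python sketch is an illustration only and is not part of the study's analysis; it assumes that reading of the labels, and the helper name `classify_genotype` is hypothetical. It recomputes the percentages and the gene-level patient counts from the genotype counts reported in the text.

```python
from typing import Dict


def classify_genotype(n_slc3a1: int, n_slc7a9: int) -> str:
    """Hypothetical helper: one 'A' per variant allele in SLC3A1 and one 'B'
    per variant allele in SLC7A9 (assumed reading of labels such as 'ABB')."""
    if n_slc3a1 < 0 or n_slc7a9 < 0:
        raise ValueError("allele counts must be non-negative")
    return "A" * n_slc3a1 + "B" * n_slc7a9


# Genotype counts as reported in the text (n = 101).
cohort: Dict[str, int] = {"AA": 12, "BB": 76, "AAB": 1, "ABB": 4, "A": 1, "B": 7}
total = sum(cohort.values())

# Share of the cohort per genotype, in percent, rounded to one decimal place.
percent = {g: round(100 * n / total, 1) for g, n in cohort.items()}

# Patients with any SLC3A1 variant, any SLC7A9 variant, both, or a single variant.
with_slc3a1 = sum(n for g, n in cohort.items() if "A" in g)
with_slc7a9 = sum(n for g, n in cohort.items() if "B" in g)
with_both = sum(n for g, n in cohort.items() if "A" in g and "B" in g)
single_variant = sum(n for g, n in cohort.items() if len(g) == 1)

print(total, with_slc3a1, with_slc7a9, with_both, single_variant)
```

Under this reading, the tallies agree with the counts given in the genetic analysis: 18 patients with SLC3A1 variants, 88 with SLC7A9 variants, 5 with variants in both genes, and 8 with a single variant.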

Table 4 Amino acid concentrations based on gene, genotype, and p.(Pro482Leu) variant.
Fig. 3 Distribution of cystinuria genotypes. The genotype distribution shows a predominance of type BB, followed by type AA and type B (A). Urine cystine by genotype (B). Age of onset by genotype (C).

Genotype–phenotype association

The amounts of urine cystine for types AA, BB, AAB/ABB, A, and B were 1230.6, 1815.45, 1434, 3034.6, and 1968.8 µmol/day, respectively. No significant difference in urine cystine was observed between genotypes (Fig. 3B, Table 4). The ages of onset for types AA, BB, AAB/ABB, and B were 13, 16, 27, and 13.5 years, respectively. No significant difference in the age of onset was observed between genotypes (Fig. 3C).

Genotype–phenotype association based on p.(Pro482Leu) variant in SLC7A9

Regarding the p.(Pro482Leu) variant in SLC7A9, 43 patients (42.6%) were homozygous, 26 patients (25.7%) were compound heterozygous, 4 patients (4.0%) carried it as a single heterozygous variant, and 28 patients (27.7%) did not carry it (Fig. 4A). The amounts of urine cystine in patients who were homozygous, compound heterozygous, single heterozygous, and negative for p.(Pro482Leu) were 1705.2, 2359.3, 1280.8, and 1483.7 µmol/day, respectively. No significant difference in urine cystine was observed between the p.(Pro482Leu)-based genotypes (Fig. 4B, Table 4). The ages of onset for patients who were homozygous, compound heterozygous, and negative for p.(Pro482Leu) were 20, 10, and 16 years, respectively. No significant difference in the age of onset was observed based on the p.(Pro482Leu) classification (Fig. 4C).

Fig. 4 Distribution of the p.(Pro482Leu) variant. The distribution shows a predominance of homozygous carriers (n = 43), followed by compound heterozygous carriers (n = 26) (A). Urine cystine by p.(Pro482Leu) status (B). Age of onset by p.(Pro482Leu) status (C).

Variants in exon–intron boundary

Overall, six exon–intron boundary variants were identified in six patients (Table 5). Of the two patients without any exonic variant, one had a homozygous variant (c.1224+3A>C; classified as type BB) and the other had a heterozygous variant (c.1399+4_1399+7delAGTA; classified as type B) at the exon–intron boundary (Table 5). In all six patients, the exon–intron boundary variant resulted in reclassification of the genotype (Table 5). Patients with exon–intron boundary variants showed a higher amount of urine cystine (3214.8 µmol/day) than patients with exonic variants (1705.4 µmol/day) (Table 6).
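
In HGVS coding notation, an intronic position is written as the nearest exonic coding position plus or minus an offset, so c.1224+3 denotes the third intronic base after coding position 1224. The sketch below illustrates how the two boundary variants named above can be read. It is illustrative only: `parse_intronic` and `IntronicPosition` are hypothetical names, and the regular expression covers only simple positions of the form c.N+M or c.N-M (not positions in untranslated regions).

```python
import re
from typing import NamedTuple, Optional


class IntronicPosition(NamedTuple):
    exon_base: int  # nearest exonic coding position
    offset: int     # signed distance into the intron (+ downstream, - upstream)


# Matches the start of a simple intronic HGVS description such as "c.1224+3A>C".
_INTRONIC = re.compile(r"^c\.(\d+)([+-])(\d+)")


def parse_intronic(hgvs: str) -> Optional[IntronicPosition]:
    """Return the (first) intronic position of a variant, or None if exonic."""
    compact = re.sub(r"\s+", "", hgvs)  # tolerate spaces such as "c.1224 + 3A > C"
    match = _INTRONIC.match(compact)
    if match is None:
        return None
    base, sign, distance = match.groups()
    offset = int(distance) if sign == "+" else -int(distance)
    return IntronicPosition(int(base), offset)


for variant in ("c.1224+3A>C", "c.1399+4_1399+7delAGTA", "c.1445C>T"):
    print(variant, parse_intronic(variant))
```

For a range such as c.1399+4_1399+7delAGTA, the sketch reports only the start of the deleted interval; the exonic variant c.1445C>T yields `None`.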

Table 5 List of cases with Exon–Intron boundary variants.
Table 6 Urine amino acid concentration based on Exon/Exon–intron boundary variants.

In the patient with the homozygous c.1224+3A>C variant, we studied the mRNA expression of SLC3A1 and SLC7A9. This male patient, who had a family history of urinary stones, had his first symptoms of kidney stones at the age of 3. At the age of 28, he presented to the hospital with left back pain. Computed tomography showed a left staghorn calculus (Fig. 5A). Genomic analysis located the variant three nucleotides into the intron from the boundary of exon 11 (Fig. 5B). No exonic variant was identified. The patient had a high amount of urinary cystine (3311.6 µmol/day) along with high amounts of lysine, ornithine, and arginine (Fig. 5C). RNA sequencing of renal tissue showed a marked loss of SLC7A9 mRNA expression compared with that of SLC3A1 (Fig. 5D). When genotypes were assigned from both exonic and exon–intron boundary variants, the number of cases that did not fit autosomal recessive inheritance decreased from 14 to 9 of 101 (Fig. 5E).

Fig. 5 The case with a homozygous exon–intron boundary variant. CT image of the patient showing a left staghorn calculus (A). The patient carried the homozygous c.1224+3A>C variant in SLC7A9, located 3 bases into the intron from the exon–intron boundary (B). 24-h urine amino acids of the patient (C). mRNA expression of the SLC3A1 and SLC7A9 genes by RNA sequencing (D). FPKM represents RNA expression as the number of fragments per kilobase of exon per million reads mapped. When genotypes were assigned from both exonic and exon–intron boundary variants, the number of cases that did not fit autosomal recessive inheritance decreased from 14 to 9 (E).

Cases with a single variant

Eight cases had only a single variant: seven in an exon and one in an intron. The median cystine concentration of the cases with a single variant was 2501.7 µmol/day, which was above the median of the whole cohort (1745.7 µmol/day) (Table S1).

Roots of cystinuria patients

We also studied the geographic origin of the cystinuria patients. In total, 77% of patients were from the Kanto area (central region), where Tokyo and Chiba prefectures are located; 8% were from the Kansai area (middle west), where Osaka prefecture is located; and 7% were from the Tokai area (middle south), where Aichi prefecture is located. No patient from the Tohoku area (north) was identified in this study (Fig. S1A). Regarding the p.(Pro482Leu) variant, 100% of patients from the Hokuriku (middle north), Kyusyu (far west), and Shikoku (southwest island) areas carried it. It was carried by 76% and 57% of patients from the Kanto (Tokyo and surrounding area) and Tokai (middle south) areas, respectively, and by 37.5% of patients from the Kansai area (middle west). No patient from the Hokkaido area (northern island) carried the p.(Pro482Leu) variant (Fig. S1B).

Discussion

Cystinuria is the most common genetic cause of kidney stones and is responsible for 1% of kidney stones21,22. Because Japan is geographically isolated by the ocean and even enforced a national isolation policy during the Edo era (1636–1854), its population may have developed genetic characteristics that differ from those of other countries.

Among Japanese cystinuria patients, 73 of 101 carried the p.(Pro482Leu) variant. According to the Genome Aggregation Database (gnomAD, https://gnomad.broadinstitute.org/), the p.(Pro482Leu) variant in SLC7A9 is rare in the global population. It was not found in the African, Latino, Ashkenazi Jewish, or European (Finnish) populations. Even among Asians, the allele frequencies of p.(Pro482Leu) in South Asians and East Asians are 0.00006533 and 0.00005012, respectively. In previous Asian studies of cystinuria, only 1 of 8 patients carried p.(Pro482Leu) (heterozygous) in South Korea23, while p.(Pro482Leu) was not identified in Chinese patients19. The p.(Pro482Leu) variant is located at the carboxyl-terminal (C-terminal) end of b0,+AT and causes a severe transporter defect comparable to those of frameshift and stop codon variants13. Our previous study showed that the “VPP” motif at the carboxyl terminus of b0,+AT regulates ER-Golgi trafficking and thereby controls the glycosylation of rBAT9. The last P of the “VPP” motif corresponds to the position of the p.(Pro482Leu) variant.

Regarding genotype classification, type BB was found in 76% and type AA in only 12% of patients in the current Japanese study. In the UK study, type AA was predominant (36%), followed by type BB (26%). In the Chinese study, 37.5% (3 of 8) of patients were type AA and 25% (2 of 8) were type BB. In a Korean study, 50% (4 of 8) of patients were type AA and only one patient (12.5%) was type BB. These data indicate a global predominance of SLC3A1 variants (type AA) among European and even Asian cystinuria patients. The predominance of SLC7A9 variants (type BB) in Japanese cystinuria patients therefore appears to be a unique characteristic. Besides SLC3A1 and SLC7A9, a novel cystine transporter, sodium-independent aspartate/glutamate transporter 1 (AGT1, SLC7A13), has been identified24. However, no pathogenic variants in it have been reported so far25.

In a UK study, patients with variants in SLC3A1 (type A or AA) showed significantly lower levels of lysine, arginine, and ornithine, but not of cystine, than other patients12. Another UK study found no difference in clinical parameters between type AA and type BB. Furthermore, patients with a single mutated allele showed disease severity similar to that of patients with two mutated alleles11. In the current study, no difference in the severity of cystinuria, as measured by urine cystine and age of onset, was observed according to the affected gene (SLC3A1 versus SLC7A9), the genotype (type AA versus type BB), or the number of mutated alleles. Differences in clinical phenotype between studies may derive from wide differences in the underlying variants; we found 13 novel variants in SLC3A1 and 12 novel variants in SLC7A9 (25 novel variants in total).

A missing piece in cystinuria has been that up to 30% of patients do not fit autosomal recessive inheritance on the basis of exonic variants11,12, which may be partially explained by intronic variants. In the current study, six exon–intron boundary variants were identified, which resulted in reclassification of the genotype. In the patient with the homozygous c.1224+3A>C variant in SLC7A9, an almost complete loss of SLC7A9 mRNA expression was observed. We believe that this is the first direct evidence that an exon–intron boundary variant can result in loss of transporter expression. When genotypes were assigned from both exonic and exon–intron boundary variants, the number of cases that did not fit autosomal recessive inheritance decreased from 14 to 9.

With regard to the geographic origin of cystinuria patients within Japan, most patients were from the central (Kanto and Tokai) and western (Kansai) parts of Japan. No patient was found in the northeast of Japan (Tohoku area), and only one patient was found in Hokkaido (northern island). The current Japanese population is proposed to be the result of admixture between early migrants (Jomon people) and later migrants (Yayoi people). Jomon people lived in the northern and southern parts of Japan, while Yayoi people lived in the central part. The scarcity of cystinuria patients from the northeast of Japan may indicate that Japanese cystinuria derives from the Yayoi people rather than the Jomon people26. We are currently studying the association of Japanese cystinuria patients with Yayoi or Jomon ancestry using genome-wide SNP data, especially in relation to the p.(Pro482Leu) variant in SLC7A9.

This study has several limitations. First, the data were obtained at Chiba University Hospital in collaboration with hospitals in 15 prefectures, ranging from the north (Hokkaido) to the west (Kyusyu) of Japan. Although the data cover a wide range of locations, some regions of Japan are not represented. Second, the data were collected retrospectively. Third, splicing errors induced by the exon–intron boundary variants have not been fully validated except in the one case presented. We are currently assessing splicing errors caused by the exon–intron boundary variants by in vitro transcription. Fourth, copy number variant (CNV) analysis was not performed in this study. We are preparing a CNV analysis of the Japanese cohort and will report the results in a subsequent paper.

In conclusion, this study describes the genomic and clinical signature of Japanese cystinuria patients, which is distinct not only from that of European cystinuria patients but also from that of other Asian cystinuria patients. Genotype classification may need to be updated to take into account not only exonic variants but also exon–intron boundary variants.

Methods

Patient information

All patients recruited to this study had a clinical diagnosis of cystinuria based on a diagnosis of cystine stones. Patients were recruited from the southwest to the northeast of Japan between 2000 and 2020. Detailed clinical data were collected retrospectively, including demographics (age, sex, and ethnicity), age at the first stone event, treatment history, and urine biochemistry.

Definition of urine amino acids

Cystine, lysine, arginine, and ornithine were measured in 24-h urine samples. The amounts of urine amino acids were determined by high-performance liquid chromatography (HPLC) at SRL, Inc. (Hachioji, Tokyo, Japan).

Statistical analysis

The Mann–Whitney U test, ANOVA, and the χ2 test were used for comparisons of two or three groups. Spearman’s rank correlation coefficient was used to analyze the relationship between two variables. Statistical computations were performed using JMP 11.0.0 (SAS Institute, NC, USA). P < 0.05 was considered significant.

Genetic analysis

Genomic DNA was extracted from whole blood using the Blood & Cell Culture DNA Midi Kit (Qiagen, Hilden, Germany). Variants in SLC3A1 (NM_000341.3) and SLC7A9 (NM_001126335.1) were analyzed by next-generation sequencing of the protein-coding exons and their intron boundaries, essentially as described previously27. In brief, libraries for next-generation sequencing were prepared with a KAPA Hyper Plus Library Kit. Hybridization capture was performed with predesigned capture probes for SLC3A1 and SLC7A9 purchased from IDT. The libraries were sequenced on an Illumina NextSeq500. Variant calling was performed using GATK v4 (https://gatk.broadinstitute.org/hc/en-us). Variants detected by next-generation sequencing were confirmed by Sanger sequencing. The next-generation sequencing analysis was conducted at the Kazusa DNA Research Institute (Kisarazu, Chiba, Japan). An unreported (novel) variant was defined as a variant not reported in previous publications. Variants are described with reference to NM_000341.3 for SLC3A1 and NM_001126335.1 for SLC7A9.

RNA sequencing

Kidney tissue was obtained by biopsy. Total RNA was extracted from the kidney biopsy sample using the RNeasy Mini Kit (Qiagen, Hilden, Germany). RNA sequencing libraries were prepared using a SureSelect Strand-Specific RNA Library Prep Kit (Agilent Technologies, Inc., Santa Clara, CA, USA). Sequencing was performed on a HiSeq 2500 system (Illumina, San Diego, CA, USA) in 50-base single-end mode. TopHat (version 2.1.0; with default parameters) was used to map reads to the human reference genome (hg38). Gene expression levels were then quantified using Cufflinks (version 2.2.1; with default parameters) and are expressed as the number of fragments per kilobase of exon per million reads mapped (FPKM).
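
As a worked illustration of the FPKM unit, the sketch below applies the basic definition: the fragment count divided by the exon length in kilobases and by the number of mapped fragments in millions. It is not the estimation procedure that Cufflinks runs internally, and the function name `fpkm` and the example numbers are hypothetical. With single-end reads, each read counts as one fragment.

```python
def fpkm(fragments: float, exon_length_bp: float, total_mapped_fragments: float) -> float:
    """Fragments per kilobase of exon per million mapped fragments (basic definition)."""
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("exon length and library size must be positive")
    kilobases = exon_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments / kilobases / millions


# Hypothetical numbers: 500 fragments on a gene with 2,000 bp of exon
# in a library of 25 million mapped fragments.
example = fpkm(500, 2_000, 25_000_000)
print(example)
```

Because the value is normalized for both gene length and library size, it allows the expression of two genes, such as SLC3A1 and SLC7A9, to be compared within one sample.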