Introduction

Rice (Oryza sativa L.) originated in tropical and subtropical regions, but it can grow in a wide range of latitudes and is even cultivated at latitudes above 40°N1. The adaptation of rice to different latitudes has been studied by examining flowering-related genes because latitude affects daylength2,3. The adaptation of crops to new latitudes requires the retuning of flowering time to maximize yield potential. In potato (Solanum tuberosum), soybean (Glycine max), and maize (Zea mays), flowering genes were identified that improved adaptation to the limited summer growing season at higher latitudes during plant domestication4,5,6,7. Unlike soybean, which originated in temperate regions and adapted to tropical regions, rice originated in tropical or southern regions and has adapted to northern regions, i.e., from low/middle to high latitudes7. To date, studies of the adaptation of rice to high latitudes have focused on flowering genes associated with ecotype8, and few other genes associated with this adaptation have been explored9. Therefore, more research is needed to understand the effects of latitudinal differences in temperature on rice.

Lipids play critical roles in signaling, energy storage, and membrane formation10. As lipids are essential membrane components in organs, tissues, and cells, they directly affect plant interactions with the environment11. The lipid-related Gly-Asp-Ser-Leu (GDSL) esterase/lipase proteins (GELPs) are essential enzymes in various physiological and developmental processes12 (Table 1). These proteins have both esterase and lipase activities, function in the hydrolysis of ester bonds, and can act on lipids13. GELPs function in lipid metabolism, including the breakdown of lipids and the release of fatty acids12, and play key roles in plant responses to biotic and abiotic stress13. Rice contains 115 OsGELP genes that function in multiple processes, such as pollen exine development (OsGELP110 and 115)14,15; cell wall formation (OsGELP33)16 and (OsGELP62 or DARX1)17; lipid homeostasis (OsGELP34)14,18,19; disease resistance20; and secondary metabolism, such as pseudocholinesterase (OsGELP91) and acetylcholinesterase (OsGELP92) metabolism21. In addition, some GDSL LIPASE/ESTERASE genes are related to disease immunity (OsGLIP1, 2, and 78)22,23 and plant responses to water loss (OsGELP112 or WDL1)24 and UV-B radiation (OsGLIP1)25. Many GELP genes are closely associated with plant responses to abiotic stress26,27,28. However, the roles of GELP genes in environmental and regional adaptability in rice have not been investigated. The key to resolving this issue is to link the haplotypes of each gene to environmental and regional adaptability.

Table 1 GELP genes in rice whose functions were identified in previous studies.

In this study, we explored how rice adapts to a wide range of latitudes. Since lipids play crucial roles in various physiological processes in plants, including membrane structure, energy storage, and signaling, we focused on specific haplotypes of OsGELP genes in rice accessions grown in high-latitude regions and explored their roles in the adaptation of rice to these conditions. We also examined the organ-specific expression patterns of these genes to explore how different rice organs adapt to high latitudes. Our findings shed light on rice varieties with haplotypes of target OsGELP genes that have played a role in their adaptation, which could contribute to the development of region-specific rice varieties that thrive in harsh environmental conditions.

Results

Selection of OsGELP genes carrying HLHs

We identified 115 potential OsGELP genes from both RAP-DB and the list assembled by Chepyshko et al. (2012). These genes are located across the 12 rice chromosomes, with several present in gene clusters generated by GGT.229 (Figure S1, Supplementary Data 1). We analyzed these 115 OsGELP genes using the SNP-seek database of the 3000 Rice Genomes (3000 RG) data (https://snpseek.irri.org/snp.zul) (Fig. 1). All data, such as latitude, subpopulation, SNPs, and country of origin, were obtained from the SNP-seek database. Each of the 115 OsGELP genes had diverse haplotypes, with the number of haplotypes ranging from 2 to 17 and an average of approximately 5.8 haplotypes per gene. Only haplotypes present in more than 30 accessions were considered for further analysis30. We considered China (39.9°N), Japan (35.70°N), Korea (37.55°N), the United States (38.9°N), and European countries (40°N to 60°N) to be high-latitude countries. Many haplotypes of the OsGELP genes were detected under specific geographical conditions characterized by different average latitudes (Supplementary Data 2). For the global latitude analysis (GLA), the average latitude of each haplotype was calculated based on the latitude of the capital city of each country where the haplotype was present.

Fig. 1

Workflow of the general and specific analyses used to identify latitude-related genes among the 115 OsGELP genes across the 3000 RG data.

The average latitudes of the haplotypes of all OsGELP genes ranged from 6.58°N to 37.16°N (Supplementary Data 2). We defined high-latitude haplotypes (HLHs) as haplotypes predominant at an average latitude above 35°N. When we compared the distributions of genes with and without HLHs in the 3000 RG data, HLHs were found in varieties from high-latitude regions. OsGELP74 lacks an HLH; its haplotypes are found at average latitudes from 15.13°N to 25.23°N, representing the optimal latitude for rice growth (Figures S2A and S2C). All haplotypes of OsGELP74 are widely and randomly distributed across latitudes in the 3000 RG data. Conversely, OsGELP65 has two HLHs, with different haplotypes showing different distributions (Figures S2B and S2D). Varieties carrying either HLH hap_184 or hap_369 in OsGELP65 are present almost exclusively in high-latitude regions (Figures S2B and S2D), indicating that both HLHs are specific high-latitude genotypes that might be responsible for adaptation to high latitudes. Among the 115 OsGELPs, 12 genes had haplotypes predominant at latitudes above 35°N: OsGELP4, 18, 19, 42, 58, 60, 64, 65, 66, 90, 104, and 107 (Fig. 2, Table 2).

Fig. 2

GLA in 12 OsGELP genes. A. OsGELP4; B. OsGELP18; C. OsGELP19; D. OsGELP42; E. OsGELP58; F. OsGELP60; G. OsGELP64; H. OsGELP65; I. OsGELP66; J. OsGELP90; K. OsGELP104; L. OsGELP107. Blue dots, Japanese accessions; red dots, Indonesian accessions. Different letters indicate significant differences based on Duncan’s test (p < 0.05). Haplotypes identified in more than 30 accessions from the 3000 rice genomes were considered for analysis.

Table 2 Ten selected OsGELP genes containing high latitude haplotypes evaluated by GLA and SLA.

Selection of HLHs for SLA

A limitation of GLA is that the latitude of a capital city does not reflect the exact latitude of origin of a rice variety in countries that span a wide range of latitudes from north to south, such as Japan, China, and the United States. To make the latitude data for each haplotype as accurate as possible for the place where a variety was created, we conducted a specific latitude analysis (SLA) using specific latitude data for two countries: Indonesia and Japan. For SLA, we employed 70 Indonesian and 30 Japanese accessions. None of the 70 Indonesian accessions contained HLHs of OsGELP genes or latitude-specific haplotypes, whereas the 30 Japanese accessions had HLHs for most of the 12 genes selected by GLA (Fig. 2, Supplementary Data 3 and 4). Specifically, SLA revealed that in the Japanese accessions, OsGELP18, 19, 42, 58, 60, 64, 66, 90, and 107 each had one HLH, and OsGELP65 had two HLHs: hap_184 and hap_369 (Fig. 3). SLA thus identified 11 HLHs in 10 genes that were also identified by GLA (hereafter “selected genes”), whereas two genes, OsGELP4 and OsGELP104, did not have HLHs in the Japanese accessions (Fig. 3A and K, Table 2).

Fig. 3

SLA in ten HLH genes. A. OsGELP4; B. OsGELP18; C. OsGELP19; D. OsGELP42; E. OsGELP58; F. OsGELP60; G. OsGELP64; H. OsGELP65; I. OsGELP66; J. OsGELP90; K. OsGELP104; L. OsGELP107. Blue dots, Japanese accessions; red dots, Indonesian accessions. The y axis shows the average latitude of each haplotype. Different letters indicate significant differences based on Duncan’s test (p < 0.05). Haplotypes identified in more than 30 accessions from the 3000 rice genomes were considered for analysis.

Confirmation of HLHs in high-latitude accessions

Owing to its wide range of latitudes (22°N to 43°N), Japan has a rich variety of rice accessions in each prefecture. To assess whether HLHs are primarily found in the northern part of Japan, we focused on Hokkaido, which has the lowest ambient temperatures in Japan. We examined the distributions of HLHs of the 10 selected OsGELP genes in 22 Hokkaido varieties. Each variety carried HLHs for 4 to 9 of the 10 OsGELP genes. Among the 10 genes with HLHs, the HLHs of OsGELP42 and OsGELP58 were found in 50% of the 22 Hokkaido varieties, representing the lowest proportion. By contrast, the HLH of OsGELP60 was present in all 22 varieties (Fig. 4). The HLHs of the 10 selected genes were confirmed to be present only in the japonica and temperate japonica rice subpopulations and not in indica (Figure S3). Fisher’s exact test confirmed that the HLHs were significantly associated with the japonica and temperate japonica rice subpopulations. These results suggest that these haplotypes contribute to the adaptation of high-latitude rice varieties to northern regions.

Fig. 4

Presence of HLHs in the ten HLH genes in Hokkaido rice varieties. Grey boxes indicate the presence of an HLH, and the numbers along the x and y axes are the total numbers of HLHs.

Haplotype networks of genes with HLHs

GLA and SLA selected 11 HLHs for 10 genes: OsGELP18, 19, 42, 58, 60, 64, 65, 66, 90, and 107 (Figs. 2 and 3). We created a haplotype network for each of these genes to illustrate the relationships among haplotypes, their origins, and the population size of each haplotype (Fig. 5). The largest circle in each network represents the main haplotype, found primarily at mid-latitudes, averaging between 18°N and 25°N. HLHs for seven of the selected genes (OsGELP18, 19, 42, 58, 64, 65, and 90) were directly derived from the main haplotype (Figs. 5A, B, C, D, F, G, and I), while the remaining three genes (OsGELP60, 66, and 107) had HLHs branching from minor haplotypes of the network (Figs. 5E, H, and J). The genetic variations of these HLHs could have facilitated adaptation to different environmental conditions at specific latitudes. Among the proteins encoded by the first seven genes with HLHs, a few amino acid changes might have occurred that rapidly enhanced adaptation to high latitudes. OsGELP65 has two HLHs, hap_184 and hap_369, which are closely positioned in the haplotype network (Fig. 5G). This proximity points to recent diversification from a common evolutionary origin and suggests that the two haplotypes either provide similar functional benefits or were selected under comparable environmental pressures, which might be related to the functions of GELPs in plants.

Fig. 5

Haplotype networks of ten HLH genes. A. OsGELP18; B. OsGELP19; C. OsGELP42; D. OsGELP58; E. OsGELP60; F. OsGELP64; G. OsGELP65; H. OsGELP66; I. OsGELP90; and J. OsGELP107. Each circle represents a haplotype, with its size proportional to the number of accessions sharing that haplotype (circle scale: 10 to 1 samples or accessions). Lines on the connecting branches represent SNP mutations. Colors within each circle correspond to different rice subpopulations, as indicated in the legend (e.g., admix, aus, indica, japonica subgroups, subtropical, temperate, tropical). Red dashed boxes highlight haplotypes associated with cold tolerance (HLH).

Phylogenetic analysis and amino acid changes related to haplotypes

We performed phylogenetic analysis of GELP genes in rice and other plant species. The OsGELP genes clustered with other GDSL genes from diverse plant species, suggesting that they might be evolutionarily conserved and play similar roles among plants. Neighboring genes from other species (e.g., AtFXG1 from Arabidopsis thaliana, ZmAChE from maize, and BnSCE3 from Brassica napus) form distinct clades, reflecting functional or evolutionary divergence (Figure S4). Among the 10 selected genes, 4 pairs of OsGELP genes (OsGELP18 and 19, OsGELP42 and 90, OsGELP64 and 65, and OsGELP60 and 66) were identified as paralogs (Figure S4).

We analyzed amino acid changes resulting from SNP variations in each haplotype of these paralogous genes. As shown in Fig. 6, the mutations observed between the major haplotype sequences and the HLHs occurred at diverse amino acid positions. For example, the amino acid changes in OsGELP19 and OsGELP18 occurred in different motifs (Figure S5). Despite these mutations, we identified two conserved motifs across the 10 selected genes (Figure S5). Mutations or amino acid changes within these genes often occurred in distinct motifs or positions, suggesting that the mutations occurred independently in each HLH.

Fig. 6

Comparisons of nucleotide and amino acid variations in the HLH genes OsGELP18, OsGELP19, and OsGELP64. A. Haplotype variation of OsGELP18 in 22 Hokkaido varieties and three additional cultivars. Yellow rows indicate HLH varieties. SNP positions on Chr. 1 are: [1 = 26,200,500; 2 = 26,200,740; 3 = 26,200,744; 4 = 26,200,829; 5 = 26,200,859; 6 = 26,200,980; 7 = 26,201,066]. Haplotype 7 contains a “TA” insertion at position 26,201,067 relative to the Nipponbare reference (“nf”: not found). Among the 22 Hokkaido varieties, 18 carried HLH-type haplotypes. A polymorphic amino acid site at position 291 shows glutamic acid (E) in haplotypes 3 and 7 (non-HLH, red arrow, lower-latitude accessions) and lysine (K) in haplotype 26 and Nipponbare (HLH, blue arrow, higher-latitude accessions). B. Haplotype variation of OsGELP19 in 22 Hokkaido varieties and six additional cultivars. Yellow rows indicate HLH varieties. SNP positions on Chr. 1 are: [1 = 26,203,092; 2 = 26,203,969; 3 = 26,204,143; 4 = 26,204,442; 5 = 26,205,695; 6 = 26,205,789; 7 = 26,206,196]. Haplotype 83 contains a “CACGTTCGCT” insertion at position 26,206,197 relative to the Nipponbare reference. Among the 22 Hokkaido varieties, 17 carried HLH-type haplotypes. A polymorphic amino acid site at position 414 shows phenylalanine (F) in haplotypes 811 and 7 (non-HLH, red arrow, lower latitudes) and leucine (L) in haplotype 646 and Nipponbare (HLH, blue arrow, higher latitudes). C. Haplotype variation of OsGELP64 in 22 Hokkaido varieties and six additional cultivars. Yellow rows indicate HLH varieties. SNP positions on Chr. 5 are: [1 = 6,825,409; 2 = 6,825,625; 3 = 6,825,647; 4 = 6,825,677; 5 = 6,825,698; 6 = 6,825,714; 7 = 6,825,793; 8 = 6,825,867; 9 = 6,826,118; 10 = 6,826,745; 11 = 6,826,769; 12 = 6,827,210]. Among the 22 Hokkaido varieties, 17 carried HLH-type haplotypes. D. Amino acid variation in OsGELP64 haplotypes. 
Polymorphic sites are located at position 122 (alanine [A] in HLH haplotype 200 and Nipponbare vs valine [V] in non-HLH haplotypes 1, 81, 261, 72, and 167) and position 326 (histidine [H] in HLH haplotype 200 and Nipponbare vs arginine [R] in non-HLH haplotypes). In addition, haplotype 167 carries threonine (T) at position 415, while others have isoleucine (I). Red arrows indicate non-HLH haplotypes, and blue arrows indicate HLH haplotypes. E. Predicted protein 3D structural differences between HLH-type haplotype 200 and non-HLH-type haplotype 81 of OsGELP64 (AlphaFold DB). The polymorphism at amino acid position 122 (A↔V) alters the arrangement of α-helices (spirals) and β-sheets (green). Numbers in parentheses indicate the average latitude of each haplotype.

Notably, SNPs between the main haplotypes and the HLHs in OsGELP18 and OsGELP19 resulted in amino acid substitutions with similar structural properties. These paralogous genes were found to be homologous based on their encoded amino acid sequences (Figs. 6A and B). Haplotype analysis of 22 Hokkaido rice varieties revealed similarities between these two genes, as shown in the haplotype grouping. The rice varieties Akage, Hayayuki, Hokkaido, Nourin 15, and Wasefukoku exhibited haplotypes distinct from those of the 17 other varieties. This was further confirmed by the presence of a 2-bp insertion in OsGELP18 at genomic position Chr. 1 (26,200,736–26,200,737) and a 10-bp insertion in OsGELP19 at genomic position Chr. 1 (26,203,718–26,203,727) (Figs. 6A and B). SNPs in the exons of OsGELP18 and OsGELP19 led to single amino acid substitutions. In OsGELP18, glutamic acid (E) in hap_7 and hap_3 was replaced with lysine (K) in the HLH (hap_26). Similarly, in OsGELP19, phenylalanine (F) in hap_7 and hap_811 was replaced with leucine (L) in the HLH (hap_646). These single amino acid changes are expected to result in altered H-bonding (Fig. 6).

In addition to haplotypes encoding single amino acid substitutions, OsGELP64 encodes a protein with five amino acid differences across six haplotypes. For example, the HLH (hap_200) encodes a protein in which amino acid position 255 contains alanine (A), whereas it contains valine (V) in proteins encoded by other haplotypes. This change occurred within the alpha-helix region (Figs. 6C and D). Alpha-helical regions tend to be more robust to mutations than beta-strands, as helices tolerate more sequence variation without disruption of their secondary structure. This robustness is primarily due to the higher number of interacting residues in helices than in strands or coil regions31. Furthermore, OsGELP58 (Figure S6) and OsGELP66 (Figure S7) exhibited amino acid changes resulting in differences in H-bonding between the major haplotypes and the HLH. We also observed SNP changes in intron regions in genes such as OsGELP65 (Figure S8).

Organ-specific OsGELP gene expression

We classified 105 OsGELP genes based on their organ-specific expression patterns (Supplementary Data 5) by constructing a heatmap of their normalized relative gene expression levels (normalized signal intensity [log2]) in leaf, root, stem, and anther tissue (data were not available for OsGELP13, 28, 36, 39, 41, 48, 57, 60, 104, or 115) (Figures S9 and S10). We generated the gene expression profiles from RiceXPro data using a single microarray platform with probes based on manually curated gene models in RAP-DB and full-length rice cDNA sequence information in the KOME database. The 105 OsGELP genes showed various levels of expression in leaf blades, leaf sheaths, roots, stems, and anthers (Supplementary Data 5). We detected significant differences in expression among twelve organs based on the average expression level in each organ, as determined by Duncan’s test (Figure S11). Overall, these genes were expressed at the highest levels in roots and stems, with several genes showing peak expression in these organs. OsGELP18 and OsGELP19 showed notably high expression in roots, while OsGELP58 and OsGELP66 exhibited elevated expression in stems (Figure S11A).

In Figure S11B, the organ with the highest expression level for each gene is highlighted in blue. For instance, OsGELP18 was expressed at the highest levels in roots, while OsGELP19 expression peaked in stems. These patterns suggest that OsGELP genes might have specialized functions depending on the organ in which they are most highly expressed. The overall expression profiles point to functional specialization of OsGELP genes in rice. The roots and stems were the primary sites of expression for many of these genes, suggesting they might function in structural support, nutrient uptake, or other root- and stem-specific processes. These results indicate which organs might require adaptation at high latitudes. At the same time, these genes might be indispensable for the development of the corresponding organs. Genes with HLHs that are expressed in roots and stems might strongly contribute to the adaptation of rice to high latitudes.

Discussion

Latitude-dependent diversification of OsGELP genes in rice

We identified 11 high-latitude haplotypes (HLHs) in 10 OsGELP genes—OsGELP18, 19, 42, 58, 60, 64, 65, 66, 90, and 107—in a large-scale haplotype analysis using the 3000 Rice Genomes dataset (Fig. 2 and Supplementary Data 2). Although latitude-associated haplotypes have previously been reported in rice32, this is, to our knowledge, the first study to reveal a large set of latitude-specific haplotypes within a single gene family, particularly one associated with lipid metabolism. The strong representation of HLHs in the temperate japonica subpopulation implies that OsGELP genes contribute to the cold-adaptive metabolic features of rice grown at higher latitudes. Given the role of GELPs in lipid degradation and remodeling26, their diversification may represent a molecular basis for plasticity under cold stress conditions. To validate the association between HLHs and geographic distribution, we compared accessions from Indonesia (low-latitude origin) and Japan (high-latitude origin). Consistent with environmental adaptation, accessions from Japan—particularly Hokkaido—harbored unique HLHs not present in tropical varieties. These results reinforce the hypothesis that HLHs represent adaptive alleles that support survival in stressful, high-latitude environments.

Root and stem expression supports functional adaptation to cold

Transcriptome analyses revealed that four of the ten HLH-containing OsGELP genes were predominantly expressed in roots and two in stems. Both organs are critical interfaces with abiotic stress: roots encounter low soil temperatures directly, while stems experience ambient air conditions. The stem’s exposure to fluctuating temperatures may require specific lipid-mediated protective mechanisms, yet its role in cold adaptation has been underexplored. In rice, which is of tropical origin, root and stem development are highly sensitive to temperatures below 20 °C. Cold-sensitive cultivars show inhibited root hair formation and reduced root biomass under stress33. In contrast, cold-tolerant lines maintain better root morphology, possibly due to specialized lipid composition. Notably, OsGELP64 and OsGELP65, highly expressed in roots (Figure S11), are homologous to AmGDSH1, a carboxylesterase known to function in root-based herbicide metabolism34. This suggests a conserved root-specific detoxification and stress response role for these GELPs. The strong root-specific expression of HLH-containing OsGELP genes implies that cold adaptation in rice involves remodeling of lipid metabolism in underground tissues. This finding aligns with reports from other species (e.g., wheat) in which cold stress alters lipid accumulation profiles35.

OsGELP genes represent novel candidates for high-latitude adaptation

Most previously identified cold-tolerance genes in rice, including qCTB7, bZIP73, and Ctb1, lack HLHs in our analysis, indicating that traditional cold-tolerance loci alone do not account for latitude-associated adaptation (Table 3, Figures S12 and S13). Among 14 known cold-tolerance genes, only OsMPK3, LTG1, and OsAPX1 exhibited HLHs (Table 3, Figures S12 and S13). Likewise, only Ghd7 and DTH8 among seven flowering-related genes analyzed had HLHs (Table 4 and Figure S14). This contrast highlights the underappreciated role of lipid-related genes, particularly OsGELPs, in high-latitude adaptation. All ten HLH-associated OsGELPs displayed amino acid substitutions in key domains, some of which alter predicted hydrogen bonding interactions. Such changes may impact protein–lipid interactions, especially for membrane-bound enzymes36. Enhanced hydrogen bonding stability could enable these proteins to maintain function under cold stress.

Table 3 Classification of the known cold tolerance genes based on the presence or absence of HLHs.
Table 4 Classification of the known photoperiodic flowering genes based on the presence or absence of HLHs.

Latitude-driven environmental variables such as temperature and photoperiod influence membrane lipid composition. Rice grown at higher latitudes tends to accumulate more unsaturated fatty acids to maintain membrane fluidity in colder climates37. Thus, we propose that HLH-associated OsGELP genes contribute to the remodeling of lipid metabolism to stabilize membranes under high-latitude environmental conditions.

Conclusion

Our findings reveal that OsGELP genes, particularly the ten harboring HLHs, are promising candidates for understanding and improving high-latitude adaptation in rice. Their distinct haplotypes, expression in stress-sensitive organs, and amino acid modifications point to key roles in lipid-mediated cold tolerance. These genes may offer novel targets for breeding climate-resilient rice varieties.

Materials and methods

Genome-wide identification of OsGELP genes in rice

Among the 115 OsGELP genes in the rice genome, the chromosome positions of 113 OsGELP genes were obtained from RAP-DB (https://rapdb.dna.affrc.go.jp); the sequences of OsGELP36 and OsGELP73 were unavailable, as they were not present in RAP-DB (Supplementary Data 1). All data, such as average latitude and haplotype (Supplementary Data 2), subpopulation (Supplementary Data 6), single-nucleotide polymorphisms (SNPs), and country of origin, were obtained from the IRRI SNP-seek Database (https://snpseek.irri.org). Using the SNP-seek data, haplotypes were defined based on SNPs in each OsGELP gene sequence.
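The step of defining haplotypes from SNPs can be illustrated with a minimal sketch, assuming that each accession is described by its alleles at the SNP sites of one gene and that accessions with identical allele combinations share a haplotype. The function name and the toy data are hypothetical and are not taken from the SNP-seek database.

```python
from collections import defaultdict


def define_haplotypes(snp_calls):
    """Group accessions that carry an identical combination of SNP alleles.

    snp_calls maps an accession name to a sequence of alleles, one per SNP site
    of a single gene. Returns a dict mapping each allele combination (a tuple)
    to the sorted list of accessions that carry it.
    """
    groups = defaultdict(list)
    for accession, alleles in snp_calls.items():
        groups[tuple(alleles)].append(accession)
    return {alleles: sorted(members) for alleles, members in groups.items()}


# Hypothetical toy data: three accessions, three SNP sites
calls = {
    "acc1": ("A", "G", "T"),
    "acc2": ("A", "G", "T"),
    "acc3": ("C", "G", "T"),
}
haplotypes = define_haplotypes(calls)
```

In this sketch, missing or heterozygous calls are not treated specially; how such calls were handled is not stated here.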

Two analyses were performed: a general analysis and a specific analysis. Global latitude analysis (GLA) used general latitude data, i.e., the latitude of the capital city of each country of origin, for accessions in the 3000 Rice Genomes (RG) data. Specific latitude analysis (SLA) used specific latitude data from the prefecture of origin of the Indonesian and Japanese accessions (Fig. 1). The genes were classified as either latitude dependent or latitude independent. The latitude-dependent genes were divided into two groups: genes with high-latitude haplotypes (HLHs), predominant at an average latitude > 35°N, and genes lacking HLHs. Latitude-independent haplotypes were found in accessions originating at latitudes ranging from 0° to > 40°N. High-latitude countries included China (39.9°N), Japan (35.70°N), Korea (37.55°N), the United States (38.9°N), and those in Europe (40°N to 60°N). A schematic representation of our methodology is shown in Fig. 1.

Extraction of haplotype and latitude data via GLA

As described above, GLA was performed using general latitude data or the location of the capital city in each country using 3000 RG data. Haplotypes present in more than 30 accessions were considered for analysis30. The average latitude was calculated based on the latitude of the capital city of the country where the haplotype was present. Latitude data were collected from the SNP-seek database. OsGELP genes containing HLHs and showing significant differences based on Duncan’s test (p < 0.05) compared with another haplotype were selected during GLA.
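The filtering and averaging steps of GLA can be sketched as follows. This is a minimal illustration, assuming one (haplotype, capital-city latitude) record per accession; the function name, haplotype labels, and record counts are hypothetical, and the significance testing with Duncan’s test is not reproduced.

```python
from collections import defaultdict
from statistics import mean

MIN_ACCESSIONS = 30   # a haplotype is retained if present in more than 30 accessions
HLH_THRESHOLD = 35.0  # average latitude (degrees N) above which a haplotype is treated as an HLH


def summarize_haplotypes(records, min_accessions=MIN_ACCESSIONS, threshold=HLH_THRESHOLD):
    """Summarize haplotypes from (haplotype_id, capital_latitude) pairs.

    Returns {haplotype_id: (n_accessions, average_latitude, is_hlh)} for the
    haplotypes that pass the accession-count filter.
    """
    latitudes = defaultdict(list)
    for haplotype_id, latitude in records:
        latitudes[haplotype_id].append(latitude)

    summary = {}
    for haplotype_id, values in latitudes.items():
        if len(values) > min_accessions:
            average = mean(values)
            summary[haplotype_id] = (len(values), average, average > threshold)
    return summary


# Hypothetical toy records; southern latitudes are written as negative values
records = [("hap_A", 35.70)] * 31 + [("hap_B", -6.2)] * 40 + [("hap_C", 39.9)] * 5
result = summarize_haplotypes(records)
```

Here hap_C is dropped because it occurs in only five accessions, even though its average latitude is high.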

Extraction of haplotype and latitude data via SLA

As described above, SLA used the latitude of the prefecture of origin of each Indonesian and Japanese accession from the 3000 RG database to indicate the exact origins of the varieties. The latitude data were obtained from the Germplasm Catalogue 2010 from the Indonesian Ministry of Agriculture for the Indonesian accessions and the National Agriculture and Food Research Organization Genebank for the Japanese accessions. Seventy Indonesian accessions were used to represent low-latitude (11°S to 6°N) varieties, and 30 Japanese accessions were used to represent middle- to high-latitude (20°N to 45°N) varieties; these varieties are listed in Supplementary Data 3 and 4. Fisher’s exact test was also conducted during SLA to identify significant associations within the indica and japonica subpopulations. Genes containing HLHs from accessions grown at an average latitude above 35°N in the japonica subpopulation were selected for further analysis.
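For readers unfamiliar with Fisher’s exact test, the calculation for a 2 × 2 table (for example, subpopulation membership versus presence of an HLH) can be sketched with the standard library alone. This is a generic two-sided implementation shown under the assumption that a 2 × 2 layout was used; the counts below are hypothetical, and the software actually used for the test is not specified here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    The p-value is the sum of the hypergeometric probabilities of all tables
    with the same margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def table_prob(x):
        # Probability of observing x in the top-left cell with fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    x_min = max(0, col1 - row2)
    x_max = min(row1, col1)
    p_observed = table_prob(a)
    # The small relative tolerance guards against floating-point ties
    p_value = sum(
        p
        for p in (table_prob(x) for x in range(x_min, x_max + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts: rows = subpopulation of interest / all other accessions,
# columns = carries the HLH / does not carry the HLH
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```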

Extraction of haplotype data from Hokkaido varieties based on next-generation sequencing data

Haplotype analysis of 22 varieties from Hokkaido (representing the northern part of Japan, with a latitude of 43°N) listed in Supplementary Data 7 was conducted on 10 selected OsGELPs (see Results). Whole-genome sequencing data for the 22 Hokkaido rice varieties are available from Fujino et al. (2021). The next-generation sequencing (NGS) data were mapped to the Nipponbare reference genome using Bowtie 2, and SNPs were called using GATK4 HaplotypeCaller. The resulting SNPs were further analyzed using TASSEL 5 and IGV software, with Nipponbare as the reference. The haplotype compositions of the 22 Hokkaido rice varieties were grouped based on SNPs obtained from NGS data, and the haplotypes and average latitudes were analyzed as described for the 3000 RG data.

Construction of a haplotype network

The SNP-seek database was used to conduct haplotype analysis of genes containing HLHs to investigate the relationship between HLHs and other haplotypes in the network. The haplotype sequences were created by aligning SNPs using ClustalW in MEGA 1138. The haplotype network for each gene was constructed to analyze the genealogical relationships among the haplotypes using PopArt. Haplotype diversity was calculated with DnaSP software version 639. The rice regional accessions were sorted into 12 subpopulations: aro, aus, admix, ind1A, ind1B, ind2, ind3, indica-X (indx), japx, subtrop, temp, and trop. In the SNP-seek Database, rice subpopulations aro, aus, and admix come from different regions, with aro being fragrant varieties from South Asia, aus being early-maturing and drought-tolerant from Bangladesh and Eastern India, and admix being hybrids found worldwide. indica subgroups (ind1A, ind1B, ind2, ind3, and indx) are mainly in South and Southeast Asia, with indx being mixed indica types. japonica admixed (japx) varieties combine japonica genetics and are found in East Asia and beyond. Subtropical japonica (subtrop) grows in mild climates, temperate japonica (temp) thrives in cooler regions with cold tolerance, and tropical japonica (trop) is suited to hot, humid areas with longer grains (SNP-seek database).
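Haplotype diversity is commonly computed with Nei’s estimator, Hd = n/(n − 1) × (1 − Σ p_i²), where n is the number of sequences and p_i is the frequency of haplotype i. The sketch below assumes that this standard estimator is the quantity meant; the function name and counts are hypothetical.

```python
def haplotype_diversity(counts):
    """Nei's haplotype diversity from the number of accessions per haplotype.

    counts is a sequence of accession counts, one entry per haplotype.
    """
    n = sum(counts)
    if n < 2:
        raise ValueError("at least two sequences are required")
    sum_squared_freq = sum((count / n) ** 2 for count in counts)
    return n / (n - 1) * (1 - sum_squared_freq)


# Hypothetical example: two haplotypes carried by five accessions each
hd = haplotype_diversity([5, 5])
```

A gene with a single haplotype gives a diversity of 0, and the value approaches 1 as accessions are spread evenly over many haplotypes.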

Phylogenetic analysis

Multiple alignment (ClustalW) of the GDSL amino acid sequences in rice and other species was conducted using MEGA 11. Unrooted phylogenetic trees were built with the Neighbor Joining method. The phylogenetic tree was created using Poisson correction, pairwise deletion, and 1000 bootstrap replicates.
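The distance underlying a Neighbor Joining tree with Poisson correction and pairwise deletion can be illustrated for one pair of aligned protein sequences. This is a minimal sketch of the textbook formula d = −ln(1 − p), where p is the proportion of differing sites among positions that are ungapped in both sequences; it is not MEGA’s code, and the sequences are hypothetical.

```python
from math import log

GAP_SYMBOLS = {"-", "?"}  # symbols treated as missing data in this sketch


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance between two aligned amino acid sequences.

    Sites with a gap in either sequence are skipped (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for residue_a, residue_b in zip(seq_a, seq_b):
        if residue_a in GAP_SYMBOLS or residue_b in GAP_SYMBOLS:
            continue
        compared += 1
        if residue_a != residue_b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = differing / compared
    if p >= 1.0:
        raise ValueError("distance is undefined when all compared sites differ")
    return -log(1.0 - p)


# Hypothetical aligned fragments: six comparable sites, one difference
d = poisson_distance("GDSL-AK", "GDSLMAR")
```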

Gene expression profiling and protein structure analysis

The gene expression profiles based on the normalized signal intensities (log2) of 105 OsGELP genes were collected from the Rice Expression Profile Database (RiceXPro; https://ricexpro.dna.affrc.go.jp/); expression data for the 10 remaining OsGELP genes were not available (OsGELP13, 28, 34, 39, 41, 48, 57, 60, 104, and 115) (Figures S9 and S10). Amino acid data were collected from the UniProt database (https://www.uniprot.org/), and the AlphaFold Protein Structure Database (AlphaFold DB; https://alphafold.ebi.ac.uk)40 was used for protein structure predictions and to collect the protein models. The models were constructed using AlphaFold3, an artificial intelligence algorithm developed by DeepMind. The 3D protein structure was visualized using ChimeraX41.
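For the expression profiles, the organ in which each gene is most highly expressed can be found by a simple lookup over the normalized log2 signal intensities. The sketch below assumes the data have been arranged as one dictionary of organ values per gene; the gene names and values are hypothetical and are not RiceXPro data.

```python
def peak_organ(expression):
    """Return the organ with the highest signal for each gene.

    expression maps a gene name to a dict of {organ: normalized log2 signal}.
    Genes without any organ values are skipped.
    """
    return {
        gene: max(by_organ, key=by_organ.get)
        for gene, by_organ in expression.items()
        if by_organ
    }


# Hypothetical values for two placeholder genes
expression = {
    "gene_1": {"leaf": 7.1, "root": 11.4, "stem": 9.0, "anther": 6.2},
    "gene_2": {"leaf": 8.3, "root": 8.9, "stem": 10.7, "anther": 5.5},
}
peaks = peak_organ(expression)
```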