Introduction

Among the families of flowering plants, Phyllanthaceae is one of the largest, comprising 2 subfamilies, 10 tribes, 59 genera and around 2000 species1. The family was segregated from Euphorbiaceae2,3. Phyllanthus is one of its largest genera, represented worldwide by some 700 known species, mainly distributed in the tropics and subtropics4. Phyllanthus acidus (star gooseberry) is a small, erect, branched tree distributed widely in south-east Asia, where it grows in a wide variety of soils and climatic conditions without commercial cultivation.

The tree is small to medium in size, below 5 m in height, with a spreading crown. The branches are arranged alternately and each branch bears around 25–40 leaves. The leaves are borne in clusters, are ovate or ovate-lanceolate with pointed tips, and are green or dark green in colour. Flowers are small and light pink, and are borne on both new and old shoots. Fruits are usually star-shaped with 6 to 8 lobes, greenish yellow when young and light yellow when ripe. The fruit usually tastes sour with slight sweetness and contains one stony seed. Phyllanthus species have a wide range of medicinal properties, including anti-viral, anti-bacterial, antipyretic, anti-inflammatory, anti-hepatotoxic, anti-oxidant and analgesic activities5. They contain many bioactive compounds, including flavonoids, phenolics, tannins, alkaloids, kaempferol, gallic acid and quercetin, and have good antioxidant properties. In Manipur, north-east India, there is a rich diversity of star gooseberry trees, and the indigenous people of the state use different parts of Phyllanthus acidus in traditional medicine. Since there are no standard commercial methods of vegetative propagation, farmers propagate the plant from seed; as a result, the existing populations vary widely in morphology, plant shape, fruit colour, fruit size and the chemical constituents of the fruits. To date, however, no systematic research has been conducted on the genetic diversity of wild populations to identify elite types with horticulturally important fruit qualities.

In any breeding programme, exploration of the available germplasm and identification of suitable genotypes are crucial. To assist breeders of Phyllanthus acidus in north-east India, the present investigation aimed to elucidate the phenotypic diversity among natural populations and to select elite types with important horticultural characteristics that breeders could use as parents in developing new cultivars. The goals of this study were to characterize and quantify the genetic variability of Phyllanthus acidus accessions using principal component and cluster analysis, which will be crucial in developing an effective breeding strategy for genetic improvement. The study also aimed to provide theoretical references for selecting elite accessions and to guide breeders and other stakeholders in mainstreaming this lesser-known fruit of a biodiversity hot spot.

Materials and methods

To assess the diversity of Phyllanthus acidus Skeels in natural populations at the wild and semi-wild stage, and to identify the most promising accessions among them, this study was conducted during 2020–2021 in Manipur, north-east India (Fig. 1).

Fig. 1 Map of the experimental site.

Survey of germplasm

During the fruiting season of 2020–2021, a field survey was conducted at different locations in Manipur, north-east India, covering five districts, viz. Imphal East, Imphal West, Kakching, Bishnupur and Thoubal. Following the preliminary survey, 20 accessions were selected from their natural populations. Table 1 lists the accessions collected from the different locations, along with their latitude, longitude and elevation. The samples, comprising leaves, flowers and fruits, were brought to the post-harvest laboratory of the Department of Horticulture, Aromatic and Medicinal Plants, Mizoram University, Aizawl, Mizoram, India, for analysis of their physical parameters, chemical constituents, anti-oxidants and other bioactive compounds (Figs. 2, 3, 4, 5, 6, 7, 8).

Table 1 Accessions of Phyllanthus acidus Skeels. and their sources.

Fig. 2 Phyllanthus acidus tree.

Fig. 3 Phyllanthus acidus phyllotaxy.

Fig. 4 Phyllanthus acidus inflorescence.

Fig. 5 Phyllanthus acidus leaves.

Fig. 6 Phyllanthus acidus fruits on tree.

Fig. 7 Phyllanthus acidus fruit.

Fig. 8 Phyllanthus acidus cross-section.

Morphological characterization

From each location, 20 samples were collected at random for analysis of the morphological characteristics of the leaves, flowers and fruits. Standard protocols were followed for the estimation of morphological parameters. Digital vernier callipers were used to measure the length and breadth of the leaves, inflorescences, flowers, fruits and seeds. A Systronics-211 leaf area meter (CID Bioscience, USA) was used to measure leaf area.

Biochemical characterization

Standard protocols were used to analyse the biochemical characteristics of the ripe fruits of Phyllanthus acidus. Total soluble solids (TSS) were measured with a Zeiss hand refractometer. AOAC6 standard protocols were followed for measuring the titratable acidity and the reducing, non-reducing and total sugars of the fruit juice. The carbohydrate content of the fruits was estimated using the anthrone reagent, while protein content was estimated by Lowry’s method. The Folin–Ciocalteu method was followed for the estimation of phenols in the fruits. A UV–Vis spectrophotometer was used to estimate the total chlorophyll present in the leaves. The antioxidant activity of the fruits was estimated following the protocol of Molyneux7, based on the free radical scavenging ability of DPPH (1,1-diphenyl-2-picrylhydrazyl).
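
DPPH results are usually expressed as the percentage of radicals scavenged. The text does not state the formula, so the sketch below assumes the commonly used relation, scavenging (%) = (A_control − A_sample) / A_control × 100, with both absorbances read at the assay wavelength. The function name and the absorbance readings are hypothetical and serve only as illustration.

```python
def dpph_scavenging_percent(a_control: float, a_sample: float) -> float:
    """Percentage of DPPH radicals scavenged, from two absorbance readings.

    Assumed formula (not given in the text):
        (A_control - A_sample) / A_control * 100
    """
    if a_control <= 0:
        raise ValueError("control absorbance must be positive")
    return (a_control - a_sample) / a_control * 100.0


# Hypothetical readings: DPPH solution alone, and DPPH plus fruit extract.
activity = dpph_scavenging_percent(a_control=0.800, a_sample=0.076)
print(f"DPPH scavenging activity: {activity:.2f}%")
```

A lower sample absorbance means that more of the coloured radical has been quenched, so the computed percentage rises as the extract becomes more active.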

Data analysis

One-way analysis of variance (ANOVA) was performed to test the significance of differences among the accession means, and the means were separated by Duncan’s multiple range test (DMRT) using the software OPSTAT. Principal component analysis (PCA) was performed to determine the interrelations among the accessions using Origin 2024 learner’s edition. The open-source Geographic Information System (GIS) software QGIS, version 3.38.3, was used to plot the geographical coordinates and construct the study area map.
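
The analyses in this study were run in OPSTAT; the following is only a minimal sketch of the F statistic behind a one-way ANOVA, written in plain Python for illustration. The function name and the replicate values are hypothetical, and the mean separation step (DMRT) is not reproduced here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of replicate lists."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group size times the squared
    # deviation of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviations of each
    # replicate from its own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical fruit weights (g): three accessions, three replicates each.
replicates = [
    [4.0, 4.2, 4.1],
    [5.0, 5.1, 5.2],
    [4.5, 4.4, 4.6],
]
f_stat, df_b, df_w = one_way_anova_f(replicates)
# The tabulated critical F at P = 0.05 for (2, 6) df is about 5.14.
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

An F value above the tabulated critical value indicates that at least one accession mean differs from the others; a multiple range test is then needed to say which ones.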

Results and discussion

Physical characteristics

The physical characteristics of the leaves and flowers, and of the fruits and seeds, of Phyllanthus acidus are presented in Tables 3 and 4 respectively, with significance at P < 0.05. The data in Table 3 show that the leaf morphological characters varied significantly across the accessions. As shown in Table 2, leaf length ranged between 4.07 and 5.27 cm, with a standard deviation (SD) of 0.37 and a coefficient of variation (CV) of 7.65. PAS-9 recorded the maximum leaf length (5.27 cm). Similarly, the highest leaf breadth was recorded in PAS-14 (2.60 cm) and the lowest in PAS-16 (2.08 cm), with an SD of 0.16 and a moderate CV of 6.65. The leaf perimeter ranged from 5.22 cm (PAS-10) to 9.39 cm (PAS-14), with an average of 6.86 cm and a greater CV of 19.75 (Table 2).
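
The CV values quoted above relate the SD to the mean as CV (%) = SD / mean × 100. The sketch below illustrates this relation; it assumes the sample SD (n − 1 denominator), which the text does not specify, and the helper names and the three leaf lengths are hypothetical.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation (%) = sample SD / mean * 100."""
    return stdev(values) / mean(values) * 100.0


def sd_from_cv(cv, mean_value):
    """SD implied by a reported CV (%) and mean."""
    return cv * mean_value / 100.0


# Hypothetical leaf lengths (cm), used only to show the calculation.
hypothetical_leaf_lengths_cm = [4.0, 5.0, 6.0]
print(round(cv_percent(hypothetical_leaf_lengths_cm), 2))

# Leaf perimeter as quoted in the text: mean 6.86 cm, CV 19.75.
print(round(sd_from_cv(19.75, 6.86), 2))
```

Because the CV is unitless, it allows traits measured on different scales, such as leaf perimeter in cm and fruit weight in g, to be compared for relative variability.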

Table 2 Descriptive statistics of quantitative traits of P. acidus Skeels accessions.

Chlorophyll, often associated primarily with photosynthesis and overshadowed by the yellow/orange carotenoid pigments, may have significant but overlooked physiological effects in disease prevention8. Abundant in the green fruits and vegetables that are part of our diet, chlorophylls and their derivatives show therapeutic potential, offering antioxidant, antimutagenic, anti-cancer and anti-obesogenic benefits9. A study by Zhuo et al.10 revealed that chlorophyllin e6-mediated photodynamic therapy (PDT) induces apoptosis in human bladder cancer cells, possibly by inhibiting superoxide dismutase activity and generating reactive oxygen species. In the present investigation, the highest total chlorophyll was recorded in PAS-6 (1.67 mg g−1), followed by PAS-9 (1.65 mg g−1), PAS-14 (1.61 mg g−1) and PAS-20 (1.56 mg g−1), while the lowest was recorded in PAS-10 (1.06 mg g−1) (Table 3). The variation in leaf characters among the accessions may be due to several factors, including differences in plant age, nutrient availability and the agro-climatic conditions of the locations where the plants grow. Our results closely conform to those of Murali11, who reported that variation in leaf characters could result from differences in the environment. In addition, nutrient availability, the light intensity received and the amount of water absorbed differ between environments and may also affect the morphological parameters of the plant11.

Table 3 Leaf and Flower physical characteristics of the P. acidus Skeels accessions.

Floral qualities are distinctive to each accession and can serve as important morphological markers in breeding. The duration of 50% flowering ranged from 7.33 to 14.67 days, while the duration of flowering ranged from 23.67 to 33.67 days (Table 3), with CV values of 21.51 and 11.26 respectively. Barua12 reported that variation in the time of flowering may be due to differences in soil nutrient status and changes in climatic conditions, especially temperature and rainfall patterns. Flower length ranged between 0.15 and 0.25 cm; PAS-20 recorded the maximum (0.25 cm) and PAS-11 the minimum (0.15 cm), with a CV of 14.30 (Table 2). Similarly, flower breadth ranged between 0.40 and 0.56 cm, with a mean of 0.47 cm, an SD of 0.04 and a CV of 9.42. PAS-9 recorded the highest flower breadth (0.56 cm) and PAS-12 the lowest (0.40 cm). Our findings conform to those of Barua12, who reported similar results for flower length and breadth in Phyllanthus acidus.

Fruit quality is a multifaceted character that relies on several factors. Table 4 shows the significant variation among the physical parameters of the fruits. Among the studied accessions, fruit weight ranged from 3.27 to 5.12 g, with an SD of 0.59 and a CV of 14.96 (Table 2). PAS-20 recorded the maximum fruit weight (5.12 g), but it was statistically at par with PAS-1 (5.10 g) and PAS-6 (4.87 g). Among the various factors responsible for an increase in fruit weight, the biogenesis of naturally occurring growth-promoting substances, viz. auxins, gibberellins and cytokinins, plays a profound role13. The rich variation in fruit weight could be due to the highly heterozygous and diverse genetic backgrounds of the parents. Since all the plants are of seed origin, differences in their genetic makeup might have contributed to the variation in fruit weight among the genotypes. In our study, the maximum weight in PAS-20 might be due to greater synthesis of growth-promoting substances in the fruits of this accession. Similarly, PAS-20 recorded the maximum fruit length (19.28 mm) and fruit diameter (24.75 mm), while PAS-16 recorded the minimum fruit length (15.06 mm) and fruit diameter (20.21 mm). The variation in fruit length and diameter may be attributed to differences in the genetic features of the individual genotypes and in soil and climatic conditions. Hazarika et al.14,15 and Rozar et al.16 also observed statistical differences in fruit morphological characters among several accessions. The data presented in Tables 2 and 4 show significant differences in fruit volume among the accessions. The fruit volume ranged between 3.68 and 5.40 cc, with SD and CV values of 0.55 and 12.94 respectively.
Among all the accessions, the maximum fruit volume was observed in PAS-20 (5.40 cc), while the minimum was recorded in PAS-15 (3.68 cc). The variation in fruit volume among star gooseberry genotypes may be due to differences in their genetic makeup and in the prevailing agro-climatic conditions, i.e. the nutrients, soil, light, water and altitude under which the plants are growing. Singh et al.17, Hazarika et al.18 and Singh and Singh19 also reported variation in fruit volume among aonla genotypes from north-east India. Our findings conform to those of Hazarika and Ngurthankhumi20 for Phyllanthus acidus from north-east India. The accessions did not differ significantly in the specific gravity of the fruits; however, PAS-6 recorded the highest value (0.97 g/cc) and PAS-16 the lowest (0.88 g/cc). The pulp weight ranged between 2.81 and 4.64 g, with SD and CV of 0.59 and 17.48 respectively (Table 2), which confirmed the presence of higher variation in this trait. PAS-20 recorded the maximum pulp weight (4.64 g), while the lowest was observed in PAS-16 (2.81 g). The accessions varied significantly in edible portion, which ranged between 83.22 and 90.51%, with SD and CV of 2.05 and 2.40 respectively. The maximum edible portion was observed in PAS-20 (90.51%), while the lowest was recorded in PAS-18 (83.22%). Our findings conform to those of Rozar et al.16, who reported variation in edible portion percentage among a number of aonla accessions from north-east India.

Table 4 Fruit and Seed physical characteristics of the P. acidus Skeels accessions.

There was significant variation among the accessions with respect to seed parameters. The seed length ranged between 5.28 and 6.82 mm, with SD and CV of 0.43 and 7.23 respectively. The maximum seed length was recorded in PAS-20 (6.82 mm) and the minimum in PAS-15 (5.28 mm) (Table 2). The seed diameter ranged between 6.65 and 8.22 mm, with SD and CV of 0.45 and 6.15. Our findings conform to those of Barua12, who reported similar results for the seed length of Phyllanthus acidus. Hazarika et al.18 and Sharma21 reported seed diameters of aonla in the ranges of 7.50–11.00 mm and 9.27–12.34 mm respectively. The data presented in Table 4 show that seed weight ranged between 0.49 and 0.63 g; the maximum was recorded in PAS-9 (0.63 g), which was statistically at par with PAS-7 (0.62 g) and PAS-14 (0.62 g), while the minimum was recorded in PAS-20 (0.49 g). For an ideal variety, low seed weight and small seed size are desirable characters. These observations point to a relationship among pulp weight, seed weight and fruit weight: the genotypes with higher fruit weight may owe it to higher pulp weight and lower seed weight. This indicates that, when selecting a genotype on the basis of its fruit, the breeder should emphasize pulp content rather than fruit weight alone20. The seed volume ranged from 0.85 to 1.03 cc, with SD and CV of 0.06 and 6.05 respectively. The pulp:seed ratio ranged between 4.97 and 9.55 and exhibited higher variability, with a CV of 21.32; PAS-20 recorded the highest value (9.55) and PAS-18 the lowest (4.97). The pulp-to-stone ratio is a vital factor for breeders in identifying a superior genotype. These results align with previous studies by Hazarika and Laltluangkimi22 and Chandra et al.23.
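
The edible portion and the pulp:seed ratio are both simple ratios of the weights reported for each accession. As a rough illustration, the sketch below recomputes them from the PAS-20 means quoted in the text (fruit 5.12 g, pulp 4.64 g, seed 0.49 g). The function names are hypothetical, and the definitions used (pulp weight over fruit weight, and pulp weight over seed weight) are assumptions, since the text does not give the formulas.

```python
def edible_portion_percent(pulp_weight_g: float, fruit_weight_g: float) -> float:
    """Edible portion (%) assumed to be pulp weight / fruit weight * 100."""
    return pulp_weight_g / fruit_weight_g * 100.0


def pulp_seed_ratio(pulp_weight_g: float, seed_weight_g: float) -> float:
    """Pulp:seed ratio assumed to be pulp weight / seed weight."""
    return pulp_weight_g / seed_weight_g


# PAS-20 accession means as quoted in the text.
fruit_g, pulp_g, seed_g = 5.12, 4.64, 0.49

print(f"edible portion:  {edible_portion_percent(pulp_g, fruit_g):.2f}%")
print(f"pulp:seed ratio: {pulp_seed_ratio(pulp_g, seed_g):.2f}")
```

Ratios computed from accession means come out close to, but not identical with, the reported 90.51% and 9.55. One plausible reason is that the reported figures were averaged over ratios calculated for individual fruits, although the text does not confirm this.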

Biochemical characteristics

The data presented in Table 5 show that the Phyllanthus acidus accessions varied significantly in the biochemical parameters of the fruits. The moisture content ranged between 81.13 and 91.65%, with a CV of 3.88 (Table 2). The maximum moisture content was recorded in PAS-10 (91.65%), while the lowest was in PAS-20 (81.13%). The variation in moisture content may arise because the accessions grew under different soil and climatic conditions and differed in stage of maturity and time of harvest, which together affected the moisture content of the fruits. Our findings agree with those of Jahan et al.24, who reported that variation in moisture content among accessions might be due to several factors, such as the rainfall received, the available soil moisture and the type of soil, which together greatly influence the quantity of moisture present in the fruits. The data presented in Table 5 also revealed significant variation among the accessions in the juice content of the fruits, which ranged between 25.25 and 46.11%, with SD and CV of 7.12 and 20.75 respectively. The highest juice content was recorded in PAS-1 (46.11%), while the lowest was recorded in PAS-19 (25.25%). The rich variation in moisture and juice content could also be due to the highly heterozygous and diverse genetic backgrounds of the parents20.

Table 5 Biochemical characteristics of the P. acidus Skeels accessions.

Total soluble solids (TSS) in fruits are critical for evaluating ripeness, quality, flavour, nutrition and suitability for processing. TSS monitoring ensures that fruits and their products align with consumer expectations and industry standards25. Among all the accessions, the highest TSS was recorded in PAS-6 (9.26%) and the lowest in PAS-15 (6.76%). Breeders selecting superior genotypes should therefore give priority to the TSS content of the fruits, as it is the major factor in organoleptic acceptability. Singh et al.17, Hazarika et al.18, Sharma21 and Rozar et al.16 observed differences in TSS among aonla genotypes from north-east India. Mishra et al.13 also reported that, as fruit maturity advances, the TSS content of the fruits increases owing to the breakdown of polysaccharides and the formation of monosaccharides and simple sugars. The variation in TSS among the genotypes may be due to the different genetic makeup of the individual genotypes and to agro-climatic conditions. Fruits growing in arid regions with limited availability of water tend to accumulate more soluble solids and thus have higher TSS26, as also observed in the present study.

The titratable acidity of fruits enhances their flavour, aids in preservation, boosts nutritional value and potentially offers health benefits. This characteristic defines their culinary appeal, making them valuable in diverse cuisines and traditional remedies25. Titratable acidity exhibited significant variability, in the range of 1.2–1.81%, with a moderate CV of 12.12 (Table 2). Our results closely conform to those of Shukla et al.27, who also reported variation in acidity among different germplasm of aonla. In our study, the Phyllanthus acidus accessions grew under different agro-climatic conditions, which might explain the variation in titratable acidity among them. Mishra et al.13 reported that accessions with higher synthesis of organic acids have higher titratable acidity, whereas accessions with greater bioconversion of organic acids to sugars have lower titratable acidity. Our findings also conform to those of Rozar et al.16, who reported variation in titratable acidity among several aonla accessions from north-east India.

Vitamin C plays a pivotal role in maintaining overall health and well-being owing to its diverse range of benefits. With its antioxidant properties, immune-boosting effects, involvement in collagen formation and various other functions, it is a vital nutrient that contributes significantly to the health advantages associated with the consumption of fruits25. High-dose consumption of vitamin C is associated with reduced risks of developing cancers of the oral cavity, stomach, oesophagus, pancreas, cervix, breast and rectum28. Additionally, high-dose vitamin C has the potential to alleviate cancer-related pain and enhance the overall quality of life of cancer patients29. The highest ascorbic acid content was observed in PAS-9 (44.70 mg/100 g) and the lowest in PAS-3 (29.3 mg/100 g). The accessions may vary in their ascorbic acid content because of genetic differences among them. Accessions with a greater supply of hexose sugars from photosynthetic activity tend to have higher ascorbic acid content13. Ascorbic acid tends to increase with TSS because its precursor is glucose-6-phosphate30, a trend that was also observed in our study. Our findings conform to those of Rozar et al.16, who reported variation in ascorbic acid content among a number of aonla accessions from north-east India.

The accessions varied significantly in the total sugars of the fruits, which ranged between 3.45 and 5.37%, with SD and CV of 0.56 and 12.58 respectively (Table 2). The highest total sugars were recorded in PAS-1 (5.37%) and the lowest in PAS-3 (3.45%). Hazarika et al.18 and Singh et al.17 also reported variation in the total sugars of aonla accessions, in the ranges of 5.57–12.15% and 7.94–13.15% respectively. The reducing sugars of the fruits ranged from 2.45 to 4.01% (Table 5), with a higher CV of 13.05. PAS-14 and PAS-3 recorded the highest (4.01%) and lowest (2.45%) values for reducing sugars respectively. Sharma21 also reported variation in reducing sugars. There was no significant variation among the accessions in non-reducing sugars; however, PAS-13 and PAS-1 recorded the highest (1.75%) and lowest (1.04%) values respectively. Our findings conform to those of Barua12, who reported similar values of non-reducing sugars in Phyllanthus acidus. Significant variability was exhibited among the accessions for the sugar:acid ratio and the TSS:acid ratio of the fruits, which ranged between 2.00–4.34 and 4.01–7.59, with CVs of 23.31% and 22.84% respectively (Table 2). PAS-14 exhibited the highest sugar:acid ratio (4.34) and PAS-9 recorded the highest TSS:acid ratio (7.59) (Table 5). Debbarma and Hazarika31, Singh et al.17 and Singh and Singh19 reported significant variation in the sugar:acid ratio and TSS:acid ratio among different accessions of bael and aonla from north-east India.

The carbohydrates found in fruits serve multiple vital functions, including providing energy, supporting digestive well-being, managing blood sugar levels and facilitating nutrient absorption. They are a cornerstone of a well-rounded diet, enhancing both the overall nutritional profile and the culinary versatility of fruits25. The data presented in Table 5 revealed variation in total carbohydrates among the Phyllanthus acidus accessions. The total carbohydrates of the accessions ranged from 5.07 to 6.59 g 100 g−1, with SD and CV values of 0.52 and 9.01 respectively. Among the studied accessions, the highest value was recorded in PAS-6 (6.59 g 100 g−1), while the lowest was in PAS-15 (5.05 g 100 g−1). Suriyavathana and Subha32 and Barua12 also reported similar results for the carbohydrate content of Phyllanthus acidus fruits.

Proteins serve as fundamental structural and functional elements within all living cells. Approximately half of the protein content of the human body is found in muscle tissue, with the remainder distributed among bones, cartilage and skin33. In the present study, the protein content of the accessions ranged from 1.55 to 3.57 g 100 g−1, with SD and CV values of 0.54 and 20.34 respectively. The highest total protein content was recorded in PAS-6 (3.57 g 100 g−1), while the lowest was in PAS-5 (1.55 g 100 g−1). Our results conform to the findings of Rozar et al.16, who reported variation in protein content among a number of aonla accessions from north-east India.

Carotenoids have demonstrated their efficacy in both preventing and treating various diseases, owing to their non-toxic properties, as highlighted by Zare et al.34. Numerous studies have proposed that carotenoids exert their anti-cancer effects through a range of mechanisms, including acting as antioxidants and pro-oxidants, mitigating inflammation, inhibiting angiogenesis, modulating the immune system, promoting cell differentiation and curbing cell proliferation35,36. The carotenoid content of the fruits varied significantly among the accessions and ranged from 2111.94 to 2488.35 µg 100 g−1. PAS-1 exhibited the highest value (2488.35 µg 100 g−1) and PAS-2 the lowest (2111.96 µg 100 g−1). Fitriansyah37 reported that the strong yellow to orange colour of aonla fruits is due to the presence of carotenoids.

Among the studied accessions, the total phenols ranged between 25.30 and 34.82 mg GAE g−1, with a moderate CV of 8.89. PAS-14 exhibited the highest value (34.82 mg GAE g−1) and PAS-17 the lowest (25.30 mg GAE g−1). The variation in the total phenol content of the Phyllanthus acidus fruits may be due to differences in ripening stage and agro-climatic conditions. Our results conform to the findings of Rozar et al.16 and Barua12, who reported variation in protein content among several aonla and Phyllanthus acidus accessions from north-east India.

The data displayed in Table 5 revealed that the accessions varied significantly in the total flavonoids of the fruits, which ranged from 13.94 to 24.68 mg GAE g−1, with SD and CV values of 3.75 and 19.96 respectively. The highest flavonoid content was recorded in PAS-14 (24.68 mg GAE g−1), while the lowest was in PAS-5 (13.94 mg GAE g−1). Our findings conform to those of Foyzun et al.38, who reported similar results for the total flavonoid content of Phyllanthus acidus fruits. The highest DPPH antioxidant activity of the fruits was recorded in PAS-1 (90.54%) and the lowest in PAS-18 (75.52%). Pradeep et al.39 reported that the fruit extract of Phyllanthus acidus exhibited more than 90% inhibition. Our results also conform to those of Suriyavathana and Subha32, who reported that the DPPH inhibition percentage ranged from 87.89 to 90.78% among Phyllanthus genotypes, which can be recommended as a source of natural antioxidants to help combat various diseases.

Correlation among different physico-biochemical characters

Information about the relationships among the important characters of the accessions can be obtained by analysing correlation coefficients. Pearson correlation analysis revealed significant (p < 0.05) positive and negative relationships between the characters. The results of the correlation analysis between the different physico-biochemical parameters are displayed in Fig. 9. In the current study, the correlation coefficients ranged from -0.82 to 0.98 (Fig. 9). Leaf length showed a highly significant positive correlation with leaf area (r = 0.82), leaf perimeter (r = 0.80) and total flavonoids (r = 0.77), and a strong negative correlation with moisture content (r = -0.73). However, leaf length did not show any correlation with seed weight (r = 0.00). Leaf breadth exhibited a highly positive correlation with leaf area (r = 0.77) and leaf perimeter (r = 0.78). Leaf area exhibited a highly positive correlation with leaf perimeter (r = 0.98), total chlorophyll (r = 0.86), fruit weight (r = 0.82), pulp weight (r = 0.8), ascorbic acid content (r = 0.79), total phenols (r = 0.8) and total flavonoids (r = 0.82). As displayed in Fig. 9, there was also a substantial correlation among the fruit physico-biochemical characters. Fruit weight was positively correlated with fruit diameter (r = 0.87), fruit length (r = 0.93), pulp weight (r = 1.00), seed length (r = 0.87) and TSS (r = 0.92). TSS was positively correlated with leaf area (r = 0.79), total chlorophyll (r = 0.83), fruit weight (r = 0.92), seed length (r = 0.81), total carbohydrates (r = 0.80) and total proteins (r = 0.75). Dangi et al.42 also reported similar correlation coefficients among different quantitative traits of sweet cherry cultivars. Positive correlations of leaf length with leaf width, leaf area and TSS, and a negative correlation with acidity, have also been reported40. In the present study, fruit weight and fruit diameter were positively correlated with fruit length, suggesting that heavier fruits typically have larger dimensions. The highly significant positive correlation of fruit weight with fruit dimensions indicates that fruit weight is strongly influenced by the length and width of the fruit41. Our study agrees with the findings of Dangi et al.42, who reported a highly positive correlation of fruit weight with fruit length and diameter. Ganopoulos et al.43, Khadvi et al.44 and Srivastava et al.45 also reported a highly positive correlation of fruit weight with fruit length and diameter.
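
The Pearson coefficients discussed above can be reproduced from paired accession means. The following sketch, in plain Python, computes r and derives the smallest |r| that is significant at p < 0.05 for 20 accessions from the tabulated two-tailed critical t for 18 degrees of freedom (2.101). The function names and the five data points are hypothetical and do not come from the study's tables.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length (at least 3)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def critical_r(t_crit, n):
    """Smallest significant |r|, given the critical t for n - 2 df.

    Follows from t = r * sqrt(n - 2) / sqrt(1 - r**2), solved for r.
    """
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)


# 20 accessions give 18 df; 2.101 is the tabulated two-tailed t at p = 0.05.
print(f"|r| must exceed {critical_r(2.101, 20):.3f}")

# Hypothetical accession means, for illustration only.
fruit_weight_g = [3.3, 3.9, 4.4, 4.9, 5.1]
pulp_weight_g = [2.8, 3.4, 3.9, 4.4, 4.6]
moisture_pct = [91.0, 89.0, 86.5, 83.0, 81.5]
print(f"fruit weight vs pulp weight: r = {pearson_r(fruit_weight_g, pulp_weight_g):.2f}")
print(f"fruit weight vs moisture:    r = {pearson_r(fruit_weight_g, moisture_pct):.2f}")
```

With 20 accessions, the threshold of roughly 0.44 means that the coefficients of 0.73 and above reported here are well inside the significant range, while values near zero, such as leaf length against seed weight, are not.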

Fig. 9 Correlation matrix among different physical and biochemical characteristics.

Carotenoids showed positive associations with the majority of fruit characteristics, suggesting that specific biochemical traits could be linked to the dimensions of fruits and seeds. Certain fruit quality indicators, such as total sugar and TSS, were positively correlated with one another, indicating that their levels may be interdependent. Moisture content and acidity showed strong negative correlations with several fruit characteristics, indicating an inverse association. Some variables, such as seed weight and juice content, exhibited only weak correlations with most of the other fruit characteristics. The correlations between fruit characteristics and biochemical properties suggest complex interrelationships that may be influenced by genetic factors, environmental conditions and cultivation practices. Certain traits, such as fruit weight and size, appear to be closely related, indicating potential genetic control over these attributes. Understanding these correlations can guide breeders in improving fruit quality.

Principal component analysis (PCA) of different physico-biochemical characteristics

Principal component analysis (PCA) is used to determine the association between different qualities within the sub-sets and the link between genotypes46. It stands out as a vital tool for assessing and categorizing genotypes by identifying and quantifying crucial traits, as it simplifies complex physico-biochemical data and provides insight into the variables. PCA permits dimensionality reduction, data exploration to find relationships between objects, estimation of the correlation structure of the variables, and investigation of how many components (linear combinations of the original features) are necessary to explain the greater part of the variance with a minimum loss of information47. It has been used to evaluate various comprehensive traits of pomegranates, bael, sweet cherry, Prunus species and Garcinia pedunculata31,47,48,49,50,51.
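
To make the idea of a component as a linear combination of traits concrete, the sketch below extracts only the first principal component of a small trait table. It standardizes each trait, builds the correlation matrix and finds the leading eigenvalue and loadings by power iteration. This is an illustrative simplification in plain Python, not the Origin 2024 procedure used in the study, and the function names and the data are hypothetical.

```python
import math


def standardize(column):
    """Centre a trait on its mean and scale it by its sample SD."""
    n = len(column)
    m = sum(column) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in column) / (n - 1))
    return [(v - m) / sd for v in column]


def correlation_matrix(columns):
    """Correlation matrix of the traits (one list of values per trait)."""
    z = [standardize(c) for c in columns]
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]


def leading_component(matrix, iterations=500):
    """Power iteration: largest eigenvalue and its unit-length loadings."""
    p = len(matrix)
    vec = [1.0 / (i + 1) for i in range(p)]  # arbitrary non-zero start
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    # Rayleigh quotient gives the eigenvalue for the converged vector.
    mv = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(a * b for a, b in zip(vec, mv))
    return eigenvalue, vec


# Hypothetical means for five accessions and three traits.
traits = {
    "fruit_weight": [3.3, 3.9, 4.4, 4.9, 5.1],
    "tss": [6.8, 7.5, 8.1, 8.8, 9.2],
    "moisture": [91.0, 89.0, 86.5, 83.0, 81.5],
}
eigenvalue, loadings = leading_component(correlation_matrix(list(traits.values())))
print(f"PC1 explains {eigenvalue / len(traits) * 100:.1f}% of the variance")
for name, loading in zip(traits, loadings):
    print(f"{name:>12}: {loading:+.2f}")
```

Because the traits are standardized, the eigenvalues sum to the number of traits, so an eigenvalue divided by that number is the share of variance a component explains. Traits that rise together load with the same sign, while a trait that falls as the others rise, such as moisture in this toy data, loads with the opposite sign.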

In our study, PCA yielded 20 principal components (PCs), as shown in Fig. 10. The 5 PCs with eigenvalues greater than 1 were considered for interpretation, as their cumulative contribution rate was 86.66% (Table 6), which explains most of the variation in the physico-biochemical characteristics of the Phyllanthus acidus accessions. Among the 5 PCs, PC1 accounted for 67.16% of the variability, PC2 for 6.68%, PC3 for 5.05%, PC4 for 4.29% and PC5 for 3.48%. PC1 exhibited a significant and positive correlation with most of the physico-biochemical characters. It showed the highest positive loadings for fruit characters such as fruit volume (0.975), fruit weight (0.974), pulp weight (0.968) and TSS (0.940). However, it was negatively correlated with duration of flowering, moisture content and acidity. PC2 had significant positive loadings for leaf length, leaf area, leaf perimeter, ascorbic acid, total sugar, total flavonoid and seed weight, which were not prominently featured in PC1, whereas it was negatively correlated with fruit and seed physical characters, pulp seed ratio, acidity and carotenoid content. The biochemical characters of Phyllanthus acidus, viz. total proteins, carotenoids and antioxidant activity, were highly positively loaded on PC3, which explained 5.05% of the variance. PC4 explained 4.29% of the total variance; leaf breadth, leaf area, specific gravity and seed weight showed higher positive loadings, while moisture, total sugar, total protein and antioxidant activity exhibited higher negative loadings. PC5 was positively loaded with duration of flowering, seed weight, seed volume, total sugars and carotenoids. Former studies utilized PCA as a tool for characterizing and evaluating germplasms40,44,52.
In our earlier research, we also reported similar variability in Aegle marmelos and Garcinia pedunculata, which is consistent with the results of the present study31,47. Dangi et al.42 also observed variation among sweet cherry accessions, illustrated by scree plots and scatter plots along with PCA biplots that explained 40% of the total variation. The presence of characteristics such as leaf area, fruit weight, pulp weight, TSS and pulp seed ratio within the PCs highlights their importance as essential quantitative variables. It also emphasizes their role in capturing the variability among Phyllanthus acidus accessions and in understanding their characteristics and quality.
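The component-selection rule described above (retaining PCs with eigenvalues greater than 1 and reporting their cumulative contribution) can be sketched as follows. This is a minimal illustration rather than the authors' workflow. It assumes PCA on the correlation matrix of standardized traits, and the function name `pca_kaiser` and the simulated data are hypothetical. Only the five reported percentages are taken from the text.

```python
import numpy as np


def pca_kaiser(data):
    """PCA on the correlation matrix with the eigenvalue > 1 retention rule.

    data: 2-D array, rows = accessions, columns = traits.
    Returns eigenvalues, explained-variance ratios, loadings, scores
    and the number of components whose eigenvalue exceeds 1.
    """
    X = np.asarray(data, dtype=float)
    # Standardize each trait so that traits on different scales are comparable.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # ascending order
    order = np.argsort(eigvals)[::-1]            # re-sort to descending
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    # Loadings: correlation of each trait with each component (sign is arbitrary).
    loadings = eigvecs * np.sqrt(eigvals)
    scores = Z @ eigvecs
    n_retained = int((eigvals > 1.0).sum())
    return eigvals, explained, loadings, scores, n_retained


# Simulated data: 20 accessions, 6 traits, four of them driven by one latent factor.
rng = np.random.default_rng(0)
size = rng.normal(size=20)
demo = np.column_stack([
    size + 0.1 * rng.normal(size=20),
    size + 0.1 * rng.normal(size=20),
    size + 0.2 * rng.normal(size=20),
    -size + 0.2 * rng.normal(size=20),
    rng.normal(size=20),
    rng.normal(size=20),
])
eigvals, explained, loadings, scores, n_retained = pca_kaiser(demo)

# Cumulative contribution of the five retained PCs, using the percentages reported in the text.
reported_pct = [67.16, 6.68, 5.05, 4.29, 3.48]
cumulative_pct = np.cumsum(reported_pct)
```

The last element of `cumulative_pct` reproduces the 86.66% cumulative contribution quoted for the five retained PCs. The eigenvalues, loadings and scores in the sketch belong to the simulated data only.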

Fig. 10 Scree plot showing the eigenvalues and principal component (PC) numbers resulting from the PC analysis.

Table 6 Principal components (PC) loadings of the P. acidus Skeels accessions.

The scatter biplot (Fig. 11) separates the accessions and characters along PC1 and PC2. It is used to recognize and contrast accessions with significantly diverse traits, which may serve as valuable parental candidates in breeding initiatives. The biplot shows that, on the PC1 axis, accessions PAS-6 and PAS-15 had the highest positive scores while PAS-10 had the largest negative score. On the PC2 axis, PAS-14 exhibited the highest positive score, whereas PAS-3 lay at the opposite end. In Fig. 11, 3 accessions had positive values on both PC1 and PC2, namely PAS-14, PAS-9 and PAS-6, showing their distinctiveness, as the variability is proportional to the vectors. Moreover, these accessions are associated with fruit weight, fruit length, fruit volume, pulp: seed ratio and TSS. Nine accessions fell in the quadrant with negative PC1 and positive PC2. Six accessions, namely PAS-3, PAS-11, PAS-5, PAS-10, PAS-17 and PAS-2, had negative values on both PC1 and PC2. In the region where PC1 is positive and PC2 is negative, PAS-1 and PAS-20 are notably distinct from the other accessions.
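The quadrant-wise reading of the biplot can be sketched as a simple grouping of accessions by the signs of their PC1 and PC2 scores. The accession labels and scores below are hypothetical placeholders, not values from Fig. 11, and the helper names are illustrative.

```python
def quadrant(pc1, pc2):
    """Label the biplot quadrant from the signs of the PC1 and PC2 scores."""
    horizontal = "PC1+" if pc1 >= 0 else "PC1-"
    vertical = "PC2+" if pc2 >= 0 else "PC2-"
    return f"{horizontal}/{vertical}"


def group_by_quadrant(scores):
    """Map each quadrant label to the list of accessions that fall in it."""
    groups = {}
    for name, (pc1, pc2) in scores.items():
        groups.setdefault(quadrant(pc1, pc2), []).append(name)
    return groups


# Hypothetical (PC1, PC2) scores for four placeholder accessions.
demo_scores = {
    "ACC-A": (4.2, 1.1),
    "ACC-B": (-3.0, 0.8),
    "ACC-C": (-1.5, -2.0),
    "ACC-D": (2.7, -0.9),
}
groups = group_by_quadrant(demo_scores)
```

Accessions in the same quadrant share the direction of their scores on both axes, which is why the text discusses them together when relating them to the trait vectors.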

Fig. 11 Scatter biplot of the accessions on the first and second principal components based on physico-biochemical characteristics.

Cluster analysis

Cluster analysis is the process of grouping a set of data into clusters based on how similar or dissimilar individuals are to one another. In contrast to PCA, all attributes are used equally in this analysis. Cluster analysis serves as a key tool for discerning genetic diversity within a crop population, as it identifies distinct groups based on genetic similarities. In the present investigation, cluster analysis was performed among the accessions using Ward's method based on Euclidean distance in Origin 2024 learner edition (Fig. 12). The resulting dendrogram groups the accessions according to their differences and captures the range of variation within the Phyllanthus acidus population. It divided the accessions into 3 major clusters (I, II and III). Cluster I includes 6 accessions (PAS-1, PAS-14, PAS-20, PAS-6, PAS-9, PAS-19). Cluster II comprises 9 accessions in two smaller clusters of 4 and 5 accessions, viz. PAS-2, PAS-5, PAS-10, PAS-17 and PAS-3, PAS-13, PAS-7, PAS-15, PAS-12, respectively. Cluster III contains 5 accessions (PAS-4, PAS-11, PAS-8, PAS-16, PAS-18). Rai and Misra53 and Debbarma and Hazarika31 likewise classified bael genotypes into three clusters, attributing such patterns to differences in genetic constitution and enduring environmental influences. Studies on pomegranate48 and walnut54 have advocated the application of cluster analysis to assess variability among accessions. The cluster dendrogram shows that the populations manifested notable variability. In the current investigation, variability was also discerned across all physico-biochemical traits, indicating a substantial degree of phenotypic polymorphism among the accessions.
This observation underscores the existence of diverse morphotypes at the individual accession level, presenting abundant opportunities for obtaining desirable trait combinations while developing specific cultivars of Phyllanthus acidus. Such findings are of pivotal importance for breeding programmes that aim to mainstream this lesser-known fruit of north-east India and to widen the fruit basket available to consumers.
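Hierarchical clustering with Ward's method on Euclidean distance can be sketched as follows. The study used Origin 2024; this illustration uses SciPy instead and assumes that traits are standardized before clustering, which the text does not state. The function name `ward_clusters` and the simulated data are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def ward_clusters(data, n_clusters=3):
    """Ward's linkage on Euclidean distance, cut into n_clusters groups.

    data: 2-D array, rows = accessions, columns = traits.
    Returns the linkage matrix (for a dendrogram) and one cluster label per row.
    """
    X = np.asarray(data, dtype=float)
    # Assumption: standardize traits so that each contributes equally to the distance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    tree = linkage(Z, method="ward", metric="euclidean")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return tree, labels


# Simulated data: three well-separated groups of five accessions, four traits each.
rng = np.random.default_rng(1)
centers = np.array([0.0, 5.0, 10.0])
demo = np.vstack([c + 0.3 * rng.normal(size=(5, 4)) for c in centers])
tree, labels = ward_clusters(demo, n_clusters=3)
```

The linkage matrix `tree` can be passed to `scipy.cluster.hierarchy.dendrogram` to draw a figure comparable to a cluster map, and `labels` gives the membership obtained by cutting the tree into three clusters.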

Fig. 12 Cluster map of Phyllanthus acidus Skeels accessions using Ward's method based on Euclidean distance.

Conclusion

A genetic variability study is an initial step towards characterizing and conserving precious germplasm for future breeding programmes. Identification of promising Phyllanthus acidus accessions based on genetic variability and on phenological and pomological characteristics will be useful to harness the economic advantage associated with this valuable crop. Consumers prefer fruits with maximum fruit weight, length and breadth, maximum pulp weight, pulp-to-peel ratio, juice content, ascorbic acid and TSS, low acidity, and higher antioxidant activity and sugar acid ratio. The Phyllanthus acidus accessions in the present investigation exhibited significant variability in the studied physico-biochemical characters, thereby offering a valuable resource for inclusion in breeding programmes and for mainstreaming the crop. From the results, it can be concluded that 7 of the studied accessions, viz. PAS-14, PAS-9, PAS-6, PAS-1, PAS-20, PAS-3 and PAS-18, showed greater variability together with the desired horticultural characteristics; they can be considered elite accessions and used as potential parents in future breeding programmes. Expanding the cultivation of these Phyllanthus acidus accessions holds great promise for enhancing local nutrition and well-being, with a particular focus on their unique potential in traditional healthcare. However, further studies are necessary to understand the underlying molecular mechanisms that control these morphological characters. Such studies could facilitate the identification of key genes involved in morphological traits and enable the development of molecular markers for selecting desirable genotypes. Additionally, further research could investigate the relationship between morphological traits and fruit quality attributes.