Introduction

Orah mandarin, a widely consumed citrus fruit, is predominantly evaluated for quality based on two critical parameters: soluble solids content (SSC) and moisture content (MC). SSC influences the sweetness of the fruit and serves as a vital indicator of fruit maturity. MC is closely associated with the freshness and shelf life of the fruit1. Traditional detection methods, such as refractometry, are often inefficient and may damage samples, leading to increasing interest in spectroscopic analysis2. Spectroscopic techniques offer rapid and non-destructive methods for assessing the internal quality of citrus fruits. Various technologies, including infrared spectroscopy3,4, Raman spectroscopy5,6, fluorescence spectroscopy7,8, terahertz spectroscopy9,10, and nuclear magnetic resonance (NMR)11,12, have been explored for this purpose. Among them, near-infrared (NIR) spectroscopy is the most widely used technique for efficiently analyzing a range of chemical components (such as sugars and acidity) in fruits, including apples13, pears14, navel oranges15, and cherries16, without damaging the samples. The other spectroscopic techniques have seen limited application because of their susceptibility to interference from factors such as fluorescence, sample moisture content, and temperature, as well as high equipment costs. NIR spectroscopy evaluates fruit quality by measuring the overtone and combination absorption bands of hydrogen-containing groups X-H (such as C-H, N-H, and O-H), combined with chemometric methods that establish a mapping between spectral information and the target quality indicators. Tian et al.17 developed an effective SSC prediction model using a portable Vis/NIR spectrometer and the SNV-SPA-PLS algorithm. Luo et al.18 applied hyperspectral imaging (HSI) to predict SSC in Nanfeng mandarin, using preprocessing techniques such as MSC and SG, and then built models using PLSR and LSSVM.
Nevertheless, this technology presents several challenges: (1) The modeling process lacks transparency, and improvements are needed to enhance the model’s stability and adaptability, as it is highly sensitive to variations in sample diversity and maturity19, as well as instrument performance20 and environmental conditions21. (2) During light propagation through turbid tissues, both absorption and scattering occur simultaneously. Spectroscopy methods based on the Lambert-Beer law primarily consider light absorption within tissues, neglecting the effects of scattering, which leads to elevated detection error rates22. (3) The spectral signals obtained through spectroscopy represent a combined effect of tissue absorption and scattering, without distinguishing between these two phenomena. The precise impact of scattering on absorption at different wavelengths, and its subsequent influence on spectral data and model variations, remains unclear. As a result, despite significant advancements in spectroscopy for non-destructive quality assessment of agricultural products23, a comprehensive understanding of the underlying optical detection mechanisms remains lacking.

In fact, the propagation of light in fruits involves both absorption and scattering24. The interactions between light and fruits can be characterized by the optical properties of the tissue, which include the absorption coefficient (μa) and the reduced scattering coefficient (μ′s)25. The absorption coefficient is related to the chemical composition of the tissue, while the reduced scattering coefficient is related to its microstructure26,27. Many studies have explored using optical properties to assess the quality of fruits. For example, Fang et al.28 measured the optical properties of apples within the 905 nm to 1650 nm range, investigated the relationship between these properties and SSC, and developed an SSC prediction model based on the absorption spectrum. This model achieved a determination coefficient (R²) of 0.833 and a root mean square error (RMSE) of 0.329. In contrast, the prediction model based on the reduced scattering spectrum performed slightly worse, with an R² of 0.726 and an RMSE of 0.722. Similarly, Ma et al.29 measured the optical properties of peaches in the 400 nm to 1050 nm range and analyzed the relationship between these properties and peach flesh hardness. They found that the hardness prediction model based on spectral data outperformed the model based on the absorption spectrum, achieving an R² of 0.877 and an RMSE of 4.172. Additionally, Y. Liu et al.30 measured the optical properties of nectarines within the 400 nm to 1050 nm range and analyzed their relationship with SSC based on absorption and reduced scattering. Using Partial Least Squares (PLS) and one-dimensional convolutional neural network (1D-CNN) models, they determined that the 1D-CNN model based on absorption performed best in predicting SSC, with an R² of 0.938 and an RMSE of 0.446.
There have been numerous reports on the use of visible-near infrared spectroscopy to determine the internal components of citrus fruits, including SSC31 and pH32. Sun et al.33 measured the optical properties of different citrus tissues (flavedo, albedo, and juice vesicles) in various fruit types (grapefruit, orange, lemon, and lime), demonstrating a correlation between the overall optical properties and different quality attributes. However, to the best of our knowledge, there has been no quantitative analysis of the relationship between optical properties and SSC and MC at different wavelengths. Such fundamental research is crucial for elucidating the role of optical techniques in assessing SSC and provides important insights for improving the accuracy and adaptability of predictive models.

The primary goal of this study is to elucidate the quantitative relationship between the optical absorption and scattering properties of the “Wuming” Orah mandarin during storage and its SSC and MC, and to clarify the rationale for using optical techniques to detect SSC. Specifically, the study aims to: (1) measure μa and μ′s of Orah mandarin tissues in the 500–1050 nm range using a single integrating sphere (SIS) system; (2) quantify the relationship between μa, μ′s, and SSC and MC at various wavelengths; and (3) construct predictive models for SSC and MC.

Materials and methods

Samples

The samples were purchased from Guangxi Beiliu Xunwei Agriculture Company in April 2024 and then sent to the laboratory via express delivery. Before the experiment, the samples were stored at a temperature of 20 ± 1 °C and a relative humidity of 90 ± 5% for 24 h to ensure that the temperature of the fruit matched the ambient temperature, thereby minimizing the impact of temperature on the experimental results. For the experiment, 160 samples with an equatorial diameter ranging from 60 to 70 mm, all uniform and without external defects, were selected. The samples were stored in a constant temperature and humidity chamber (25 °C, RH > 85%) for 6 days, and 40 samples were randomly selected for measurement every 2 days.

The sample preparation method followed the study by Sun et al.33, as shown in Fig. 1. Around the equator, from the outermost to the innermost layers, a slicer was used to sequentially remove slices of the exocarp measuring approximately 3 cm × 3 cm × 0.1 cm (W × H × D) from the whole fruit (marked as 1 in Fig. 1) and slices of the endocarp of roughly the same size (marked as 2 in Fig. 1). The thickness (D) of the slices was precisely measured with a micrometer screw gauge. For the juice vesicle section, several segments were first processed through a stainless steel press to prepare the juice, with the extracted juice stored in a container for measuring SSC. Additionally, a few drops of the squeezed juice were transferred to a glass vessel with a space thickness of 5 mm, and individual separated juice vesicles were then gently placed one by one into the juice using tweezers (marked as 3 in Fig. 1). The juice filled the gaps and pores between the juice vesicles to avoid air bubbles and to maintain the original shape of the juice vesicles.

Fig. 1

Preparation process of Orah mandarin tissue sample.

Optical properties measurement

Spectral acquisition system

The optical properties of the pulp tissue were obtained with a self-built SIS system combined with the inverse adding-doubling (IAD) method. The SIS system includes a computer, an integrating sphere, a spectrometer, a light source, and a stage, as shown in Fig. 2. The integrating sphere (4P-GPS-033-SL, Labsphere, USA) has a diameter of 83.82 mm, and its inner wall is coated with a reflective paint with a reflectivity of 98%, allowing for uniform dispersion of light. Two ports with a diameter of 25.4 mm are located 180° apart on the sphere’s equator to serve as the light entrance and exit. The port cover (PP-100-SL), used to close the exit port of the integrating sphere, is also coated with reflective paint. A 1000 μm diameter optical fiber transmits the signal to the spectrometer (QE65PRO, Ocean Optics, USA), which collects the spectral data. The light source is a high-power 24 V halogen lamp with an output power of 150 W.

Fig. 2

Illustration of the SIS system: (a) transmittance mode and (b) reflectance mode. The 3D illustration was generated using PTC Creo Parametric 8.0 (https://www.ptc.com/en/products/creo).

System validation tests

Water is considered to be a purely absorbing liquid, and Intralipid-20% is a stable emulsion with a uniform particle size distribution. Therefore, the two are commonly used as the absorber and the scatterer, respectively, to verify the accuracy of integrating sphere measurement systems. The absorption coefficient of pure water and the reduced scattering coefficient of a 1% concentration of Intralipid-20% solution were determined and compared with values reported in the literature to verify the accuracy of the system. The reference values are from the empirical data of Bukata et al.34 and Van et al.35, and the reference formulas are as follows:

$${\mu _{sref}}(\lambda )=0.016{(\lambda /1000)^{ - 2.4}}$$
(1)
$${g_{ref}}(\lambda )=1.1 - 0.58\lambda /1000$$
(2)
$$\mu _{{sref}}^{\prime }(\lambda )={\mu _{sref}}(1 - {g_{ref}})C\%$$
(3)
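As a rough illustration of how Eqs. (1)–(3) can be evaluated, the following sketch computes the reference reduced scattering coefficient at a given wavelength. The function name `intralipid_reduced_scattering` and the parameter `c_factor` are hypothetical; in particular, how the factor C% is entered numerically (for example 1 for a 1% dilution) is an assumption, since the text does not specify it.

```python
def intralipid_reduced_scattering(wavelength_nm, c_factor=1.0):
    """Reference reduced scattering coefficient following Eqs. (1)-(3).

    wavelength_nm : wavelength in nanometres (divided by 1000 as in the equations)
    c_factor      : numerical value used for C% in Eq. (3); the convention
                    (1 for a 1% dilution) is an assumption of this sketch
    """
    lam = wavelength_nm / 1000.0
    mu_s_ref = 0.016 * lam ** -2.4          # Eq. (1)
    g_ref = 1.1 - 0.58 * lam                # Eq. (2)
    return mu_s_ref * (1.0 - g_ref) * c_factor  # Eq. (3)


if __name__ == "__main__":
    for wl in (500, 665, 800, 1050):
        print(wl, intralipid_reduced_scattering(wl))
```

The reference curve obtained this way decreases with wavelength, which is the behaviour the measured μ′s values are compared against.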

Orah Mandarin optical properties measurement

Before the experiment, the system needs to be preheated for half an hour to stabilize the instruments. Reference and dark spectra are also needed to calibrate the system. To measure the reflection references, first turn off the light source and cover the output port of the integrating sphere with a port cap, then push the slide table to move the condenser lens into the integrating sphere until it cannot be pushed further, and measure the dark reflection spectrum \(R_d\); then turn on the light source and measure the bright reference reflection spectrum \(R_r\). To measure the transmission references, first turn off the light source and cover the output port of the integrating sphere with a port cap, move the sample rack holding two glass slides to the entrance of the integrating sphere, keep the condenser lens in the same position, and measure the dark transmission spectrum \(T_d\); then turn on the light source and measure the bright reference transmission spectrum \(T_r\).

When measuring the reflection spectrum, place the sample at the output port of the integrating sphere and push the slide table to move the condenser lens into the integrating sphere. At this point, the front end of the adapter should be about 10 mm from the sample rack, and the reflection spectrum \(R_s\) can be obtained. When measuring the transmission spectrum, cover the exit port of the integrating sphere and place the mandarin sample at the entrance port of the integrating sphere, keeping the condenser lens in the same position; the transmission spectrum \(T_s\) can then be obtained. Use formulas (4) and (5) to calculate the reflectance and transmittance of the samples based on their reflection and transmission spectra.

$$R_{s}=\frac{R_{s}-R_{d}}{R_{r}-R_{d}}$$
(4)
$$T_{s}=\frac{T_{s}-T_{d}}{T_{r}-T_{d}}$$
(5)
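The dark and reference correction in formulas (4) and (5) can be sketched as a single per-wavelength operation. The helper name `relative_spectrum` is hypothetical, and the intensity values in the usage lines are invented for illustration only.

```python
def relative_spectrum(sample, dark, reference):
    """Apply (sample - dark) / (reference - dark) at every wavelength.

    The same function covers formula (4) (reflectance) and formula (5)
    (transmittance); only the input spectra differ.
    """
    if not (len(sample) == len(dark) == len(reference)):
        raise ValueError("all spectra must have the same number of points")
    return [(s - d) / (r - d) for s, d, r in zip(sample, dark, reference)]


if __name__ == "__main__":
    # Illustrative raw counts at two wavelengths (not measured data).
    R_s = relative_spectrum([600.0, 1100.0], [100.0, 100.0], [1100.0, 2100.0])
    print(R_s)
```

Subtracting the dark spectrum from both numerator and denominator removes the detector offset before the ratio is taken.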

To avoid the influence of ambient light on the optical parameter values, the whole experiment was completed in a dark room. The reflectance and transmittance of each sample were measured at three points, and the average of the three measurements was used as the result for the sample. The IAD algorithm36 was used to calculate the μa and μ′s of the samples. The wavelength range of 500–1050 nm was selected for this study due to its relevance to the optical properties and components of the fruit37,38.

Internal quality measurement

After the measurement of optical properties was completed, the SSC of the juice from the same mandarin sample was measured using a digital refractometer (PAL-1, ATAGO Co. Ltd., Tokyo, Japan). The SSC measurement of each sample was repeated three times, and the average of the three measurements was taken as the result.

The MC was determined by the drying method. A small piece of tissue (2–3 g) was placed in an aluminum drying vessel, and the vessel was sealed and weighed using an electronic balance accurate to 0.001 g. After weighing, the vessel was opened, and the tissue was dried in an oven at 70 °C for 48 h until it reached a constant weight. Once dried, the tissue was removed and re-weighed using the electronic balance. The difference in mass before and after drying represents the water lost, which, when divided by the initial mass before drying, yields the tissue’s moisture content. The formula is as follows:

$${\text{MC/}}\% = \frac{{m_{1} - m_{2} }}{{m_{1} }} \times 100$$
(6)

where m1 and m2 are the masses of the sample before and after drying, respectively, in grams.
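As a quick worked instance of Eq. (6), the sketch below computes the moisture content from the two weighings. The function name and the masses in the usage line are illustrative assumptions, not measured values.

```python
def moisture_content_percent(m1, m2):
    """Eq. (6): MC (%) = (m1 - m2) / m1 * 100, with masses in grams."""
    if m1 <= 0 or m2 < 0 or m2 > m1:
        raise ValueError("expected 0 <= m2 <= m1 and m1 > 0")
    return (m1 - m2) / m1 * 100.0


if __name__ == "__main__":
    # Illustrative masses: 2.500 g before drying, 0.375 g after drying.
    print(moisture_content_percent(2.500, 0.375))
```

With these example masses the water lost is 2.125 g, which corresponds to a moisture content of about 85%.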

Sample division and modeling

Before model establishment, the samples were divided into calibration and prediction sets. The Kennard-Stone (K-S) algorithm was employed to allocate the sets at a 3:1 ratio. Tables 1 and 2 detail the number and range of samples in the calibration and prediction sets for SSC and MC, respectively. The sample data span a broad range, which improves the representativeness of the model.
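For readers unfamiliar with the K-S (Kennard-Stone) split, a minimal sketch of the usual selection rule is given here. It assumes Euclidean distance between spectra and is not the authors' implementation; the function name `kennard_stone` and the toy data are hypothetical.

```python
import math


def kennard_stone(X, n_select):
    """Pick n_select calibration samples; return (selected, remaining) indices.

    X is a list of feature vectors (e.g. spectra). Assumes Euclidean distance.
    """
    n = len(X)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")
    dist = [[math.dist(a, b) for b in X] for a in X]
    # Start with the two samples that are farthest apart.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: dist[ij[0]][ij[1]],
    )
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in (i0, j0)]
    while len(selected) < n_select:
        # Add the sample whose nearest already-selected neighbour is farthest away.
        best = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected, remaining


if __name__ == "__main__":
    X = [[0.0], [1.0], [2.0], [10.0]]           # toy one-feature "spectra"
    calibration, prediction = kennard_stone(X, 3)  # 3:1 split of four samples
    print(calibration, prediction)
```

Because each new sample is chosen to be as far as possible from those already selected, the calibration set tends to cover the full range of the data, and the samples left over form the prediction set.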

Table 1 Division of the SSC sample set.
Table 2 Division of the MC sample set.

Partial Least Squares Regression (PLSR) is a widely used multivariate linear calibration technique for establishing regression models, particularly in visible/near-infrared spectroscopy analysis39. The method handles both the independent variable matrix X (spectral matrix) and the dependent variable matrix Y (concentration matrix). By compressing and extracting the information in the independent and dependent variables, it establishes a linear relationship between latent variables, which enables dimensionality reduction and predictive analysis. In this study, the independent variables are the optical property parameters and the dependent variables are SSC and MC. PLSR projects the optical property data onto latent variables (LVs), and the number of LVs is chosen by methods such as simple cross-validation to avoid underfitting or overfitting.
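To make the role of the latent variables concrete, the following is a minimal sketch of single-response PLS (the NIPALS-style PLS1 variant) with a simple interleaved k-fold cross-validation for comparing numbers of LVs. It uses only the Python standard library, all function names are hypothetical, and it is not the software used in this study.

```python
def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X, y, n_components):
    """Fit PLS1 by successive deflation; returns (x_mean, y_mean, components)."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centred X
    f = [v - y_mean for v in y]                                # centred y
    components = []
    for _ in range(n_components):
        w = [_dot([row[j] for row in E], f) for j in range(m)]  # weights ~ X^T y
        norm = _dot(w, w) ** 0.5
        if norm == 0.0:
            break  # nothing left to explain
        w = [v / norm for v in w]
        t = [_dot(row, w) for row in E]                          # scores
        tt = _dot(t, t)
        p = [_dot([row[j] for row in E], t) / tt for j in range(m)]  # X loadings
        c = _dot(f, t) / tt                                      # y loading
        E = [[row[j] - ti * p[j] for j in range(m)] for row, ti in zip(E, t)]
        f = [fi - c * ti for fi, ti in zip(f, t)]
        components.append((w, p, c))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict one sample by repeating the deflation steps on it."""
    x_mean, y_mean, components = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, c in components:
        t = _dot(e, w)
        y_hat += c * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat


def cv_rmse(X, y, n_components, n_folds=5):
    """Cross-validated RMSE using interleaved folds (an assumed fold scheme)."""
    n = len(y)
    sq_err = 0.0
    for k in range(n_folds):
        test_idx = set(range(k, n, n_folds))
        X_tr = [X[i] for i in range(n) if i not in test_idx]
        y_tr = [y[i] for i in range(n) if i not in test_idx]
        model = pls1_fit(X_tr, y_tr, n_components)
        sq_err += sum((pls1_predict(model, X[i]) - y[i]) ** 2 for i in test_idx)
    return (sq_err / n) ** 0.5
```

In practice, `cv_rmse` would be evaluated for an increasing number of LVs and the smallest number giving a low cross-validated error would be retained, which is one common way to guard against both underfitting and overfitting.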

Data analysis

Spectral preprocessing can improve model performance. Factors such as equipment vibration, noise, and the external environment can affect the accuracy of the optical information, so that it contains information unrelated to the sample and exhibits drift, translation, and noise. These effects are commonly reduced by spectral preprocessing methods. In this study, three spectral preprocessing algorithms, Multiplicative Scatter Correction (MSC), Standard Normal Variate (SNV), and Baseline Correction, were implemented using ‘Unscrambler V9.7’ (CAMO PROCESS AS, Oslo, Norway). SNV and MSC have a similar function of reducing physical variability between samples due to scattering40. Baseline correction is commonly used to remove baseline drift from a spectrum, making the spectrum more accurate and reliable41. One-way ANOVA (p < 0.05) and Pearson correlation analysis (p < 0.05) were conducted in IBM SPSS Statistics 26 software (IBM Corporation, Armonk, USA).
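The three preprocessing steps can be sketched in a few lines. This is an illustration of the textbook definitions, not the Unscrambler implementation: SNV is taken with the sample standard deviation, MSC regresses each spectrum on the mean spectrum, and the baseline step is assumed to be a simple offset correction (subtracting the minimum), since the exact variant used is not stated.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard Normal Variate: centre and scale one spectrum by its own statistics."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(v - m) / s for v in spectrum]


def msc(spectra):
    """Multiplicative Scatter Correction against the mean spectrum of the set."""
    ref = [mean(col) for col in zip(*spectra)]
    ref_mean = mean(ref)
    sxx = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for spec in spectra:
        spec_mean = mean(spec)
        # Least-squares fit: spec ~ intercept + slope * ref
        slope = sum((r - ref_mean) * (v - spec_mean) for r, v in zip(ref, spec)) / sxx
        intercept = spec_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in spec])
    return corrected


def baseline_offset(spectrum):
    """Baseline offset correction (assumed variant): subtract the minimum value."""
    lowest = min(spectrum)
    return [v - lowest for v in spectrum]


if __name__ == "__main__":
    print(snv([1.0, 2.0, 3.0]))
    print(msc([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]))
    print(baseline_offset([3.0, 5.0, 4.0]))
```

SNV works on each spectrum independently, whereas MSC depends on the reference spectrum of the whole set, so the MSC reference computed on the calibration set would normally be reused for the prediction set.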

Model evaluation

The model performance evaluation indexes include the calibration set correlation coefficient (Rc), the prediction set correlation coefficient (Rp), the calibration set root-mean-square error (RMSEC), and the prediction set root-mean-square error (RMSEP). The formulae of the indexes are as follows:

$${R_c}=\sqrt {\sum\nolimits_{{i=1}}^{{n_c}} {{{({y_{ci}} - {y_{mi}})}^2}} } /\sqrt {\sum\nolimits_{{i=1}}^{{n_c}} {{{({y_{ci}} - {y_{meanc}})}^2}} }$$
(7)
$${R_p}=\sqrt {\sum\nolimits_{{i=1}}^{{n_p}} {{{({y_{pi}} - {y_{mi}})}^2}} } /\sqrt {\sum\nolimits_{{i=1}}^{{n_p}} {{{({y_{pi}} - {y_{meanp}})}^2}} }$$
(8)
$$RMSEC=\sqrt {\frac{1}{{{n_c}}}\sum\nolimits_{{i=1}}^{{n_c}} {{{({y_{ci}} - {y_{mi}})}^2}} }$$
(9)
$$RMSEP=\sqrt {\frac{1}{{{n_p}}}\sum\nolimits_{{i=1}}^{{n_p}} {{{({y_{pi}} - {y_{mi}})}^2}} }$$
(10)

where ymi denotes the measured value of the ith sample; yci and ypi denote the predicted values of the ith sample in the calibration and prediction sets, respectively; ymeanc and ymeanp are the average measured values in the calibration and prediction sets, respectively; and nc and np are the numbers of samples in the calibration and prediction sets.
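The root-mean-square errors in Eqs. (9) and (10) share one form and differ only in which sample set is passed in, as the short sketch below shows. The function name `rmse` and the example values are illustrative assumptions.

```python
import math


def rmse(measured, predicted):
    """Root-mean-square error between measured and predicted values.

    Called with calibration-set values it gives RMSEC (Eq. 9); called with
    prediction-set values it gives RMSEP (Eq. 10).
    """
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - m) ** 2 for m, p in zip(measured, predicted))
    return math.sqrt(squared / len(measured))


if __name__ == "__main__":
    # Illustrative SSC values in °Brix (not data from this study).
    measured = [12.0, 13.0, 14.0]
    predicted = [12.5, 12.5, 14.0]
    print(rmse(measured, predicted))
```

Both indexes are expressed in the unit of the predicted quantity, so an RMSEP for SSC is read in °Brix and one for MC in percentage points.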

Results and discussion

System validation results

The detection accuracy of the SIS system was verified by measuring the optical properties of water and Intralipid solution, as shown in Fig. 3. In the range of 400–1050 nm, the absorption coefficient μa shows a significant O-H absorption peak at 980 nm and two small absorption peaks at 740 nm and 840 nm. In the range of 800–1050 nm, the absorption coefficient agrees with the values measured by Zhang et al.42 and Rowe et al.43 Significant baseline drift was observed in the range of 400–800 nm, which may be due to the loss of direct and diffuse light at the sample edge, resulting in higher absorption coefficient measurements44. The reduced scattering coefficient μ′s exhibits Mie scattering characteristics, decreasing with increasing wavelength (Bulletin, 2015). The measured values of μ′s in the range of 400–665 nm show significant errors compared with the reference values. The relative error of μ′s ranges from 0.22% to 12.29% over 400–1050 nm, with an average relative error of 7.04%, which is lower than the 11.1% error reported by Rowe et al.43 This error may be due to differences in the particle size distribution of Intralipid-20% solutions from different manufacturers45.
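The relative error figures quoted above can be obtained from the measured and reference curves with a computation of the following kind. This is a sketch of the standard definition of relative error; the function names and the numbers in the usage lines are illustrative and are not the validation data.

```python
def relative_errors_percent(measured, reference):
    """Per-wavelength relative error |measured - reference| / |reference| in percent."""
    if len(measured) != len(reference):
        raise ValueError("inputs must have the same length")
    return [abs(m - r) / abs(r) * 100.0 for m, r in zip(measured, reference)]


def summarize(errors):
    """Return (minimum, maximum, mean) of a list of relative errors."""
    return min(errors), max(errors), sum(errors) / len(errors)


if __name__ == "__main__":
    # Illustrative values only.
    errs = relative_errors_percent([1.1, 0.8], [1.0, 1.0])
    print(summarize(errs))
```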

Fig. 3

Comparison of μa and μ′s measured by the SIS system with reference values.

The differences in Orah Mandarin internal qualities

Figure 4 shows the changes in SSC and MC of Orah mandarin pulp during storage. With increasing storage time, the SSC of Orah mandarin initially rises and then falls, from 13.52 °Brix to 12.81 °Brix. This trend can be attributed to the high initial SSC at the start of the experiment, which was due to the “low-temperature saccharification” phenomenon in the Orah mandarins stored under refrigeration. Subsequently, after storage at 25 °C, the “low-temperature saccharification” was reversed, leading to a decline in SSC. Variance analysis revealed significant differences (P < 0.05) between days 0 and 6. During storage, the MC of Orah mandarin remained relatively stable, rising slightly from 84.53% to 85.9%. This outcome may be explained by the negative correlation between SSC and MC: as SSC increases, MC decreases, and vice versa.

Fig. 4

Changes in SSC and MC of Orah mandarin during storage.

Differences in the optical properties of pulp tissue during storage

As shown in Fig. 5, the absorption coefficient μa and the reduced scattering coefficient μ′s of pulp tissue have the same shape and different values at different storage periods, which is consistent with the study by Sun et al.33 In the μa spectrum, the average absorption coefficient ranges from 0.018 to 0.268 mm−1, and there are two absorption peaks in the flesh tissue at 500 nm and 980 nm. The absorption peak at 500 nm is related to pigments46, while the absorption peak at 980 nm is related to water47,48. During storage, the absorption coefficients of adjacent sampling times do not differ significantly. This may be due to the fact that our experimental sample is a mixture of juice and vesicles, which makes it difficult to distinguish the changes in the absorption peak values for water in the absorption coefficient.

Compared to μa, the average reduced scattering coefficient ranges from 0.088 to 0.25 mm−1, and the μ′s spectrum shows a valley at the wavelength corresponding to strong absorption in the μa spectrum (980 nm), and only exhibits a decreasing trend with increasing wavelength from 550 to 1050 nm. When calculating optical properties using the IAD method, the light loss within the integrating sphere is not taken into account, which results in an overestimation of the absorption coefficient and an underestimation of the reduced scattering coefficient. As a result, μ′s does not decrease monotonically with wavelength over the whole measured range. During storage, the reduced scattering coefficient shows a significant change at 550 nm and exhibits a decreasing trend overall, which is consistent with the findings of Lu et al.48 and Cen et al.49.

Fig. 5

Absorption coefficient μa and reduced scattering coefficient μ′s of Orah mandarin pulp tissue during storage.

Differences in the optical properties of exocarp and endocarp tissue during storage

As shown in Fig. 6, the absorption coefficient values of Orah mandarin exocarp range from 0.068 to 2.038 mm−1, and the exocarp tissue exhibits three absorption peaks at 500 nm, 680 nm, and 980 nm. The absorption peak at 500 nm is related to the color of the exocarp, the absorption peak at 680 nm is associated with chlorophyll50, and the absorption peak at 980 nm is related to water. The μ′s values range from 0.8 to 2.1 mm−1, and there is a valley at 680 nm in μ′s, which may be caused by “crosstalk” in the integrating sphere. During storage, the reduced scattering coefficient changes significantly at 680 nm and exhibits a decreasing trend with time.

Fig. 6

Absorption coefficient μa and reduced scattering coefficient μ′s of Orah mandarin exocarp during storage.

As shown in Fig. 7, the average absorption coefficient of Orah mandarin endocarp ranges from 0.022 to 0.528 mm−1, with an absorption peak at 980 nm that is related to water. The average reduced scattering coefficient ranges from 4.9 to 8.2 mm−1, and μ′s has valleys at 500 nm and 680 nm. Although μa does not have a significant absorption peak at 680 nm, the reduced scattering coefficient of the endocarp shows a trend similar to that of the exocarp at 680 nm. This may be due to “crosstalk” in the integrating sphere, or to incomplete removal of the yellow transparent part of the exocarp during sample preparation, leading to fluctuations in the endocarp measurements at 680 nm. During storage, the reduced scattering coefficient at 680 nm changed significantly and showed a decreasing trend overall.

Fig. 7

Absorption coefficient μa and reduced scattering coefficient μ′s of Orah mandarin endocarp during storage.

Correlation analysis between optical properties and internal quality of Orah Mandarin tissues

We conducted Pearson correlation analysis between the optical properties of pulp tissue and both MC and SSC. The results are depicted in Fig. 8. For SSC, the absorption coefficient (μa) exhibits a negative correlation with SSC within the 500–1050 nm range, with the strongest negative correlation at 540 nm (correlation coefficient = -0.32), and an average correlation coefficient of -0.23 over the entire wavelength range. The reduced scattering coefficient (μ′s) shows a positive correlation with SSC within the 500–1050 nm range, peaking at 500 nm with a correlation coefficient of 0.24, and an average correlation coefficient of 0.18. The correlations between SSC and both μa and μ′s increase monotonically with wavelength beyond 540 nm, exhibiting fluctuations at 540 nm and 980 nm.

For MC, μa demonstrates a positive correlation with MC in the 500–1050 nm range, with the highest correlation at 500 nm (correlation coefficient = 0.38) and an average correlation coefficient of 0.16, with notable fluctuations at 550 nm and 980 nm, the latter of which is associated with water. The reduced scattering coefficient (μ′s) shows a negative correlation with MC within the 500–1050 nm range, with the highest negative correlation at 500 nm (correlation coefficient = -0.29) and an average correlation coefficient of -0.15, with significant fluctuation at 550 nm. The correlations between MC and both μa and μ′s decrease monotonically with increasing wavelength beyond 550 nm. Additionally, due to the strong negative correlation between SSC and MC, an increase in the correlation between optical properties and MC results in a corresponding negative increase in the correlation between optical properties and SSC.
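A wavelength-by-wavelength correlation profile of the kind summarized above can be sketched as follows: for each wavelength, the Pearson coefficient is computed between the optical property values of all samples and the corresponding quality values. The function names and the toy numbers are hypothetical, and the study itself used SPSS for this analysis.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_profile(spectra, quality):
    """One correlation coefficient per wavelength.

    spectra : list of per-sample spectra (e.g. mu_a values), all the same length
    quality : list of per-sample reference values (e.g. SSC or MC)
    """
    return [pearson(list(column), quality) for column in zip(*spectra)]


def strongest(wavelengths, profile):
    """Wavelength and coefficient with the largest absolute correlation."""
    i = max(range(len(profile)), key=lambda k: abs(profile[k]))
    return wavelengths[i], profile[i]


if __name__ == "__main__":
    # Toy data: three samples, two wavelengths (not measured values).
    spectra = [[1.0, 5.0], [2.0, 3.0], [3.0, 4.0]]
    quality = [10.0, 20.0, 30.0]
    profile = correlation_profile(spectra, quality)
    print(profile, strongest([500, 980], profile))
```

Averaging the resulting profile over all wavelengths gives the mean correlation coefficients reported in the text.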

Fig. 8

Correlation coefficients of SSC and MC of Orah mandarin pulp with μa and μ′s.

Prediction model of SSC and MC based on optical properties

To further investigate the correlation between the optical properties of Orah mandarin pulp tissue and SSC and MC, PLSR models for SSC and MC were established based on the μa spectrum and μ′s spectrum, respectively, utilizing chemometric techniques. Given the low direct correlation between the optical properties and SSC and MC, it is not feasible to develop predictive models without preprocessing the spectra of μa and μ′s. Therefore, various spectral preprocessing methods were applied to μa and μ′s, including Multiplicative Scatter Correction (MSC), Standard Normal Variate (SNV), and Baseline Correction.

Table 3 presents the modeling results for SSC of Orah mandarin based on different optical property parameters and preprocessing methods in the PLSR model. The results indicate that, regardless of the preprocessing method used, the PLSR models based on the μa spectrum achieved the highest correlation coefficients and the lowest RMSE. Among the three preprocessing methods and the unprocessed spectra, Baseline correction showed the best preprocessing performance for the μa spectrum, followed by MSC and SNV. For the μ′s spectrum, SNV provided the best preprocessing performance, followed by MSC and Baseline correction.

In terms of SSC prediction, the models based on the μa spectrum had calibration set correlation coefficients (Rc) and RMSEC ranging from 0.882 to 0.955 and 0.352 to 0.559, respectively, while the prediction set correlation coefficients (Rp) and RMSEP ranged from 0.842 to 0.921 and 0.549 to 0.754, respectively. Conversely, the models based on the μ′s spectrum performed worse, with calibration set correlation coefficients (Rc) and RMSEC ranging from 0.832 to 0.882 and 0.615 to 0.761, respectively, and prediction set correlation coefficients (Rp) and RMSEP ranging from 0.805 to 0.851 and 0.755 to 0.825, respectively. Figure 9 (a) and (b) illustrate the optimal SSC prediction models based on the μa and μ′s spectra, respectively.

Table 3 Prediction results of SSC based on models using different preprocessing methods.

Table 4 presents the modeling results for the MC of Orah mandarin based on different optical property parameters and preprocessing methods in the PLSR model. The results indicate that, regardless of the preprocessing method used, the models based on the μa spectrum achieved the highest correlation coefficients and the lowest RMSE. Among the three preprocessing methods, Baseline correction demonstrated the best preprocessing performance for the μa spectrum, followed by MSC, and finally SNV. For the μ′s spectrum, SNV provided the best preprocessing performance, followed by MSC and Baseline correction.

In terms of moisture content prediction, the models based on the μa spectrum had calibration set correlation coefficients (Rc) and RMSEC ranging from 0.855 to 0.951 and 0.403 to 0.679, respectively, while the prediction set correlation coefficients (Rp) and RMSEP ranged from 0.798 to 0.906 and 0.636 to 0.903, respectively. Conversely, the models based on the μ′s spectrum performed worse, with calibration set correlation coefficients (Rc) and RMSEC ranging from 0.801 to 0.850 and 0.677 to 0.771, respectively, and prediction set correlation coefficients (Rp) and RMSEP ranging from 0.783 to 0.834 and 0.894 to 1.063, respectively. Figure 10 (a) and (b) illustrate the optimal moisture content prediction models based on the μa and μ′s spectra, respectively.

Table 4 Prediction results of MC based on models using different preprocessing methods.

Among the PLSR models established with the three preprocessing methods, those constructed using Baseline preprocessing exhibited optimal performance, consistent with the findings of Liu et al.30, where a Baseline-1D-CNN model based on μa demonstrated superior predictive efficacy for SSC. Baseline preprocessing aims to mitigate the influence of instrument background or drift, while MSC and SNV aim to eliminate scattering effects arising from uneven particle distribution or differences in particle size. Because baseline drift occurred during accuracy validation of the integrating sphere system, Baseline correction effectively mitigated its impact on the measurements, thereby significantly enhancing model performance. Our PLSR model for SSC (Rp = 0.921) outperformed that of Fang et al.28, who reported R² = 0.833 for apple SSC prediction using absorption coefficients. This highlights the advantage of integrating preprocessing (e.g., Baseline correction) to mitigate baseline drift in citrus tissues. For moisture content (MC), the performance of our model (Rp = 0.906) is comparable to that of Liu et al.30, who achieved R² = 0.938 for nectarine SSC using a 1D-CNN but at a higher computational cost.

Among the 16 prediction models, whether predicting SSC or MC, models established based on the μa spectrum consistently demonstrated superior performance, indicating that SSC and MC primarily influence the absorption characteristics of Orah mandarin pulp tissue. The absorption coefficient (μa) model performs better than the reduced scattering coefficient (μ′s) model, possibly because the two coefficients differ in their sensitivity to tissue microstructure and chemical composition. The absorption coefficient is more directly related to the chemical composition of the tissue, for example, the presence of pigments and water, which have a significant effect on the internal quality of the citrus. In contrast, the reduced scattering coefficient is more influenced by the tissue microstructure, which may not have a direct and strong correlation with internal quality parameters such as MC and SSC. These results further elucidate the correlation between optical property parameters and internal quality.

Fig. 9

Scatter plot of SSC prediction model based on μa and μ′s.

Fig. 10

Scatter plot of MC prediction model based on μa and μ′s.

Conclusions

The relationship between the optical properties and internal quality of Orah mandarin during storage was studied. Significant differences in internal quality were observed across different storage periods. Using an SIS system combined with the IAD algorithm, we measured the optical properties of various tissues (pulp, endocarp, exocarp) of Orah mandarins in the 500–1050 nm range during storage. The μa and μ′s of the various tissues exhibited consistent waveforms but varied in magnitude. To further explore the relationship between optical properties, SSC, and MC, PLSR models for SSC and MC were established based on the μa and μ′s spectra using chemometric techniques. The models based on the μa spectrum demonstrated optimal performance. For SSC prediction, Rp and RMSEP ranged from 0.842 to 0.921 and 0.549 to 0.754, respectively. For MC prediction, Rp and RMSEP ranged from 0.798 to 0.906 and 0.636 to 0.903, respectively. These results suggest that SSC and MC primarily influence the absorption characteristics of Orah mandarins, further elucidating the relationship between optical property parameters and internal quality, and providing a theoretical basis for optical detection technology. This study mainly focused on optical properties and internal quality parameters and did not explore the anatomical characteristics of the fruit tissue in detail. The relatively short six-day storage period only allowed for observation of short-term changes. Additionally, the study used a specific set of samples and conditions. Future research should correlate optical properties with anatomical features such as pigment distribution and cell structure to better understand the underlying mechanisms. Extending the storage period could provide more insights into long-term effects on optical properties and internal quality. To enhance the generalizability of the findings, future studies should consider a wider range of samples and storage conditions.