Introduction

Diffusion-weighted magnetic resonance imaging (DW-MRI) characterizes the diffusivity of water molecules in vivo1. Intravoxel incoherent motion (IVIM) analysis has grown in popularity because it characterizes both diffusion and perfusion related information without any exogenous contrast administration2,3. In IVIM analysis, a bi-exponential (BE) model2,3 is commonly used to estimate quantitative IVIM parameters related to tissue diffusion and perfusion. As restriction of free water diffusion is associated with the tissue microstructure and its architecture, it can be used to characterize tumors and therapeutic changes in the tumor microenvironment well before they are detectable by standard qualitative imaging assessments1.

IVIM models have shown promising outcomes in oncological evaluations compared to conventional DW-MRI, as they provide additional biomarkers related to perfusion4,5. However, the clinical utility of IVIM analysis has not reached its full potential because the optimization methods used to quantify IVIM parameters are not standardized, owing to model non-linearity and, to a large extent, the low signal-to-noise ratio of diffusion MRI5. Reliable estimation of IVIM parameters is therefore challenging for conventional IVIM methods such as the BE model and its segmented variants, as these employ voxel-wise parameter estimation that overlooks the spatial physiological context within a tissue. Studies assessing these methods using clinical datasets of liver6,7,8,9,10,11,12, breast13,14,15, brain16,17, head and neck18,19 etc.20,21,22,23,24 have reported that reproducibility and repeatability of diffusion parameters were satisfactory; however, estimated perfusion parameters were not adequately reliable for further clinical applications25,26. Bayesian27,28,29 and more recent deep learning30,31 based IVIM methods have shown better reproducibility and repeatability than the BE and segmented BE methods; however, these also involve individual voxel-wise parameter estimation with a very high computational cost.

A few studies have sought to mitigate these issues by introducing a spatial homogeneity constraint into the IVIM model. Freiman et al.32, Taimouri et al.33, and Lanzarone et al.34 developed spatially constrained Bayesian-based IVIM methods using a spatial homogeneity prior and a conditional autoregressive prior, respectively. Both research groups reported better parameter estimation than the standard BE model33,34 and the voxel-wise Bayesian-based IVIM method34. However, Bayesian-based IVIM methods are computationally expensive, have been observed to fail in dealing with high parameter uncertainty in certain tissue areas29, and may be affected by the a priori selection of the prior distribution and central tendency of parameter estimates28. Kim et al.35 proposed diffusion-relaxation correlation spectroscopic imaging, which encodes diffusion and relaxation information and estimates a multidimensional spatially regularized spectrum to model the multiexponential diffusion MR signal. Limitations of this technique include the need for a priori knowledge of the diffusion characteristics of each compartment in the tissue microstructure and the requirement of substantially more data than conventional methods, leading to long data acquisition times and resource demands that may not be practical for routine clinical practice. Recently, Rauh et al.36 and Finkelstein et al.37 proposed model-based reconstruction of IVIM parameter maps and showed improved results over the conventional BE method. As these approaches directly use complex k-space data to estimate parametric maps, reconstruction of individual DW images can be omitted and Rician noise bias can be prevented; however, these techniques require a priori knowledge of the spatiotemporal subspace of magnetization dynamics and involve long computation times.

On the other hand, spatial penalty-based IVIM methods, which incorporate a spatial homogeneity constraint into the optimization routine of the BE model, are computationally less expensive and have shown comparable improvement in parameter estimation38, and thus may be an alternative choice. Lin et al.39 incorporated a Total difference penalty into the BE model, with tuning of several free parameters and prior segmentation, and showed improved parameter estimation in multiple myeloma lesions39. However, they did not report the reproducibility or repeatability of their method. Baidya Kayal et al.38 developed two spatial penalty-based IVIM analysis methods: (1) the bi-exponential model with an adaptive Total Variation (TV) penalty function (BE + TV) and (2) the bi-exponential model with an adaptive Huber penalty function (HPF) (BE + HPF). The BE + TV and BE + HPF methods38 are highly adaptive and simple, using only a regularizing parameter, and do not involve any prior estimates, extra free parameters, or hyper-parameters, which makes these methods more robust. Using simulations as well as clinical data of osteosarcoma38,40,41,42, Ewing sarcoma43, lymphoma44, brain tumor45 and prostate46, these methods showed both qualitatively and quantitatively improved IVIM parameters, especially the perfusion related parameters, compared with the conventional BE and segmented BE methods. IVIM analysis by the spatially constrained BE + TV and BE + HPF methods has been reported to be informative for characterizing and monitoring treatment response in various tumors and may be promising for cancer imaging38,40,41,43,44,45,46. However, the reproducibility of these two methods needs to be evaluated before their wider adoption in the clinical routine.
Therefore, the objective of this study was to assess these spatially constrained IVIM analysis methods (BE + TV and BE + HPF) in terms of precision and reproducibility, in comparison to the conventional methods for IVIM parameter estimation, using a clinical dataset of osteosarcoma.

Materials and methods

Patient population

A total of 55 patients with biopsy-proven osteosarcoma were recruited prospectively under an institutional ethics committee approved protocol at the All India Institute of Medical Sciences, New Delhi, India. The complete study protocol and all procedures were explained, and written consent was obtained from all patients before participation. All research was performed in accordance with relevant guidelines and regulations. All patients were planned for neoadjuvant chemotherapy (3 cycles of Cisplatin and Doxorubicin, one every 3 weeks) followed by surgery. Forty patients (N = 40; Age = 17.7 ± 5.9 years; Male:Female = 30:10) who completed the chemotherapy regimen were analyzed for this study. The datasets of the remaining fifteen patients were excluded due to death (n = 8), early surgery (n = 3) or drop-out (n = 4) during chemotherapy. Patients underwent MRI for evaluation of the primary tumor site, and chest CT and bone scans for metastatic follow-up.

MRI acquisition

MRI acquisition was performed at three time-points: (i) baseline, t0; (ii) after the 1st cycle of chemotherapy, t1 (15–20 days after t0); and (iii) after the 3rd cycle of chemotherapy, t2 (70–75 days after t0). MRI acquisition was performed using a 1.5 T Philips Achieva® MR scanner with a phased-array surface coil or an extremity coil. The MRI acquisition protocol included T1-weighted (T1W), T2-weighted (T2W) and IVIM DW-MRI sequences and is elaborated in Table 1. A standard MRI protocol was followed, and images were acquired in three planes (axial, sagittal, and coronal). IVIM DW-MRI was acquired at all three time-points, whereas T1W and T2W images were acquired at time-points t0 and t2.

Table 1 MRI acquisition protocol.

Volume of interest localization for tumor and healthy tissue volume

Tumor volumes at the three time-points were delineated using volumes of interest (VOI) drawn manually by an expert radiologist (D.K.; > 14 years of experience in cancer imaging) covering the whole tumor on b = 800 s/mm2 DWI images, with reference to the morphological T1W and T2W images. For delineating the healthy tissue volume, a VOI of equal shape and size (10 × 10 mm2 across 5 slices) was placed at equivalent positions in healthy muscle tissue (at a safe distance from the tumor) at all three time-points.

Quantitative parameter estimation

MRI datasets were analyzed at all three time-points using five analysis methods for comparison: three conventional methodologies, (1) BE method with three-parameter fitting (BE), (2) Segmented BE with two-parameter fitting (BESeg-2) and (3) Segmented BE with one-parameter fitting (BESeg-1); and two penalty-based methodologies, (4) BE with adaptive Total-Variation penalty (BE + TV) and (5) BE with adaptive Huber penalty (BE + HPF). The five IVIM analysis methods are briefly described in the Appendix. Nonlinear Least Square (NNLS) optimization using a Trust Region-based algorithm was used for all the methods. Initial values of 0.001, 0.01 and 0.3 were used by all five analysis methods for the diffusion coefficient (D), perfusion coefficient (D*) and perfusion fraction (f), respectively. The apparent diffusion coefficient (ADC) was also estimated using a mono-exponential (ME) model with b-values ≥ 200 s/mm2 for completeness. From the estimated parametric maps of ADC, D, D* and f, parameter values in the whole tumor and healthy tissue volumes were extracted using the VOIs at time-points t0, t1 and t2 for each patient and analyzed further. Quantitative IVIM analysis was performed for all five analysis methods using an in-house toolbox built in MATLAB® (MathWorks Inc., v2017, Philadelphia, USA).
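As an illustration of the signal models named above, the following minimal Python sketch evaluates the standard bi-exponential IVIM signal and estimates ADC from the b-values ≥ 200 s/mm2. It is not the in-house MATLAB toolbox: the b-value list is hypothetical (Table 1 is not reproduced here), the initial values quoted in the text are reused only as example tissue parameters, and a log-linear least-squares fit is assumed in place of the trust-region optimization.

```python
import math

def ivim_signal(b, D, D_star, f, S0=1.0):
    """Standard bi-exponential IVIM signal at b-value b (s/mm^2)."""
    return S0 * (f * math.exp(-b * D_star) + (1.0 - f) * math.exp(-b * D))

def adc_loglinear(b_values, signals, b_min=200.0):
    """Mono-exponential ADC from a least-squares line through ln(S) vs b,
    using only b >= b_min (assumed fitting strategy, for illustration)."""
    pts = [(b, math.log(s)) for b, s in zip(b_values, signals) if b >= b_min]
    n = len(pts)
    mean_b = sum(b for b, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((b - mean_b) ** 2 for b, _ in pts)
    sxy = sum((b - mean_b) * (y - mean_y) for b, y in pts)
    return -sxy / sxx  # the slope of ln(S) versus b is -ADC

# Hypothetical b-values; the study's actual list is given in its Table 1.
b_values = [0, 10, 20, 50, 100, 200, 400, 800]
# Example tissue parameters (borrowed from the initial guesses in the text).
D0, D_star0, f0 = 0.001, 0.01, 0.3
signals = [ivim_signal(b, D0, D_star0, f0) for b in b_values]
adc = adc_loglinear(b_values, signals)
print(f"ADC = {adc:.5f}, D = {D0:.5f}")
```

In this noise-free example the ADC comes out slightly above D, because a small perfusion contribution remains in the signal at b = 200 s/mm2.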

Statistical analysis

Assessment of goodness of fit

The coefficient of determination (R2) was calculated to measure the goodness of fit using the sum of squared errors, as defined in (1):

$${R}^{2}=1-\frac{{\sum }_{b}{\sum }_{i=1}^{n}{\left({E}_{b,i}-{O}_{b,i}\right)}^{2}}{{\sum }_{b}{\sum }_{i=1}^{n}{\left({E}_{b,i}-{\overline{E} }_{i}\right)}^{2}},$$
(1)

where the outer sum runs over the b-values of the IVIM-DWI acquisition, n is the number of voxels in the volume of interest, \({O}_{b,i}\) is the IVIM-DWI signal after reverse fitting, \({E}_{b,i}\) is the actual IVIM-DWI signal and \({\overline{E} }_{i}\) is the average IVIM-DWI signal over all b-values. R2 lies in the range 0 ≤ R2 ≤ 1, where R2 ≈ 0 indicates a poor estimation and R2 ≈ 1 represents good agreement between the actual and the fitted signal. R2 was evaluated for the ME and five IVIM analysis methods in the whole tumor and healthy tissue volumes at three time-points. Data fitting curves in tumor and healthy tissue ROIs of size 5 × 5 voxels were generated for a representative patient at three time-points for qualitative assessment of all the IVIM analysis methods.
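Equation (1) can be sketched in a few lines of Python. The nested-list layout (one row per voxel, one column per b-value) and the signal values are assumptions made for illustration only.

```python
def r_squared(measured, fitted):
    """R^2 of Eq. (1). measured[i][k] and fitted[i][k] hold the actual (E)
    and fitted (O) signal of voxel i at the k-th b-value."""
    ss_res = 0.0  # sum over b and i of (E - O)^2
    ss_tot = 0.0  # sum over b and i of (E - mean_b(E_i))^2
    for e_vox, o_vox in zip(measured, fitted):
        e_mean = sum(e_vox) / len(e_vox)  # average signal of this voxel over b-values
        ss_res += sum((e - o) ** 2 for e, o in zip(e_vox, o_vox))
        ss_tot += sum((e - e_mean) ** 2 for e in e_vox)
    return 1.0 - ss_res / ss_tot

# Hypothetical normalized signals for two voxels at four b-values.
measured = [[1.00, 0.80, 0.60, 0.40], [1.00, 0.70, 0.50, 0.30]]
fitted = [[0.98, 0.81, 0.61, 0.41], [1.01, 0.72, 0.49, 0.29]]
r2 = r_squared(measured, fitted)
print(f"R2 = {r2:.4f}")
```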

Assessment of precision and reproducibility

Precision and reproducibility of the IVIM analysis methods were assessed using the within-subject coefficient of variation (wCV) and the between-subject coefficient of variation (bCV), defined in (2) and (3) respectively, as reported in Refs.7,9,13,33.

$$wCV=100\times \frac{\sigma \left({p}_{1}, {p}_{2}, \dots , {p}_{n}\right)}{\mu \left({p}_{1}, {p}_{2}, \dots , {p}_{n}\right)},$$
(2)

where σ is standard deviation, µ is mean and p stands for any of the quantitative parameter (ADC, D, D* & f) values at n number of voxels in the volume of interest.
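A minimal Python sketch of Eq. (2) follows. The sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was used, and the voxel values are hypothetical.

```python
import statistics

def wcv(values):
    """Within-subject coefficient of variation (%) of one parameter over the
    n voxels of a VOI, Eq. (2). Sample standard deviation is assumed."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

# Hypothetical D values (in units of 10^-3 mm^2/s) for five voxels.
voxel_values = [1.0, 1.2, 0.8, 1.1, 0.9]
w = wcv(voxel_values)
print(f"wCV = {w:.2f}%")
```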

$$bCV=100\times \frac{\mu \left({\sigma }_{1}\left({p}_{1}, {p}_{2}, \dots , {p}_{n}\right)+{\sigma }_{2}\left({p}_{1}, {p}_{2}, \dots , {p}_{n}\right)+\dots +{\sigma }_{N}\left({p}_{1}, {p}_{2}, \dots , {p}_{n}\right)\right)}{\mu \left({\mu }_{1}\left({p}_{1}, {p}_{2}, \dots , {p}_{n}\right)+{\mu }_{2}\left({p}_{1}, {p}_{2}, \dots , {p}_{n}\right)+\dots +{\mu }_{N}\left({p}_{1}, {p}_{2}, \dots , {p}_{n}\right)\right)},$$
(3)

where σ is standard deviation, µ is mean, p stands for any of the quantitative parameter (ADC, D, D* & f) values at n number of voxels in the volume of interest and N is total number of subjects.
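Equation (3) can be read as the mean of the per-subject standard deviations divided by the mean of the per-subject means. The sketch below follows that reading, which is an interpretation of the notation rather than a confirmed detail of the implementation used in the study; the sample standard deviation and the input values are likewise assumptions.

```python
import statistics

def bcv(subject_values):
    """Between-subject coefficient of variation (%), Eq. (3), read as the
    mean of per-subject SDs over the mean of per-subject means."""
    sds = [statistics.stdev(v) for v in subject_values]
    means = [statistics.mean(v) for v in subject_values]
    return 100.0 * statistics.mean(sds) / statistics.mean(means)

# Hypothetical voxel values for N = 3 subjects (VOI sizes may differ).
cohort = [
    [1.0, 1.2, 0.8],
    [2.0, 2.2, 1.8],
    [1.5, 1.5, 1.8, 1.2],
]
b_cv = bcv(cohort)
print(f"bCV = {b_cv:.2f}%")
```

Because the factor 1/N appears in both numerator and denominator, the ratio is the same whether the per-subject terms are summed or averaged.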

The lower the values of wCV and bCV, the better the precision and reproducibility, respectively. All quantitative values were tested for normality before comparison. Absolute mean and wCV values of the IVIM parameters (D, D* and f) were estimated in tumor and healthy tissue volume using the five IVIM methods for all patients across three time-points. Intra-scan means and wCV of each quantitative parameter were compared among the five IVIM analysis methods using the Friedman test, in tumor and healthy tissue volume separately, at each time-point for statistical significance (p < 0.05). Post-hoc analysis with Wilcoxon signed-rank (WSR) tests was carried out with a Bonferroni correction for multiple comparisons (10 comparisons), which set the significance level at p < 0.05/10 = 0.005.
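The count of 10 comparisons follows from pairing the five IVIM methods, and the corrected threshold from dividing the nominal level by that count. A short Python check of this arithmetic:

```python
from itertools import combinations

methods = ["BE", "BESeg-2", "BESeg-1", "BE + TV", "BE + HPF"]
pairs = list(combinations(methods, 2))  # all unordered method pairs

alpha = 0.05
alpha_corrected = alpha / len(pairs)  # Bonferroni-corrected significance level

print(len(pairs), alpha_corrected)
```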

To assess the repeatability of the penalty-based IVIM methodologies, a VOI in healthy muscle tissue at a safe distance from the tumor was selected as the internal control at each of the three time-points, and inter-scan comparisons of mean and wCV values of the IVIM parameters in healthy muscle were performed across the three time-points. Bland–Altman limits-of-agreement analysis was applied to test the scan-rescan repeatability of the five IVIM analysis methods by comparing the estimated IVIM parameters in healthy muscle across time-points. Inter-scan comparisons of mean IVIM parameters (D, D* and f) in whole tumor were also performed for completeness (detailed descriptions of the inter-scan comparisons in healthy muscle and tumor are in the Supplementary Materials).
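The Bland–Altman quantities used here (bias and 95% limits of agreement) can be sketched as follows. The paired values are hypothetical, and the conventional bias ± 1.96 SD definition is assumed, as the text does not spell out the exact formula.

```python
import statistics

def bland_altman(scan_a, scan_b):
    """Bias and 95% limits of agreement between paired measurements,
    using the conventional bias +/- 1.96 * SD of the differences."""
    diffs = [a - b for a, b in zip(scan_a, scan_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical mean D in healthy muscle (10^-3 mm^2/s) for five patients.
d_t0 = [1.50, 1.42, 1.61, 1.55, 1.48]
d_t1 = [1.47, 1.45, 1.58, 1.60, 1.46]
bias, loa_lower, loa_upper = bland_altman(d_t0, d_t1)
print(f"bias = {bias:.3f}, LoA = [{loa_lower:.3f}, {loa_upper:.3f}]")
```

A bias close to zero with narrow limits indicates good scan-rescan agreement for the parameter in question.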

Statistical analyses were performed using SPSS v16.0 software (IBM Corporation) and MATLAB® (MathWorks Inc., v2017, Philadelphia, USA).

Results

Quantitative parameter estimation in tumor and healthy tissue

Table 2 shows the mean ADC and IVIM parameters along with the corresponding R2 values at three time-points in tumor and healthy tissue volume. D, D* and f values estimated using the five IVIM analysis methods were consistently statistically different (Friedman test; p < 0.001) in both tumor and healthy muscle. However, the post-hoc WSR test revealed that D was not statistically different (p > 0.005) among all methods except BE, while D* and f for BE + TV and BE + HPF were significantly different (p < 0.005) compared to the other methods at all time-points. BE + TV and BE + HPF methods showed no significant differences (p > 0.005) from each other in any of the estimated parameters. R2 values for data fitting in tumor and healthy tissue were in the ranges 0.89–0.90; 0.90–0.92; 0.90–0.92; 0.90–0.93; and 0.90–0.93 for BE, BESeg-2, BESeg-1, BE + TV and BE + HPF methods respectively.

Table 2 Average apparent diffusion coefficient (ADC), diffusion coefficient (D), perfusion coefficient (D*) and perfusion fraction (f) in tumor volume and healthy tissue volume evaluated by mono-exponential (ME) method and five IVIM analysis methods (1) BE method with three-parameter fitting (BE), (2) segmented BE with two-parameter fitting (BESeg-2), (3) segmented BE with one-parameter fitting (BESeg-1), (4) BE with adaptive Total-Variation penalty (BE + TV) and (5) BE with adaptive Huber penalty (BE + HPF) at three time-points.

Figure 1 shows the point plots for the average ADC and IVIM parameter values evaluated by the five analysis methods at three time-points in tumor and healthy tissue. Across time-points, a significant increase in D values in tumor was observed for all methods, while D values were consistent in healthy muscle. Only the BE + TV and BE + HPF methods showed a significant reduction in D* values in tumor across time-points, whereas D* values were consistent in healthy muscle. None of the methods showed significant changes in f values in tumor; however, BESeg-2 and BESeg-1 showed significant changes in f values in healthy muscle tissue across time-points. Inter-subject variations in estimated D* and f were higher using the BE, BESeg-2 and BESeg-1 methods compared to the penalty-based BE + TV and BE + HPF methods.

Fig. 1

Point plots for average (A) Apparent diffusion coefficient (ADC) evaluated by mono-exponential (ME) method and IVIM parameters (B) Diffusion coefficient (D), (C) perfusion coefficient (D*) and (D) perfusion fraction (f) in patient cohort evaluated by five IVIM analysis methods (1) Bi-exponential (BE) method with three-parameter fitting (BE), (2) segmented BE with two-parameter fitting (BESeg-2), (3) Segmented BE with one-parameter fitting (BESeg-1), (4) BE with adaptive Total-Variation penalty (BE + TV) and (5) BE with adaptive Huber penalty (BE + HPF) at three time-points (t0, t1 and t2) in tumor and healthy tissue volume.

Figure 2 shows the IVIM parametric maps produced by all five analysis methods for a representative patient at time-point t0. For D* and f, both BE + TV and BE + HPF showed low spurious noise, better image quality and overall improved interpretability of anatomical regions in the parametric maps. In contrast, the BE, BESeg-2 and BESeg-1 methods showed comparative overestimation with large variability and noise in the D* maps (yellow arrows); in the f maps, BE and BESeg-2 showed overestimation while BESeg-1 showed underestimation (white arrows). Qualitatively, estimation of D was comparably good for all four methods other than BE (red arrows).

Fig. 2

(a) DWI (b = 800 s/mm2); (b) DWI with ROIs for tumor (red overlay) and healthy tissue (blue overlay); (c) apparent diffusion coefficient (ADC), of a representative patient (Male, 15 years) with osteosarcoma in right femur at time-points t0. IVIM parametric maps estimated by five IVIM analysis methodologies (1) Bi-exponential (BE) method with three-parameter fitting (BE), (2) Segmented BE with two-parameter fitting (BESeg-2), (3) Segmented BE with one-parameter fitting (BESeg-1), (4) BE with adaptive Total-Variation penalty (BE + TV) and (5) BE with adaptive Huber penalty (BE + HPF) at time-points t0 are depicted in (d–h) Diffusion coefficient (D), (i–m) Perfusion coefficient (D*) and (n–r) Perfusion fraction (f) respectively. Red, yellow and white arrows indicate the tumor regions in D, D* and f maps respectively.

IVIM parametric maps across three time-points (for the same patient in Fig. 2) are shown in Supplementary Fig. 1. Across the time-points during treatment, in tumor region qualitatively an increase in D values can be observed from t0 to t1 and t2 time-points (red arrows) for all the methods; however, an increase in D* values after 1st chemotherapy cycle (time-point t1) and reduction of D* after completion of chemotherapy (time-point t2) can be appreciated only for BE + TV and BE + HPF methods (yellow arrows), while f values in tumor did not show changes across time-points (white arrows). In healthy muscle region, qualitatively IVIM parameters were consistent for BE + TV and BE + HPF methods across time-points (pink arrows).

Figure 3 demonstrates the IVIM DWI data fitting performed by all five analysis methods in tumor and healthy muscle ROIs from the same representative patient as in Fig. 2. Qualitatively, the fitting curves for BE, BESeg-2 and BESeg-1 were suboptimal compared with those of the BE + TV and BE + HPF methods in both tumor and healthy tissue at the three time-points. The signal attenuation curve for healthy tissue was flatter than that for tumor at lower b-values (0–100 s/mm2), indicating low perfusion in healthy tissue.

Fig. 3

Data fitting curves in tumor and healthy tissue ROIs (same DWI MRI slice as in Fig. 2) by five IVIM analysis methods (1) Bi-exponential (BE) method with three-parameter fitting (BE), (2) segmented BE with two-parameter fitting (BESeg-2), (3) segmented BE with one-parameter fitting (BESeg-1), (4) BE with adaptive Total-Variation penalty (BE + TV) and (5) BE with adaptive Huber penalty (BE + HPF) for a representative patient (Male, 15 years) with osteosarcoma in right femur at three time-points: (A) Baseline (t0); (B) 1st Follow-up (t1); and (C) 2nd Follow-up (t2). For A and C: (i) T2W fat-saturated MRI; (ii) DWI (b = 800 s/mm2) with ROIs for tumor (red outline) and healthy tissue (blue outline); (iii) Data fitting in tumor and (iv) Data fitting in healthy tissue by five IVIM analysis methods. For B: (i) DWI (b = 800 s/mm2) with ROIs for tumor (red outline) and healthy tissue (blue outline); (ii) Data fitting in tumor and (iii) Data fitting in healthy tissue by five IVIM analysis methods. In all the plots, the X-axis shows b-values (0–800 s/mm2) and the Y-axis shows relative signal intensity. Fitting curves in the range b-value = 0–100 s/mm2 are enlarged in the inset.

Assessment of reproducibility

Table 3 presents the average wCV and bCV values in tumor and healthy tissue volume for the five IVIM analysis methodologies at three time-points. wCV values for D, D* and f were statistically different (Friedman test; p < 0.001) among the five IVIM analysis methods in both tumor and healthy tissue at three time-points. BE + TV and BE + HPF methods showed significantly (p < 0.001, WSR test) lower wCV in estimating D (24–32%) in tumor and healthy tissue volume than the BE method (38–49%); however, they were comparable (p > 0.5, WSR test) to the BESeg-2 and BESeg-1 methods (23–32%). BE + TV and BE + HPF methods consistently showed significantly (p < 0.001, WSR test) lower wCV in estimating D* (89–108%) and f (55–60%) in both tumor and healthy tissue volumes than all the other methods (D*-wCV: 102–122% and f-wCV: 96–130%) across the three time-points. Figure 4 shows violin plots for the wCV values of ADC and IVIM parameters in tumor and healthy muscle in the patient cohort. Intra-subject variations in estimated D were higher for the BE method than for the other four methods, while intra-subject variations in estimated D* and f were higher using the BE, BESeg-2 and BESeg-1 methods compared to the penalty-based BE + TV and BE + HPF methods.

Table 3 Within-subject coefficient of variation (wCV) and between-subject coefficient of variation (bCV) of apparent diffusion coefficient (ADC) and IVIM parameters diffusion coefficient (D), perfusion coefficient (D*) and perfusion fraction (f) in tumor volume and healthy tissue volume evaluated by Mono-exponential (ME) method and five IVIM analysis methodologies (1) BE method with three-parameter fitting (BE), (2) Segmented BE with two-parameter fitting (BESeg-2), (3) Segmented BE with one-parameter fitting (BESeg-1), (4) BE with adaptive Total-Variation penalty (BE + TV) and (5) BE with adaptive Huber penalty (BE + HPF) at three time-points.
Fig. 4

Violin plots of within-subject coefficient of variation (wCV) values for apparent diffusion coefficient (ADC) and IVIM parameters diffusion coefficient (D), perfusion coefficient (D*) and perfusion fraction (f) in patient cohort evaluated by mono-exponential (ME) method and five IVIM analysis methods (1) Bi-exponential (BE) method with three-parameter fitting (BE), (2) Segmented BE with two-parameter fitting (BESeg-2), (3) Segmented BE with one-parameter fitting (BESeg-1), (4) BE with adaptive Total-Variation penalty (BE + TV) and (5) BE with adaptive Huber penalty (BE + HPF) at three time-points in tumor and healthy muscle tissue volume.

BE + TV and BE + HPF both demonstrated overall lower variation for estimating D* and f in tumor (D*-bCV: 83–102%; f-bCV: 56–60%) and healthy muscle (D*-bCV: 91–98%; f-bCV: 56–58%) than the other methods (D*-bCV: 98–114%; f-bCV: 94–125%). Variations of estimated D by BE + TV and BE + HPF (D-bCV: 27–31% in tumor; D-bCV: 22–25% in muscle) were lower than BE (D-bCV: 40–46% in tumor; D-bCV: 36–43% in muscle), however, were comparable with BEseg-2 and BEseg-1 methods (D-bCV: 26–31% in tumor; D-bCV: 22–25% in muscle).

Detailed results of inter-scan comparisons in healthy muscle and tumor are in the Supplementary Materials. Inter-scan comparisons in healthy muscle demonstrated that mean and wCV values of IVIM parameters estimated by the BE + TV and BE + HPF methods were not significantly different (p > 0.1) across the three time-points (Supplementary Table 1). In the Bland–Altman analysis, BE + TV and BE + HPF demonstrated satisfactory agreement for estimating D, D* and f in healthy muscle across time-points, and showed lower bias and narrower limits of agreement than the other three methods for estimating D* and f, and than the BE method for estimating D (Supplementary Table 2). Bland–Altman agreement for all five methods for the estimated D, D* and f parameters in healthy muscle from time-points t0 to t1 is presented in Fig. 5; agreement from time-points t0 to t2 and from time-points t1 to t2 is shown in Supplementary Figs. 2 and 3 respectively. Inter-scan comparisons in tumor revealed a significant increase in mean D (p < 0.002) and a reduction in mean D* (p < 0.006) values in tumor after chemotherapy for the BE + TV and BE + HPF methods. No significant changes in f values in tumor were observed across time-points.

Fig. 5

Bland–Altman plots showing inter-scan agreement of Diffusion coefficient (D) (1st column), Perfusion coefficient (D*) (2nd column), and Perfusion fraction (f) (3rd column) between time-points t0 and t1 estimated by five IVIM analysis methods (1) Bi-exponential (BE) method with three-parameter fitting (BE) (a–c respectively), (2) Segmented BE with two-parameter fitting (BESeg-2) (d–f respectively), (3) Segmented BE with one-parameter fitting (BESeg-1) (g–i respectively), (4) BE with adaptive Total-Variation penalty (BE + TV) (j–l respectively) and (5) BE with adaptive Huber penalty (BE + HPF) (m–o respectively) in healthy tissue volume.

Discussion

Quantitative IVIM analysis is sensitive to both tissue diffusion and perfusion parameters simultaneously, without any external administration of a contrast agent. It is a safe and very useful imaging modality for cancer assessment and response evaluation3,4,5,40,41,42. However, spurious noise in the estimation of perfusion related IVIM parameters by conventional analysis methods has prevented this promising technique from being effectively applied in routine clinical practice5,25,26. IVIM analysis methods with spatial homogeneity constraints have been shown to mitigate this issue by producing qualitatively and quantitatively improved parameter estimation of reasonably higher quality than the conventional individual voxel-based analysis methods32,33,34,38,39,43,45,46.

Several studies have developed spatially constrained IVIM analysis methods based on a Bayesian approach32,33,34, a spatial penalty approach38,39 and model-based IVIM reconstruction36,37 to improve parameter estimation, and reported better precision and reproducibility than the conventional IVIM methods that apply voxel-wise data fitting. The Bayesian approach to IVIM parameter estimation employs an a priori distribution for each parameter and maximizes the posterior probability associated with the observed DW MRI signal. A spatial homogeneity prior is incorporated into the Bayesian-based IVIM method by using a continuous Markov random field model32,33 or a mixture prior of a Conditional Autoregressive specification with an independent truncated Gaussian prior distribution34 to obtain the posterior density of the parameters. In both cases, a parameter weighting matrix is defined to specify the scaling factor applied to the parameters in the employed neighborhood system. While the former method proposed a binary graph cut based fusion bootstrap technique32,33, the latter approach followed a Hamiltonian Monte Carlo algorithm34 to maximize the posterior probability for the IVIM parameters. Therefore, spatially regularized Bayesian-based IVIM methods involve complex algorithms, extensive a priori distribution and hyperparameter settings that demand high computational time. Moreover, these methods have also been observed to fail in dealing with high parameter uncertainty in certain tissue areas29 and may be affected by the a priori selection of the prior distribution and central tendency of parameter estimates28. In model-based IVIM reconstruction methods, IVIM parameters are directly estimated from raw k-space data using an iterative nonlinear optimization with a regularizing term reflecting a priori knowledge of the unknown parameters36,37.
These methods are computationally expensive and may depend on many vendor-specific settings such as complex k-space data, coil sensitivity profiles, signal averaging techniques, and various acquisition hardware configurations. Therefore, the generalizability of these methods needs to be tested across different vendors in more extensive studies. On the other hand, the spatial penalty-based BE + TV and BE + HPF methods are simple, adaptive and computationally less expensive, because both methods involve NNLS optimization along with an adaptive spatial homogeneity penalty with a single regularizing parameter to control the amount of spatial homogeneity enforced. TV/HPF penalty functions are image gradient based denoising methods that reduce noise present in flat regions while preserving finer details and edges in the images, and are advantageous over postprocessing denoising techniques. In the BE + TV and BE + HPF methods, NNLS optimization is performed for data fitting, and the desired spatial homogeneity and spurious noise reduction in the parametric images are achieved by updating the parametric images iteratively with the adaptive TV/HPF penalty at each iteration of the NNLS optimization. Thus, NNLS error and TV penalty reduction were adaptively and simultaneously balanced as the solution progressed in a reliable direction, reducing non-physiological spatial inhomogeneity and noise38. To our knowledge, this is the first study to demonstrate the precision and reproducibility of the spatially constrained penalty-based IVIM methods using clinical data in oncology. In this study, the estimated wCV and bCV for both spatially constrained penalty-based BE + TV and BE + HPF methods were comparable with those of the spatially constrained Bayesian-based IVIM methods in healthy liver and spleen by Freiman et al.32 and Taimouri et al.33 and in head-and-neck and rectal cancer by Lanzarone et al.34.
Compared to the model-based reconstruction methods, the spatial penalty-based BE + TV and BE + HPF methods showed mean IVIM parameters in healthy muscle similar to the parameters estimated in skeletal calf muscle36, and lower bias for estimated perfusion related parameters in brain37. Apart from IVIM methods, Yoo et al. applied spatial regularization by incorporating a non-local means algorithm to improve myelin water fraction measurement47. The reported CV values were comparable with the wCV of D* values estimated by the BE + TV and BE + HPF methods; however, they cannot be directly compared with the current study.
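To make the role of the gradient-based penalties discussed above concrete, the following Python sketch evaluates generic textbook forms of a total variation penalty and a Huber penalty on a small 2D parametric map. It does not reproduce the adaptive penalties of Ref.38: the anisotropic (absolute-difference) form of TV, the Huber threshold `delta` and the toy maps are all assumptions made for illustration. In a penalized fit, such a term would typically be added to the data-fitting error with a weight set by the regularizing parameter.

```python
def neighbor_differences(img):
    """Forward differences between horizontally and vertically adjacent pixels."""
    rows, cols = len(img), len(img[0])
    diffs = []
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                diffs.append(img[r][c + 1] - img[r][c])
            if r + 1 < rows:
                diffs.append(img[r + 1][c] - img[r][c])
    return diffs

def tv_penalty(img):
    """Anisotropic total variation: sum of absolute neighbor differences."""
    return sum(abs(d) for d in neighbor_differences(img))

def huber_penalty(img, delta):
    """Huber penalty: quadratic for |d| <= delta, linear beyond it."""
    total = 0.0
    for d in neighbor_differences(img):
        a = abs(d)
        total += 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)
    return total

# Toy 3 x 3 parameter maps: a homogeneous region and one with a noise spike.
flat = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
spike = [[1.0, 1.0, 1.0], [1.0, 3.0, 1.0], [1.0, 1.0, 1.0]]

print(tv_penalty(flat), tv_penalty(spike))
print(huber_penalty(flat, 1.0), huber_penalty(spike, 1.0))
```

Both penalties vanish on the homogeneous map and grow with the isolated spike, which is why minimizing them alongside the fitting error suppresses spurious voxel-wise noise; the linear growth for large differences is what limits the smoothing of genuine edges.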

In this study, both spatially constrained methods (BE + TV and BE + HPF) consistently showed improved precision and reproducibility across the time-points, with lower variations in both perfusion parameters across the cohort in osteosarcoma as well as healthy muscle, in comparison to the BE and segmented BE methods. Variation in estimated D in the healthy tissue cohort was lower than in the tumor tissue cohort, as tumors from different subjects were expected to exhibit variations in diffusion characteristics. However, bCV for the perfusion parameters (D*, f) was higher in healthy muscle than in tumor. A possible reason is the low perfusion in healthy tissue compared with the highly perfused osteosarcoma, which makes the optimization process challenging, as also reported in Refs.13,16. Higher variations in D* in tumor were observed at follow-up than at baseline, possibly because the microvascular environment of the tumor shows higher variability due to antiangiogenic chemotherapy40,41,42. Regarding precision and reproducibility in clinical settings, studies using the conventional BE and segmented BE methods showed comparatively higher bCV and wCV values in liver12, breast15, orbital lesion20, rectal cancer21 and ovarian cancer23, indicating lower reproducibility for D* and f than in the current study. On the other hand, the BE and BESeg-1 methods showed lower wCV or bCV values in healthy liver parenchyma7,8,11, breast13, healthy brain16, cerebrovascular diseases17 and pediatric solid tumor in head24 than the current study. As different types of tumors and anatomies exhibit a variety of tissue microstructure and heterogeneity, it is hard to directly compare IVIM analysis methods reported using different pathologies in terms of their precision and reproducibility.

In this study, comparatively improved inter-scan repeatability of the BE + TV and BE + HPF methods was also observed, with lower bias and narrower limits of agreement than the conventional IVIM methods, especially for perfusion-related parameters (Fig. 5, Supplementary Table 2). In particular, while no significant changes in estimated f values were observed in tumor across time-points with any of the IVIM methods, the BEseg-2 and BEseg-1 methods showed significant changes in f estimates in healthy muscle across time-points (Supplementary Table 1). On the other hand, the BE method showed significant differences in estimated D in healthy muscle tissue after chemotherapy, while the other four IVIM methods showed consistent D values in healthy muscle across time-points (Supplementary Table 1). The current findings showed that the repeatability of the diffusion parameter was better than that of the perfusion-related parameters, which is consistent with previous studies6,12,19.
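
For readers less familiar with bias and limits of agreement, the sketch below shows the standard Bland-Altman calculation for paired measurements: the bias is the mean of the paired differences, and the 95% limits of agreement are the bias plus and minus 1.96 times the sample standard deviation of those differences. The function name and the example values are hypothetical and are not taken from this study's data or code.

```python
import numpy as np


def bland_altman(scan1, scan2):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    scan1, scan2: equal-length 1-D sequences holding one parameter value
    per subject from each of the two scans being compared.
    """
    a = np.asarray(scan1, dtype=float)
    b = np.asarray(scan2, dtype=float)
    if a.shape != b.shape or a.ndim != 1 or a.size < 2:
        raise ValueError("expected two 1-D arrays of equal length >= 2")
    diff = b - a                      # paired differences (scan2 - scan1)
    bias = diff.mean()                # mean difference
    sd = diff.std(ddof=1)             # sample SD of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Hypothetical perfusion fraction f (%) in healthy muscle, two scans.
    f_scan1 = [10.0, 12.0, 14.0, 16.0]
    f_scan2 = [11.0, 11.0, 15.0, 17.0]
    bias, lower, upper = bland_altman(f_scan1, f_scan2)
    print(f"bias={bias:.2f}, limits of agreement=[{lower:.2f}, {upper:.2f}]")
```

A bias closer to zero and a narrower interval between the two limits both indicate better agreement between scans.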

One of the objectives of IVIM analysis in cancer is therapeutic response assessment. All IVIM analysis methods showed the expected increase in ADC and D values in tumor as an effect of neoadjuvant chemotherapy. However, only BE + TV and BE + HPF were sensitive to changes in perfusion parameters, showing a significant reduction of D* in tumor after chemotherapy. Therefore, the penalty-based BE + TV and BE + HPF methods were shown to be more sensitive to changes in both diffusion and perfusion parameters than the conventional IVIM methods, indicating their effectiveness for monitoring and evaluating therapeutic response in osteosarcoma, and potentially in other tumors that remain to be analysed. Response assessment using spatial penalty-based IVIM analysis is beyond the scope of this paper and was reported in Refs.40,42, and a comparison with conventional IVIM methods for predicting chemotherapy response in bone tumor is reported in Ref.48. The spatial penalty-based methods were consistent across the time-points, with lower wCV and bCV, and might perform reasonably well in treatment response assessment, which needs to be studied further. In the clinical context, qualitative rating of IVIM parametric maps by experienced radiologists showed comparatively high noise suppression and overall higher image quality and interpretability for the perfusion parameters (D* and f) estimated by the BE + TV and BE + HPF methods than for those estimated by the BE and segmented BE methods49. All the IVIM datasets used in this study were acquired on a 1.5T scanner. The spatial penalty TV-based IVIM method has previously shown similarly improved IVIM parameter estimation using different combinations of b-values46 and datasets acquired on different scanners50 (1.5 and 3T; Achieva, Philips Healthcare, Best, the Netherlands). This implies that an IVIM protocol using the spatial penalty-based analysis method is generalizable across variations in acquisition parameters and applicable to the widely installed base of MR systems globally.

The current study has a few limitations. Firstly, the clinical datasets were limited; the actual clinical impact of the spatial penalty-based methods may be evaluated in future in a larger and well-defined clinical patient population. Secondly, the clinical cohort with osteosarcoma is heterogeneous, involving different histological subtypes and degrees of angiogenesis; thus, variability would be expected. Moreover, as tumors exhibit variable responses after antiangiogenic chemotherapy, increased variability, especially in the perfusion-related parameters, would be expected. Therefore, it would have been meaningful to test the repeatability of the methods using same-day scan-rescan data acquired in healthy subjects. However, this could not be performed due to limited resources. Healthy muscle tissue at a safe distance from the tumor in the patients' datasets was used as an internal control for comparison; however, chemotherapy-related changes in the healthy tissue could not be ruled out. Thirdly, the commonly used BE method and its segmented variants have been compared with the spatial penalty-based methods; however, there are other IVIM analysis methods, such as Bayesian-based10,27,28,29, model-based IVIM reconstruction36,37, neural network18, deep learning30,31 and stretched exponential23,24 based methods, that could not be assessed and compared. Fourthly, in the current study the widely used biexponential IVIM model was applied assuming isotropic microvasculature; however, skeletal muscle exhibits anisotropy that can be quantified, as shown by Karampinos et al.51, and this was not considered in this study. Fifthly, mean and CV values of quantitative parameters in tumor and healthy muscle were compared among IVIM analysis methods. This may not be robust because of the partial volume effect arising from the resolution used with DWI. Lastly, validation and standardization of the penalty-based methods require a larger clinical study that can compare various analysis methodologies and reach a consensus through multicentric, multi-scanner clinical validation.

Conclusion

The bi-exponential model with a spatial regularization penalty function improved the estimation of both diffusion and perfusion components compared with other commonly used IVIM analysis methods. Spatial penalty-based IVIM analysis methods demonstrated lower variability in parameter estimation, supporting their precision and reproducibility in a clinical scenario.