Fig. 2: Comparison of HarmonizR to established missing value handling strategies for batch effect reduction. | Nature Communications

From: HarmonizR enables data harmonization across independent proteomic datasets with appropriate handling of missing values

Comparison of imputation (normal distribution, matrix-wise; normal distribution, column-wise; random forest) with the HarmonizR framework as missing value handling strategies for batch effect reduction, on K562 chronic myelogenous leukemia cells spiked with 10% yeast and 10% E. coli (phenotype 1) or 5% yeast and 15% E. coli (phenotype 2). Technical triplicates were measured in three different experimental setups (DDA data acquired on a QExactive mass spectrometer, DIA data acquired on a QExactive mass spectrometer, and SWATH data acquired on a Triple TOF 6600 mass spectrometer). a Schematic overview of executed strategies. b Volcano plots of the log2 fold change against the −log p-value for t-testing results between phenotype 1 and 2 for individual experiments (two-sample Student's t-test, p-value < 0.05). c Heatmap visualization of Pearson correlation-based hierarchical clustering with Ward.D linkage for all executed strategies. d Scatter plots and corresponding correlation coefficients for phenotype 1 samples, measured with similar (DDA, upper panels) and different (DDA; SWATH, lower panels) LC-MS/MS setups for all executed strategies. e Volcano plots of the log2 fold change against the −log p-value for t-testing results between phenotype 1 and 2 for combined data for all executed strategies (two-sample Student's t-test, p-value < 0.05). f Number and overlap of significant proteins (two-sample Student's t-test, p-value < 0.05) identified in t-testing between phenotype 1 and 2 for individual experiments and combined data for all executed strategies. g Evaluation of the suitability of the executed imputation strategies and HarmonizR as missing value handling strategies for batch effect reduction. (−): criterion not matched; (+): small improvement for the respective criterion; (++): improvement for the respective criterion; (+++): major improvement for the respective criterion.
Source data are provided as a Source Data file.
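
As an illustration of the correlation coefficients reported in panel d, the following is a minimal Python sketch of a Pearson correlation computed only over proteins quantified in both samples. The caption does not state how missing values were treated when correlating samples, so the pairwise-complete rule here is an assumption, as are the function name `pearson_pairwise` and the toy log2 intensities, which are hypothetical and not taken from the source data.

```python
import math


def pearson_pairwise(x, y):
    """Pearson r over positions where both values are present (not None/NaN).

    Assumption: proteins missing in either sample are skipped rather than imputed.
    """
    pairs = [
        (a, b)
        for a, b in zip(x, y)
        if a is not None and b is not None
        and not math.isnan(a) and not math.isnan(b)
    ]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")

    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n

    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")

    return sxy / math.sqrt(sxx * syy)


# Hypothetical log2 intensities for five proteins in two phenotype 1 samples;
# None and NaN mark proteins that were not quantified in that sample.
dda_rep1 = [20.1, 22.4, None, 18.7, 25.0]
swath_rep1 = [19.8, 22.9, 21.0, float("nan"), 24.6]

r = pearson_pairwise(dda_rep1, swath_rep1)
print(f"Pearson r over shared proteins: {r:.3f}")
```

Only three of the five toy proteins are present in both samples, so the coefficient rests on those three pairs. With real data, the number of shared proteins would be worth reporting alongside the coefficient.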
