Integrative microbiome analysis holds great promise, but combining data sets is notoriously challenging because of batch effects and high heterogeneity between studies. Here, Yuan & Wang present a microbiome data integration method that tackles these challenges using a shared dictionary learning approach, enabling robust integration while preserving biological signals.