Fig. 4
From: Bacteroidetes use thousands of enzyme combinations to break down glycans

Number of unique PULs as a function of the number of genomes analyzed. The number of PULs was calculated by randomly resampling an increasing number of genomes from our data set (x-axis). The resampling was performed ten times for each number of genomes; the median value is plotted on the y-axis. A second-order polynomial regression gives the trend for the two sets of values, corresponding to 0% (red) and 20% (blue) mismatch used during PUL clustering.
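
The resampling described in the caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `median_unique_puls`, the input layout (a dict mapping each genome to the set of PUL cluster labels it contains), and sampling without replacement are all hypothetical choices made for the example.

```python
import random
import statistics


def median_unique_puls(genome_puls, sizes, n_resamples=10, seed=0):
    """Median number of unique PUL clusters for each subsample size.

    genome_puls: dict mapping a genome id to the set of PUL cluster labels
                 found in that genome (hypothetical input layout).
    sizes:       subsample sizes to evaluate (the x-axis values).
    n_resamples: number of random draws per size (ten in the caption).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    genomes = list(genome_puls)
    medians = []
    for n in sizes:
        counts = []
        for _ in range(n_resamples):
            # Assumption: genomes are drawn without replacement.
            sample = rng.sample(genomes, n)
            unique = set()
            for genome in sample:
                unique |= genome_puls[genome]
            counts.append(len(unique))
        # The median over the draws is the y-axis value for this size.
        medians.append(statistics.median(counts))
    return medians
```

Running this once per clustering setting (for instance, with labels from 0% and from 20% mismatch clustering) would give the two series in the plot. A second-order trend could then be fitted to each series with a routine such as `numpy.polyfit(sizes, medians, 2)`.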