Extended Data Fig. 2: Extent of sharing between immunoglobulin clonotypes belonging to HIP1–HIP3.
From: High frequency of shared clonotypes in human B cell receptor repertoires

a, Normalized frequency histogram of HCDR3 sequence lengths belonging to V3J clonotypes from HIP1+2+3all (blue filled curve, n = 30,156,947 unique HCDR3s, with a median length of 16 amino acids) and HIP1+2+3shared (grey bins, n = 22,934 unique HCDR3s, with a median length of 13 amino acids). The medians differed significantly (two-tailed Mann–Whitney U-test, P < 2.2 × 10⁻¹⁶, α = 0.05). b, Normalized frequency histogram of HCDR3 lengths belonging to all V3DJ clonotypes from HIP1 (n = 1,750,325 unique HCDR3s, with a median length of 19 amino acids), HIP2 (n = 3,889,527 unique HCDR3s, with a median length of 19 amino acids) and HIP3 (n = 1,437,339 unique HCDR3s, with a median length of 19 amino acids). c, Cumulative distribution of normalized VDJ triple frequencies used for simulation. HIP1, n = 4,371 unique VDJ triples; HIP2, n = 4,346 unique VDJ triples; and HIP3, n = 4,370 unique VDJ triples. d, Log–log plot of experimental versus synthetic HCDR3 length frequencies. Pearson correlation coefficient r = 1.00, P < 2.2 × 10⁻¹⁶ (α = 0.05; n = 26 CDR3 length bins for each set). e, Normalized frequency histogram of V3DJ overlap counts among all three synthetic HIP distributions (n = 3,641 common clonotypes between sequenced repertoires). f, V3J clonotypes with the largest numbers of somatic variants. Numbers in parentheses denote the number of unique somatic variants associated with each V3J clonotype in HIP1, HIP2 and HIP3. g, Percentage overlaps for Igκ V3J clonotypes from the experimentally determined repertoires belonging to HIP1–HIP3. h, Percentage overlaps for Igλ V3J clonotypes from the experimentally determined repertoires belonging to HIP1–HIP3.
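
As an illustration of the quantities plotted in panels a and b, the following is a minimal Python sketch of how a normalized frequency histogram and a median length might be computed from a list of HCDR3 amino acid sequences. It is not the authors' pipeline. The function names `normalized_length_histogram` and `median_length` and the toy sequences are hypothetical, and the sketch assumes that lengths are counted over unique HCDR3 sequences, as the "unique HCDR3s" wording of the legend suggests.

```python
from collections import Counter
from statistics import median


def normalized_length_histogram(hcdr3s):
    """Map each HCDR3 length to the fraction of unique sequences with that length."""
    unique = set(hcdr3s)  # assumption: duplicates are collapsed before counting
    counts = Counter(len(seq) for seq in unique)
    total = len(unique)
    return {length: counts[length] / total for length in sorted(counts)}


def median_length(hcdr3s):
    """Median length, in amino acids, over unique HCDR3 sequences."""
    return median(len(seq) for seq in set(hcdr3s))


# Toy, made-up sequences for illustration only (not data from the study).
toy_hcdr3s = ["ARDYW", "ARGGYW", "ARDYW", "ARDLGYFDYW", "ARGSYW"]

hist = normalized_length_histogram(toy_hcdr3s)
for length, fraction in hist.items():
    print(f"length {length}: {fraction:.2f}")
print("median length:", median_length(toy_hcdr3s))
```

Because the histogram is normalized by the number of unique sequences, its values sum to 1, which is what allows sets of very different sizes (such as the full and shared sets in panel a) to be compared on one axis.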