Table 1 Overlap of WES and WGS data

From: The sequences of 150,119 genomes in the UK Biobank

| Annotation | WGS | WES | Intersection of WGS and WES | Unique to WES | Present WES (%) | Missing WES (%) | Present WGS (%) | Missing WGS (%) |
|---|---|---|---|---|---|---|---|---|
| Coding | 6,380,795 | 5,781,829 | 5,686,934 | 94,895 | 89.29 | 10.71 | 98.53 | 1.47 |
| Splice | 445,499 | 397,226 | 388,961 | 8,265 | 87.54 | 12.46 | 98.18 | 1.82 |
| 5′ UTR | 2,125,413 | 590,484 | 572,996 | 17,488 | 27.56 | 72.44 | 99.18 | 0.82 |
| 3′ UTR | 7,214,427 | 764,864 | 743,790 | 21,074 | 10.57 | 89.43 | 99.71 | 0.29 |
| Proximal | 249,702,570 | 6,189,465 | 5,952,145 | 237,320 | 2.48 | 97.52 | 99.91 | 0.09 |
| Intergenic | 292,259,782 | 91,836 | 83,360 | 8,476 | 0.03 | 99.97 | More than 99.99 | Less than 0.01 |

  1. Results are computed for the 109,618 samples present in both datasets and are limited to variants that are present in at least one individual in either dataset. Numbers are counts of variants found in each dataset. WGS refers to the GraphTyperHQ dataset and WES refers to a set of 200,000 WES-sequenced individuals (ref. 59). Missing and present percentages are computed relative to the number of variants in the union of the two datasets.
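
The percentage columns can be reproduced from the three count columns, since the footnote states that they are taken relative to the union of the two datasets. The following is a minimal sketch of that arithmetic using the Coding row; the class and attribute names (`OverlapRow`, `present_pct`, and so on) are hypothetical and chosen for illustration only, not taken from the paper or any accompanying software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OverlapRow:
    """Variant counts for one annotation class (illustrative names)."""

    wgs: int           # variants found in the WGS dataset
    wes: int           # variants found in the WES dataset
    intersection: int  # variants found in both datasets

    @property
    def unique_to_wes(self) -> int:
        # Variants seen in WES but not in WGS.
        return self.wes - self.intersection

    @property
    def union(self) -> int:
        # Inclusion-exclusion: shared variants are counted once.
        return self.wgs + self.wes - self.intersection

    def present_pct(self, count: int) -> float:
        # Share of the union covered by a dataset with `count` variants.
        return 100.0 * count / self.union

    def missing_pct(self, count: int) -> float:
        # Share of the union that the dataset does not contain.
        return 100.0 - self.present_pct(count)


# Counts copied from the Coding row of the table.
coding = OverlapRow(wgs=6_380_795, wes=5_781_829, intersection=5_686_934)

present_wes = coding.present_pct(coding.wes)
missing_wes = coding.missing_pct(coding.wes)
present_wgs = coding.present_pct(coding.wgs)
missing_wgs = coding.missing_pct(coding.wgs)

print(coding.unique_to_wes)   # 94895
print(coding.union)           # 6475690
print(f"{present_wes:.2f} {missing_wes:.2f}")  # 89.29 10.71
print(f"{present_wgs:.2f} {missing_wgs:.2f}")  # 98.53 1.47
```

Because the denominator is the union rather than either dataset's own total, the present and missing percentages for a given dataset always sum to 100, and the "Unique to WES" count is exactly what WGS is missing.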