Table 1 Mean total number of ROH (NROH) and mean total sum of ROH (SROH) per individual in the 3.5KJPNv2 dataset, after applying minimum ROH length thresholds of 100 Kb and 1.5 Mb

From: Profiling of runs of homozygosity from whole-genome sequence data in Japanese biobank

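| Tools | Regions and parameter options | ROH metrics | Mean value, minimal ROH length >100 Kb | Mean value, minimal ROH length >1.5 Mb |
| --- | --- | --- | --- | --- |
| BCFtools | All variant sites | NROH | 2190 | 8.37 |
| BCFtools | All variant sites | SROH | 582,285,742 | 23,464,275 |
| BCFtools | Only SNP array-based sites*a | NROH | 194 | 10.16 |
| BCFtools | Only SNP array-based sites*a | SROH | 127,461,992 | 29,879,304 |
| PLINK | All variant sites/Het_1 | NROH | 2739 | 4.68 |
| PLINK | All variant sites/Het_1 | SROH | 594,581,242 | 12,701,591 |
| PLINK | All variant sites/Het_2 | NROH | 3083 | 6.59 |
| PLINK | All variant sites/Het_2 | SROH | 720,861,960 | 17,901,917 |
| PLINK | All variant sites/Het_3 | NROH | 3302 | 8.53 |
| PLINK | All variant sites/Het_3 | SROH | 800,370,778 | 22,679,954 |
| PLINK | All variant sites/Het_4 | NROH | 3471 | 10.06 |
| PLINK | All variant sites/Het_4 | SROH | 858,847,627 | 26,260,213 |
| PLINK | Only SNP array-based sites/Het_1*a | NROH | 697 | 10.94 |
| PLINK | Only SNP array-based sites/Het_1*a | SROH | 318,169,943 | 30,192,850 |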

  1. The PLINK “--homozyg-window-het” option was set to values from 1 to 4, i.e., allowing from one to four heterozygous calls per window. These settings are abbreviated as “Het_1”, “Het_2”, “Het_3”, and “Het_4”, respectively.
  2. *a Although SNP array data were unavailable, we trimmed the whole genome sequencing (WGS) data to restrict the analysis to the regions covered by the OmniExpressExome array.
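
To make the two metrics concrete, the following is a minimal sketch of how mean NROH and mean SROH per individual could be computed from a list of ROH segments at a given minimal length threshold. It is not the authors' pipeline. The segment format `(sample, start, end)`, the function name `roh_summary`, and the toy data are hypothetical. The sketch also assumes that lengths are in base pairs, that 1 Kb is 1,000 bp, that the thresholds are strict (">"), and that individuals with no qualifying ROH contribute zero to the mean.

```python
from collections import defaultdict


def roh_summary(segments, samples, min_length_bp):
    """Return (mean NROH, mean SROH) across `samples`.

    segments: iterable of (sample, start_bp, end_bp) tuples (hypothetical format).
    samples: every individual in the cohort, including those without any ROH.
    min_length_bp: only ROH strictly longer than this are counted.
    """
    nroh = defaultdict(int)  # number of qualifying ROH per individual
    sroh = defaultdict(int)  # summed length of qualifying ROH per individual
    for sample, start, end in segments:
        length = end - start
        if length > min_length_bp:
            nroh[sample] += 1
            sroh[sample] += length
    n = len(samples)
    mean_nroh = sum(nroh[s] for s in samples) / n
    mean_sroh = sum(sroh[s] for s in samples) / n
    return mean_nroh, mean_sroh


# Toy data, invented for illustration only.
segments = [
    ("ind1", 1_000_000, 1_150_000),  # 150 Kb
    ("ind1", 5_000_000, 7_000_000),  # 2 Mb
    ("ind2", 2_000_000, 2_050_000),  # 50 Kb, below both thresholds
    ("ind2", 9_000_000, 9_300_000),  # 300 Kb
]
samples = ["ind1", "ind2"]

for label, threshold in [(">100 Kb", 100_000), (">1.5 Mb", 1_500_000)]:
    mean_nroh, mean_sroh = roh_summary(segments, samples, threshold)
    print(label, mean_nroh, mean_sroh)
```

Passing the full sample list separately, rather than inferring it from the segments, keeps individuals without any ROH above the threshold in the denominator. This matters at the 1.5 Mb threshold, where many individuals may carry few or no long ROH.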