Table 2 The effects of sample size upon clustering by STRUCTURE for three different evolutionary histories: 100/400/800, 100/600/800 and 100/700/800 (Figure)

From: The computer program STRUCTURE does not reliably identify the main genetic clusters within species: simulations and implications for human population structure

 

| Evolutionary history | NC | ND = 25 | ND = 50 | ND = 100 |
|---|---|---|---|---|
| 100/400/800 | 25 | 1.00, 1.00, 0.66, 0.00 | 1.00, 1.00, 0.82, 0.00 | 1.00, 1.00, 0.85, 0.00 |
| | 50 | 1.00, 1.00, 0.33, 0.00 | 1.00, 1.00, 0.99, 0.00 | 1.00, 1.00, 0.98, 0.00 |
| | 100 | 1.00, 1.00, 0.00, 0.93 | 1.00, 1.00, 1.00, 0.00 | 1.00, 1.00, 1.00, 0.00 |
| 100/600/800 | 25 | 1.00, 1.00, 0.24, 0.00 | 1.00, 1.00, 0.56, 0.00 | 1.00, 1.00, 0.72, 0.00 |
| | 50 | 1.00, 1.00, 0.00, 0.01 | 1.00, 1.00, 0.35, 0.00 | 1.00, 1.00, 0.85, 0.00 |
| | 100 | 1.00, 1.00, 0.00, 0.64 | 1.00, 1.00, 0.00, 0.99 | 1.00, 1.00, 1.00, 0.00 |
| 100/700/800 | 25 | 1.00, 1.00, 0.00, 0.00 | 1.00, 1.00, 0.39, 0.00 | 1.00, 1.00, 0.65, 0.00 |
| | 50 | 1.00, 1.00, 0.00, 0.21 | 1.00, 1.00, 0.00, 0.00 | 1.00, 1.00, 0.80, 0.00 |
| | 100 | 1.00, 1.00, 0.00, 0.65 | 1.00, 1.00, 0.00, 0.83 | 1.00, 1.00, 1.00, 0.00 |

  1. NC and ND are the sample sizes for populations C and D, respectively. Populations A and B had a sample size of 50 individuals in all simulations. Each entry in the table gives, for samples from populations A, B, C and D in that order, the proportion of genetic ancestry that was assigned to one of the two clusters, with the remainder assigned to the other. The evolutionarily appropriate clustering arrangement is 1.00, 1.00, 1.00, 0.00, which indicates that all the genes from populations A, B and C were assigned to one cluster, and all the genes from population D were assigned to the second cluster.
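
The note above defines the evolutionarily appropriate arrangement as 1.00, 1.00, 1.00, 0.00. As a minimal sketch, not part of the original article, the table entries can be checked against that arrangement programmatically. The names `EXPECTED`, `TABLE_2` and `matching_cells` are hypothetical, and the values are simply transcribed from the table, keyed by (NC, ND).

```python
# Sketch only: EXPECTED, TABLE_2 and matching_cells are hypothetical names
# introduced for illustration; they do not come from the original article.

# Evolutionarily appropriate arrangement for populations A, B, C, D.
EXPECTED = (1.00, 1.00, 1.00, 0.00)

# Values transcribed from Table 2.
# Outer key: evolutionary history. Inner key: (NC, ND).
# Each tuple holds the proportions for populations A, B, C and D, in order.
TABLE_2 = {
    "100/400/800": {
        (25, 25): (1.00, 1.00, 0.66, 0.00),
        (25, 50): (1.00, 1.00, 0.82, 0.00),
        (25, 100): (1.00, 1.00, 0.85, 0.00),
        (50, 25): (1.00, 1.00, 0.33, 0.00),
        (50, 50): (1.00, 1.00, 0.99, 0.00),
        (50, 100): (1.00, 1.00, 0.98, 0.00),
        (100, 25): (1.00, 1.00, 0.00, 0.93),
        (100, 50): (1.00, 1.00, 1.00, 0.00),
        (100, 100): (1.00, 1.00, 1.00, 0.00),
    },
    "100/600/800": {
        (25, 25): (1.00, 1.00, 0.24, 0.00),
        (25, 50): (1.00, 1.00, 0.56, 0.00),
        (25, 100): (1.00, 1.00, 0.72, 0.00),
        (50, 25): (1.00, 1.00, 0.00, 0.01),
        (50, 50): (1.00, 1.00, 0.35, 0.00),
        (50, 100): (1.00, 1.00, 0.85, 0.00),
        (100, 25): (1.00, 1.00, 0.00, 0.64),
        (100, 50): (1.00, 1.00, 0.00, 0.99),
        (100, 100): (1.00, 1.00, 1.00, 0.00),
    },
    "100/700/800": {
        (25, 25): (1.00, 1.00, 0.00, 0.00),
        (25, 50): (1.00, 1.00, 0.39, 0.00),
        (25, 100): (1.00, 1.00, 0.65, 0.00),
        (50, 25): (1.00, 1.00, 0.00, 0.21),
        (50, 50): (1.00, 1.00, 0.00, 0.00),
        (50, 100): (1.00, 1.00, 0.80, 0.00),
        (100, 25): (1.00, 1.00, 0.00, 0.65),
        (100, 50): (1.00, 1.00, 0.00, 0.83),
        (100, 100): (1.00, 1.00, 1.00, 0.00),
    },
}


def matching_cells(history):
    """Return the sorted (NC, ND) pairs whose entry equals EXPECTED exactly."""
    cells = TABLE_2[history]
    return sorted(key for key, value in cells.items() if value == EXPECTED)


if __name__ == "__main__":
    for history in TABLE_2:
        print(history, matching_cells(history))
```

Under this exact-match criterion, only (NC, ND) = (100, 50) and (100, 100) match for history 100/400/800, and only (100, 100) matches for 100/600/800 and 100/700/800. Entries such as 0.99 or 0.98 for population C are counted as non-matching here. A tolerance could be added, but its size would be a further assumption.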