Table 1 Summary of the statistical models

From: Genome-wide characterisation of Hepatitis B mutations involved in clinical outcome

| Model | Clades retained / total clades with bootstrap value > 70% | Number of parsimony-informative sites | Number of candidates | AIC | P-value | Percentage of well-predicted cases |
|---|---|---|---|---|---|---|
| Host factors (age + geography) | NA | NA | NA | 158.9 | 5.62 × 10⁻⁶ | 41.9 |
| Host factors (age + sex + age:sex) + polymerase | 13/30 | 280 | 33 | 97.9 | 9.54 × 10⁻¹⁷ | 70.4 |
| Host factors (age + sex + age:sex) + preS1/S2/S | 7/21 | 152 | 18 | 127.5 | 1.57 × 10⁻¹¹ | 48.2 |
| Host factors (age) + pc/core | 2/8 | 53 | 1 | 148.5 | 4.87 × 10⁻⁸ | 47.8 |
| Host factors (age + sex + age:sex) + X | 4/9 | 55 | 2 | 137.9 | 1.04 × 10⁻⁹ | 55.8 |
| Host factors (age + sex + age:sex) + Full genome | 13/26 | 540 | 29 | 94.6 | 2.50 × 10⁻¹⁷ | 72.0 |

  1. The first column describes the model under consideration. The second column gives the number of clades retained in the model as a fraction of the total number of clades with a bootstrap value above 70%. The third column gives the number of parsimony-informative sites (i.e. sites with at least two different amino acids, the rarest of which is found in at least two sequences) in the corresponding portion of the genome. The fourth column gives the number of polymorphisms identified as associated with clinical outcome. The last three columns summarise the goodness of fit of the models: the AIC value, the P-value, and the percentage of cases correctly predicted in the cross-validation procedure.
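
The site criterion given in the note for the third column can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy alignment, and the choice to ignore gap or ambiguous characters (`-`, `X`, `?`) are assumptions made only for illustration.

```python
from collections import Counter


def is_parsimony_informative(column, ignore="-X?"):
    """Apply the criterion from the table note to one alignment column.

    A column qualifies when it holds at least two different amino acids
    and the rarest of them occurs in at least two sequences.
    Characters listed in `ignore` are skipped (an assumption of this sketch).
    """
    counts = Counter(res for res in column if res not in ignore)
    return len(counts) >= 2 and min(counts.values()) >= 2


def count_informative_sites(alignment):
    """Count qualifying columns in a list of equal-length aligned sequences."""
    return sum(is_parsimony_informative(col) for col in zip(*alignment))


# Hypothetical toy alignment: 4 sequences, 4 sites.
toy_alignment = [
    "MKLA",
    "MKIA",
    "MRLA",
    "MRIS",
]

print(count_informative_sites(toy_alignment))
```

In the toy alignment, the second and third sites qualify (two residues, each seen twice), while the first is invariant and the fourth carries a residue seen in only one sequence.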