Table 1 Genes associated with IPD.

From: Identifying genes associated with invasive disease in S. pneumoniae by applying a machine learning approach to whole genome sequence typing data

| Gene | Length (bp) | Best matches (e-value, identity (%), database, accession number) | Information |
|---|---|---|---|
| hpp1 | 452 | 1. phtB, 9E-132, 447/465 (96%), NCBI, AF318954.1; 2. phpA, 0, 451/480 (94%), NCBI, AF340221.1 | 1. PHT proteins (aka BHV) are thought to be involved in the invasion process of pneumococci59,60. 2. The PhpA protein elicits a protective immune response against bacteremia and nasopharyngeal carriage in mice59. |
| hpp2 | 330 | Hypothetical protein (CPS), 4E-168, 328/330 (99%), NCBI, JQ653094.1 | This is a putative capsular polysaccharide biosynthesis protein. Capsular differences are known to be associated with invasive disease7. |
| hpp3 | 249 | Hypothetical protein | |
| hpp4 | 504 | phtD, 0, 504/504 (100%), NCBI, KP127799.1 | The phtD hit was part of a sequence shown to be highly conserved in invasive isolates61. |
| hpp5 | 954 | Hypothetical protein (CPS), 0, 954/954 (100%), NCBI, HE651314.1 | This is a putative capsular polysaccharide biosynthesis protein. Capsular differences are known to be associated with invasive disease7. |
| hpp6 | 996 | Hypothetical protein | |
| hpp7 | 231 | pspC, 1E-67, 167/179 (93%), NCBI, AF154043.2 | pspC was shown to be involved in the immune response to bacteremia in mice36. |
| hpp8 | 510 | Hypothetical protein | |
| hpp9 | 324 | Hypothetical protein (CPS), 2E-161, 320/324 (99%), NCBI, ADM91299.1 | This is a putative capsular polysaccharide biosynthesis protein. Capsular differences are known to be associated with invasive disease7. |
| hpp10 | 504 | pspC, 0, 502/504 (99%), NCBI, AF154022.1 | pspC was shown to be involved in the immune response to bacteremia in mice36. |
| hpp11 | 306 | Hypothetical protein | |
| hpp12 | 399 | Hypothetical protein | |
| hpp13 | 327 | Hypothetical protein (CPS), 0, 511/528 (97%), NCBI, AF316639.1 | This is a putative capsular polysaccharide biosynthesis protein. Capsular differences are known to be associated with invasive disease7. |
| ydcP_1 | 471 | Putative protease YdcP, 0, 470/471 (99%), NCBI, AFS43444.1 | YdcP is part of the U32 protease family. It is a collagenase that facilitates the breakdown of extracellular tissue structures, and it is a known virulence factor in other bacterial species62. |
| hpp14 | 519 | Hypothetical protein (CPS), 0, 509/519 (98%), NCBI, AF154022.1 | This is a putative capsular polysaccharide biosynthesis protein. Capsular differences are known to be associated with invasive disease7. |
| hpp15 (hmo) | 147 | L-lactate dehydrogenase (FMN-dependent)-like/alpha-hydroxy acid dehydrogenase, 4E-70, 147/147 (100%) | Lactate dehydrogenase was found to be an essential enzyme for pneumococcal survival in blood63. |
| hpp16 | 480 | Hypothetical protein | |
| lytB | 1977 | Putative endo-beta-N-acetylglucosaminidase, 0, 1968/1977 (99%), NCBI, AJ870414.1 | lytB codes for an endo-beta-N-acetylglucosaminidase, which is responsible for cell-wall hydrolysis and is thought to be a virulence factor27,28. |
| hpp17 | 528 | Hypothetical protein (CPS), 511/528 (97%), NCBI, JF301964.1 | This is a putative capsular polysaccharide biosynthesis protein. Capsular differences are known to be associated with invasive disease7. |
| hpp18 | 516 | pspC, 508/516 (98%), NCBI, AF154043.2 | pspC was shown to be involved in the immune response to bacteremia in mice36. |
| hpp19 | 489 | Hypothetical protein | |
| hpp20 | 387 | Hypothetical protein (partial transposase), 0, 387/387 (100%), NCBI, ADM91518.1 | Part of the mobile genetic elements of the bacterium. |
| hpp21 | 258 | Hypothetical protein | |
| hpp22 | 288 | Hypothetical protein | |
| hpp23 | 210 | Hypothetical protein | |
| hpp24 | 387 | Hypothetical protein (partial transposase), 0.0, 386/387 (99%), NCBI, CP002176 (positions 1374937–1375323) | Part of the mobile genetic elements of the bacterium. |
| hpp25 | 510 | Hypothetical protein | |
| hpp26 | 168 | Hypothetical protein | |
| hpp27 | 489 | pspC, 0, 463/490 (94%), NCBI, AF154022.1 | pspC was shown to be involved in the immune response to bacteremia in mice36. |
| lox | 1137 | Lactate oxidase (lox) gene, 0, 1001/1137 (88%), NCBI, DQ984140.3 | The lox gene is involved in bacterial niche competition and virulence in streptococci and other bacterial species30,31. |
| hpp28 | 840 | Sortase (srtA), 0, 614/740 (83%), NCBI, KX147105.1 | In Streptococcus mutans, disruption of the sortase (srtA) gene led to a decrease in adherence to and invasion of endothelial cells64. |
| hpp29 | 189 | Hypothetical protein | |
| hpp30 | 537 | Hypothetical protein | |
| hpp31 | 504 | Hypothetical protein | |
| hpp32 | 309 | Hypothetical protein | |
| hpp33 | 309 | Hypothetical protein | |
| cpsA | 1446 | cpsA (aka wzg), 0, 1446/1446 (100%), NCBI, KC522490.1 | wzg (aka cpsA) is part of the capsular polysaccharide synthesis gene locus. High expression of cpsA is associated with bacteremia in humans65. |
| bgaA | 6702 | bgaA (Beta-galactosidase BoGH2A), 6466/6704 (96%), NCBI, AF282987.1 | bgaA is hypothesized to be a pneumococcal virulence factor66 and was shown to promote resistance to immune cells in human serum67. |
| cpsA | 1446 | cpsA (aka wzg), 0, 1446/1446 (100%), NCBI, KC522492.1 | wzg (aka cpsA) is part of the capsular polysaccharide synthesis gene locus. High expression of cpsA is associated with bacteremia in humans65. |
| hpp34 | 207 | Hypothetical protein | |
| hpp35 | 573 | pspC, 0, 566/573 (99%), NCBI, AF154043.2 | pspC was shown to be involved in the immune response to bacteremia in mice36. |
| hpp36 | 684 | cpsD, 0, 682/684 (99%), NCBI, AFC94091.1 | cpsD mutations were shown to prevent the bacterium from causing bacteremia in mice68. |
| hpp37 | 840 | Hypothetical protein | |
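
The identity figures in the table take the form matches/aligned length (percent), for example 447/465 (96%). As a reading aid, the short Python sketch below recomputes the percentage from the two counts. The rounding rule it uses (round to the nearest integer, but never report 100% unless every position matches) is an assumption inferred from the values listed in the table, not a rule stated by the source. The names `percent_identity` and `check_cell` are hypothetical helpers introduced only for this illustration.

```python
import re


def percent_identity(matches: int, aligned: int) -> int:
    """Return identity as a whole percentage.

    Assumed rule (inferred from the table, not stated by the source):
    round to the nearest integer, but cap at 99 when any position
    mismatches, so that 100 is reserved for a perfect match.
    """
    if aligned <= 0 or not 0 <= matches <= aligned:
        raise ValueError("expected 0 <= matches <= aligned and aligned > 0")
    # Integer arithmetic for round-half-up, avoiding float rounding surprises.
    pct = (200 * matches + aligned) // (2 * aligned)
    if matches < aligned:
        pct = min(pct, 99)
    return pct


# Matches fragments such as "447/465 (96%)" or "470/471(99%)".
IDENTITY = re.compile(r"(\d+)\s*/\s*(\d+)\s*\((\d+)%\)")


def check_cell(cell: str) -> bool:
    """Check that the printed percentage in a table cell agrees with its counts."""
    found = IDENTITY.search(cell)
    if found is None:
        raise ValueError("no 'matches/aligned (pct%)' pattern in cell")
    matches, aligned, printed = (int(g) for g in found.groups())
    return percent_identity(matches, aligned) == printed
```

Calling `check_cell` on a cell such as the hpp1 entry for phtB compares the printed 96% against the value recomputed from 447 and 465. A cell that fails the check would point either to a transcription slip or to a different rounding convention in the tool that produced the alignment.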