Table 2 Characteristics of patients with medical oncologist notes for analysis.

From: Artificial intelligence-aided clinical annotation of a large multi-cancer genomic dataset

 

Column groups: Total = total number of patients and medical oncologist notes; Unlabeled notes = number of patients with unlabeled medical oncology notes and number of unlabeled medical oncology notes; Labeled notes = number of patients with labeled medical oncology notes and number of labeled medical oncology notes.

| | Total: Patients N (%) | Total: Reports N (%) | Unlabeled notes: Patients N (%) | Unlabeled notes: Reports N (%) | Labeled notes: Patients N (%) | Labeled notes: Reports N (%) |
|---|---|---|---|---|---|---|
| Total cohort | 13511 (100) | 232575 (100) | 10764 (100) | 200264 (100) | 2747 (100) | 32311 (100) |
| Sex | | | | | | |
| Male | 5561 (41) | 88755 (38) | 4088 (38) | 71790 (36) | 1473 (54) | 16965 (53) |
| Female | 7950 (59) | 143820 (62) | 6676 (62) | 128474 (64) | 1274 (46) | 15346 (47) |
| Age at next generation tumor genomic sequencing | | | | | | |
| <40 | 733 (5) | 13111 (6) | 574 (5) | 11226 (6) | 159 (6) | 1885 (6) |
| 40–49 | 1477 (11) | 26420 (11) | 1139 (11) | 21511 (11) | 338 (12) | 4909 (15) |
| 50–59 | 3297 (24) | 61142 (26) | 2616 (24) | 52618 (26) | 681 (25) | 8524 (26) |
| 60–69 | 4277 (32) | 74818 (32) | 3432 (32) | 65689 (33) | 845 (31) | 9129 (28) |
| 70–79 | 2864 (21) | 45914 (20) | 2301 (21) | 39553 (20) | 563 (20) | 6361 (20) |
| 80+ | 863 (6) | 11170 (5) | 702 (7) | 9667 (5) | 161 (6) | 1503 (5) |
| Race as recorded in the electronic health record | | | | | | |
| Asian | 439 (3) | 7914 (3) | 361 (3) | 6902 (3) | 78 (3) | 1012 (3) |
| African-American | 445 (3) | 7785 (3) | 344 (3) | 6550 (3) | 101 (4) | 1235 (4) |
| Native American | 13 (<1) | 99 (<1) | 11 (<1) | 93 (<1) | 2 (<1) | 6 (<1) |
| Pacific Islander | 4 (<1) | 123 (<1) | 4 (<1) | 123 (<1) | 0 (0) | 0 (0) |
| White | 12132 (90) | 207897 (89) | 9653 (90) | 179147 (89) | 2479 (90) | 28750 (89) |
| More than one race | 36 (<1) | 738 (<1) | 31 (<1) | 655 (<1) | 5 (<1) | 83 (<1) |
| Other/unknown | 442 (3) | 8019 (3) | 360 (3) | 6794 (3) | 82 (3) | 1225 (4) |
| Cancer type | | | | | | |
| Breast | 2382 (18) | 47595 (20) | 1972 (18) | 41462 (21) | 409 (15) | 6105 (19) |
| Colorectal | 2447 (18) | 35459 (15) | 1922 (18) | 29451 (15) | 526 (19) | 6011 (19) |
| Endometrial | 524 (4) | 6754 (3) | 524 (5) | 6754 (3) | 0 (0) | 0 (0) |
| Gastroesophageal | 1019 (8) | 16363 (7) | 1019 (9) | 16363 (8) | 0 (0) | 0 (0) |
| Head and neck | 447 (3) | 10901 (5) | 446 (4) | 10898 (5) | 0 (0) | 0 (0) |
| Leiomyosarcoma | 168 (1) | 4581 (2) | 168 (2) | 4581 (2) | 0 (0) | 0 (0) |
| Non-small cell lung | 2838 (21) | 43360 (19) | 2297 (21) | 38090 (19) | 540 (20) | 5237 (16) |
| Melanoma | 756 (6) | 19100 (8) | 754 (7) | 19064 (10) | 0 (0) | 0 (0) |
| Ovarian | 713 (5) | 19885 (9) | 713 (7) | 19885 (10) | 0 (0) | 0 (0) |
| Pancreatic | 878 (6) | 9111 (4) | 397 (4) | 5016 (3) | 485 (18) | 4173 (13) |
| Prostate | 549 (4) | 7818 (3) | 99 (<1) | 1678 (8) | 451 (16) | 6167 (19) |
| Renal cell carcinoma | 364 (3) | 4434 (2) | 93 (<1) | 756 (<1) | 271 (10) | 3680 (11) |
| Urothelial carcinoma | 426 (3) | 7214 (3) | 360 (3) | 6266 (3) | 65 (2) | 938 (3) |
| Common tumor genomic variants | | | | | | |
| TP53 mutation | 5780 (43) | 99351 (43) | 4675 (43) | 87185 (44) | 1105 (40) | 12166 (38) |
| KRAS mutation | 2993 (22) | 39485 (17) | 2161 (20) | 31814 (16) | 832 (30) | 7671 (24) |
| PIK3CA mutation | 1897 (14) | 32896 (14) | 1618 (15) | 29345 (15) | 279 (10) | 3551 (11) |
| APC mutation | 1457 (11) | 22278 (10) | 1147 (11) | 18628 (9) | 310 (11) | 3650 (11) |
| BRAF mutation | 727 (5) | 13091 (6) | 628 (6) | 12071 (6) | 99 (4) | 1020 (3) |
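
The percentages in parentheses appear to be each count divided by the corresponding "Total cohort" count in the same column, rounded to a whole number, with nonzero values below 1% written as <1. The table does not state this rule, so the following is only a minimal Python sketch under that assumption; the helper name `format_n_pct` is hypothetical and the rounding convention is inferred from the cells, not taken from the authors.

```python
def format_n_pct(count: int, column_total: int) -> str:
    """Format a count as 'N (%)' in the style the table cells appear to use.

    Assumption (inferred, not stated in the source): the percentage is
    count / column_total, rounded half up, with nonzero values under 1%
    shown as '<1'.
    """
    if column_total <= 0:
        raise ValueError("column_total must be positive")
    pct = 100 * count / column_total
    if count == 0:
        label = "0"
    elif pct < 1:
        label = "<1"
    else:
        label = str(int(pct + 0.5))  # round half up to a whole percent
    return f"{count} ({label})"


# Male patients in the total cohort (5561 of 13511 patients)
print(format_n_pct(5561, 13511))  # 5561 (41)
# Native American patients in the total cohort (13 of 13511 patients)
print(format_n_pct(13, 13511))    # 13 (<1)
# Endometrial cancer patients with labeled notes (0 of 2747 patients)
print(format_n_pct(0, 2747))      # 0 (0)
```

Recomputing cells this way is a quick consistency check on the table. It is not a statement of how the published values were produced.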