Table 2 Characteristics of patients with medical oncologist notes for analysis.
From: Artificial intelligence-aided clinical annotation of a large multi-cancer genomic dataset
Characteristic | All notes: Patients N (%) | All notes: Reports N (%) | Unlabeled notes: Patients N (%) | Unlabeled notes: Reports N (%) | Labeled notes: Patients N (%) | Labeled notes: Reports N (%) |
|---|---|---|---|---|---|---|
Total cohort | 13511 (100) | 232575 (100) | 10764 (100) | 200264 (100) | 2747 (100) | 32311 (100) |
Sex | ||||||
Male | 5561 (41) | 88755 (38) | 4088 (38) | 71790 (36) | 1473 (54) | 16965 (53) |
Female | 7950 (59) | 143820 (62) | 6676 (62) | 128474 (64) | 1274 (46) | 15346 (47) |
Age at next-generation tumor genomic sequencing (years) | ||||||
<40 | 733 (5) | 13111 (6) | 574 (5) | 11226 (6) | 159 (6) | 1885 (6) |
40–49 | 1477 (11) | 26420 (11) | 1139 (11) | 21511 (11) | 338 (12) | 4909 (15) |
50–59 | 3297 (24) | 61142 (26) | 2616 (24) | 52618 (26) | 681 (25) | 8524 (26) |
60–69 | 4277 (32) | 74818 (32) | 3432 (32) | 65689 (33) | 845 (31) | 9129 (28) |
70–79 | 2864 (21) | 45914 (20) | 2301 (21) | 39553 (20) | 563 (20) | 6361 (20) |
80+ | 863 (6) | 11170 (5) | 702 (7) | 9667 (5) | 161 (6) | 1503 (5) |
Race as recorded in the electronic health record | ||||||
Asian | 439 (3) | 7914 (3) | 361 (3) | 6902 (3) | 78 (3) | 1012 (3) |
African-American | 445 (3) | 7785 (3) | 344 (3) | 6550 (3) | 101 (4) | 1235 (4) |
Native American | 13 (<1) | 99 (<1) | 11 (<1) | 93 (<1) | 2 (<1) | 6 (<1) |
Pacific Islander | 4 (<1) | 123 (<1) | 4 (<1) | 123 (<1) | 0 (0) | 0 (0) |
White | 12132 (90) | 207897 (89) | 9653 (90) | 179147 (89) | 2479 (90) | 28750 (89) |
More than one race | 36 (<1) | 738 (<1) | 31 (<1) | 655 (<1) | 5 (<1) | 83 (<1) |
Other/unknown | 442 (3) | 8019 (3) | 360 (3) | 6794 (3) | 82 (3) | 1225 (4) |
Cancer type | ||||||
Breast | 2382 (18) | 47595 (20) | 1972 (18) | 41462 (21) | 409 (15) | 6105 (19) |
Colorectal | 2447 (18) | 35459 (15) | 1922 (18) | 29451 (15) | 526 (19) | 6011 (19) |
Endometrial | 524 (4) | 6754 (3) | 524 (5) | 6754 (3) | 0 (0) | 0 (0) |
Gastroesophageal | 1019 (8) | 16363 (7) | 1019 (9) | 16363 (8) | 0 (0) | 0 (0) |
Head and neck | 447 (3) | 10901 (5) | 446 (4) | 10898 (5) | 0 (0) | 0 (0) |
Leiomyosarcoma | 168 (1) | 4581 (2) | 168 (2) | 4581 (2) | 0 (0) | 0 (0) |
Non-small cell lung | 2838 (21) | 43360 (19) | 2297 (21) | 38090 (19) | 540 (20) | 5237 (16) |
Melanoma | 756 (6) | 19100 (8) | 754 (7) | 19064 (10) | 0 (0) | 0 (0) |
Ovarian | 713 (5) | 19885 (9) | 713 (7) | 19885 (10) | 0 (0) | 0 (0) |
Pancreatic | 878 (6) | 9111 (4) | 397 (4) | 5016 (3) | 485 (18) | 4173 (13) |
Prostate | 549 (4) | 7818 (3) | 99 (<1) | 1678 (<1) | 451 (16) | 6167 (19) |
Renal cell carcinoma | 364 (3) | 4434 (2) | 93 (<1) | 756 (<1) | 271 (10) | 3680 (11) |
Urothelial carcinoma | 426 (3) | 7214 (3) | 360 (3) | 6266 (3) | 65 (2) | 938 (3) |
Common tumor genomic variants | ||||||
TP53 mutation | 5780 (43) | 99351 (43) | 4675 (43) | 87185 (44) | 1105 (40) | 12166 (38) |
KRAS mutation | 2993 (22) | 39485 (17) | 2161 (20) | 31814 (16) | 832 (30) | 7671 (24) |
PIK3CA mutation | 1897 (14) | 32896 (14) | 1618 (15) | 29345 (15) | 279 (10) | 3551 (11) |
APC mutation | 1457 (11) | 22278 (10) | 1147 (11) | 18628 (9) | 310 (11) | 3650 (11) |
BRAF mutation | 727 (5) | 13091 (6) | 628 (6) | 12071 (6) | 99 (4) | 1020 (3) |
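
The percentages in each column appear to be computed against that column's "Total cohort" count, rounded to a whole number, with nonzero values below 1% shown as "<1". The following is a minimal Python sketch of that arithmetic using counts from the "Total cohort" and "Male" rows. The helper name `pct_label` and the half-up rounding rule are assumptions for illustration; the source does not state how values were rounded.

```python
import math

# Column totals taken from the "Total cohort" row of the table.
TOTAL_PATIENTS = 13511
UNLABELED_PATIENTS = 10764
LABELED_PATIENTS = 2747


def pct_label(count: int, total: int) -> str:
    """Format a count as 'N (%)' relative to a column total.

    Assumption: whole-number percentages rounded half up, with
    nonzero values below 1% rendered as '<1' as in the table.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if count == 0:
        return f"{count} (0)"
    pct = 100 * count / total
    if pct < 1:
        return f"{count} (<1)"
    return f"{count} ({math.floor(pct + 0.5)})"


# The unlabeled and labeled groups partition the total cohort.
assert UNLABELED_PATIENTS + LABELED_PATIENTS == TOTAL_PATIENTS

# "Male" row, patient columns: total, unlabeled, labeled.
print(pct_label(5561, TOTAL_PATIENTS))
print(pct_label(4088, UNLABELED_PATIENTS))
print(pct_label(1473, LABELED_PATIENTS))
```

Because each column uses its own denominator, percentages for the same row are not expected to agree across the total, unlabeled, and labeled groups.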