Table 1 Baseline characteristics.

From: Cohort design and natural language processing to reduce bias in electronic health records research

 

| Mean ± SD, median (quartile 1, quartile 3), or N (%) | C3PO¹ (N = 520,868) | C3PO – MI/stroke (N = 198,184)² | MI/stroke Convenience Sample (N = 340,226)² | C3PO – AF (N = 174,644)² | AF Convenience Sample (N = 501,272)² |
|---|---|---|---|---|---|
| Age (years) | 48.4 ± 17.1 | 57.0 ± 10.3 | 56.2 ± 10.4 | 60.9 ± 10.0 | 61.4 ± 10.5 |
| Women | 315,577 (60.6%) | 116,448 (58.8%) | 195,039 (57.3%) | 106,279 (60.9%) | 288,334 (57.5%) |
| White | 389,755 (74.8%) | 154,712 (78.1%) | 270,002 (79.4%) | 140,746 (80.6%) | 422,266 (84.2%) |
| Black | 38,104 (7.3%) | 13,805 (7.0%) | 21,248 (6.2%) | 11,103 (6.4%) | 22,787 (4.5%) |
| Hispanic or Latino | 33,762 (6.5%) | 9401 (4.7%) | 15,142 (4.5%) | 6804 (3.9%) | 14,115 (2.8%) |
| Asian or Pacific Islander | 21,701 (4.2%) | 7807 (3.9%) | 13,219 (3.9%) | 6003 (3.4%) | 14,329 (2.9%) |
| Mixed | 27 (0.005%) | 11 (0.006%) | 24 (0.007%) | 7 (0.004%) | 23 (0.005%) |
| Other | 18,774 (3.6%) | 5716 (2.9%) | 8937 (2.6%) | 4467 (2.6%) | 9023 (1.8%) |
| Unknown | 18,745 (3.6%) | 6732 (3.4%) | 11,654 (3.4%) | 5514 (3.2%) | 18,729 (3.7%) |
| Height (cm) | 167.4 ± 10.4 | | | 166.6 ± 10.4 | 167.4 ± 10.3 |
| Weight (kg) | 78.3 ± 20.3 | | | 79.4 ± 19.5 | 79.8 ± 19.8 |
| Systolic blood pressure (mmHg) | 123 ± 17 | 126 ± 17 | 127 ± 18 | 128 ± 17 | 130 ± 19 |
| Diastolic blood pressure (mmHg) | 75 ± 10 | | | 76 ± 10 | 77 ± 11 |
| Current smoker | 27,202 (5.2%) | 14,720 (7.4%) | 12,652 (3.7%) | 14,031 (8.0%) | 22,020 (4.4%) |
| Anti-hypertensive use | 147,898 (28.4%) | 77,827 (39.3%) | 119,954 (35.3%) | 78,219 (44.8%) | 173,235 (34.6%) |
| Diabetes | 58,159 (11.2%) | 29,307 (14.8%) | 43,966 (12.9%) | 27,953 (16.0%) | 52,180 (10.4%) |
| Heart failure | 12,555 (2.4%) | | | 3334 (1.9%) | 16,786 (3.3%) |
| Myocardial infarction | 17,937 (3.4%) | | | 6641 (3.8%) | 18,260 (3.6%) |
| Total cholesterol (mg/dL) | 189 ± 39 | 195 ± 39 | 194 ± 40 | | |
| HDL cholesterol (mg/dL) | 55 ± 18 | 57 ± 18 | 57 ± 18 | | |
| Follow-up (years) | 7.2 (2.6, 12.9) | 7.3 (2.8, 11.9) | 7.4 (3.5, 11.8) | 6.5 (2.5, 11.1) | 5.4 (2.2, 9.8) |

  1. Values shown exclude missing data.
  2. Only variables relevant for each risk score (CHARGE-AF for AF, PCE for MI/stroke) are depicted.