Table 1 Baseline characteristics of the dataset

From: Development of a respiratory virus risk model with environmental data based on interpretable machine learning methods

| Characteristics | Training set (n = 15329) | Test set (n = 3832) |
| --- | --- | --- |
| Age (years) | 2 (IQR: 3.07) | 2 (IQR: 3.17) |
| Sex | | |
| Male | 9195 (60%) | 2286 (59.7%) |
| Female | 6031 (39.3%) | 1496 (39.0%) |
| Unknown | 103 (0.7%) | 50 (1.3%) |
| Season | | |
| Spring | 2335 (15.2%) | 867 (22.6%) |
| Summer | 3625 (23.7%) | 1698 (44.3%) |
| Autumn | 5383 (35.1%) | 728 (19%) |
| Winter | 3925 (26%) | 539 (14.1%) |
| Holiday | | |
| Yes | 3473 (22.7%) | 764 (20.0%) |
| No | 11856 (77.3%) | 3068 (80.0%) |
| Air quality | | |
| AQI | 46.37 (IQR: 26.98) | 39.89 (IQR: 17.58) |
| CO (mg/m³) | 0.69 (IQR: 0.24) | 0.70 (IQR: 0.24) |
| NO2 (µg/m³) | 26.16 (IQR: 18.76) | 26.82 (IQR: 18.90) |
| O3 (µg/m³) | 56.95 (IQR: 33.87) | 49.84 (IQR: 27.48) |
| PM10 (µg/m³) | 45.04 (IQR: 33.00) | 37.55 (IQR: 19.94) |
| PM2.5 (µg/m³) | 25.82 (IQR: 20.34) | 21.76 (IQR: 12.93) |
| SO2 (µg/m³) | 6.86 (IQR: 3.90) | 6.32 (IQR: 2.92) |
| Environmental factors | | |
| Air temperature (°C) | 23.355 (IQR: 1.05) | 26.367 (IQR: 0.87) |
| Dew point temperature (°C) | 17.907 (IQR: 1.25) | 22.266 (IQR: 0.83) |
| Wind direction (°) | 161.19 (IQR: 55.41) | 164.50 (IQR: 48.44) |
| Wind speed rate (m/s) | 30.79 (IQR: 15.91) | 26.15 (IQR: 13.70) |
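
The percentages for the categorical rows of Table 1 can be recomputed from the printed counts and set sizes. The following is a minimal sketch of that arithmetic, not the authors' code: the counts are copied from the table, while the names `TOTALS`, `COUNTS`, `percent`, `summarize`, and `group_sums` are hypothetical helpers introduced here for illustration.

```python
# Set sizes as printed in the Table 1 column headers.
TOTALS = {"train": 15329, "test": 3832}

# Counts copied from Table 1 as (training set, test set) pairs.
COUNTS = {
    "Sex": {
        "Male": (9195, 2286),
        "Female": (6031, 1496),
        "Unknown": (103, 50),
    },
    "Season": {
        "Spring": (2335, 867),
        "Summer": (3625, 1698),
        "Autumn": (5383, 728),
        "Winter": (3925, 539),
    },
    "Holiday": {
        "Yes": (3473, 764),
        "No": (11856, 3068),
    },
}


def percent(count, total):
    """Share of `count` in `total`, as a percentage rounded to one decimal."""
    return round(100.0 * count / total, 1)


def summarize(counts=COUNTS, totals=TOTALS):
    """Return (group, level, train %, test %) for every categorical level."""
    rows = []
    for group, levels in counts.items():
        for level, (n_train, n_test) in levels.items():
            rows.append(
                (
                    group,
                    level,
                    percent(n_train, totals["train"]),
                    percent(n_test, totals["test"]),
                )
            )
    return rows


def group_sums(counts=COUNTS):
    """Sum the counts within each group, to compare against the set sizes."""
    return {
        group: (
            sum(n_train for n_train, _ in levels.values()),
            sum(n_test for _, n_test in levels.values()),
        )
        for group, levels in counts.items()
    }


if __name__ == "__main__":
    for group, level, p_train, p_test in summarize():
        print(f"{group:8s} {level:8s} {p_train:5.1f}% {p_test:5.1f}%")
    print(group_sums())
```

Comparing `group_sums()` with `TOTALS` is a quick consistency check. With the counts as printed, the Sex and Holiday rows sum to the set sizes in both columns, whereas the training-set Season counts sum to 15268 rather than 15329; the table does not state the reason for this difference.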