Table 1 Data used in training and testing.
| Data type | Data source | Training data volume | Processing method | Test data volume |
|---|---|---|---|---|
| Educational resource data | MOOC platform | 6,500 | Missing-value imputation (mean filling), denoising | 1,625 |
| Educational resource data | National wisdom education platform for higher education (https://www.chinaooc.com.cn/) | 3,300 | Feature scaling, sample balancing | 825 |
| Teaching mode data | Higher Education Student Information Network of China | 100,000 | Standardization, feature scaling | 25,000 |
| Teaching mode data | China Higher Education Quality Monitoring Platform (https://udb.eqea.edu.cn/passport/portal/index.html) | 60,000 | Stratified sampling, regularization | 15,000 |
| Student behavior data | Local Ministry of Education public records (http://www.moe.gov.cn/) | 59,050 | Outlier handling, smoothing, data merging | 14,762 |