Table 8. Performance improvements across datasets, alongside project-specific characteristics (file count and complexity).
| Dataset | File count | Complexity | Baseline F1 | Our F1 | F1 improvement | Our AUC | AUC gain |
|---|---|---|---|---|---|---|---|
| Apache Ant | 1,248 | Low | 0.755 | 0.811 | +7.4% | 0.889 | +6.8% |
| Eclipse JDT | 2,156 | Medium | 0.741 | 0.808 | +9.0% | 0.892 | +7.2% |
| Apache Camel | 3,874 | High | 0.729 | 0.811 | +11.2% | 0.901 | +8.5% |
| Mozilla Firefox | 4,521 | High | 0.738 | 0.815 | +10.4% | 0.898 | +8.1% |
| Apache Hadoop | 2,893 | Medium | 0.746 | 0.809 | +8.4% | 0.894 | +7.4% |
| Linux Kernel | 8,456 | Very High | 0.732 | 0.811 | +10.8% | 0.899 | +8.3% |
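
The F1 improvement figures in the table are consistent with a relative gain over the baseline, that is, (Our F1 − Baseline F1) / Baseline F1. The table does not state this definition, so the sketch below treats it as an assumption and checks it against the reported values. The helper name `relative_gain_pct` is hypothetical, and the AUC gain column is not checked because the table gives no baseline AUC.

```python
# Rows copied from Table 8: (dataset, baseline F1, our F1, reported F1 improvement in %).
ROWS = [
    ("Apache Ant", 0.755, 0.811, 7.4),
    ("Eclipse JDT", 0.741, 0.808, 9.0),
    ("Apache Camel", 0.729, 0.811, 11.2),
    ("Mozilla Firefox", 0.738, 0.815, 10.4),
    ("Apache Hadoop", 0.746, 0.809, 8.4),
    ("Linux Kernel", 0.732, 0.811, 10.8),
]


def relative_gain_pct(baseline: float, ours: float) -> float:
    """Relative improvement over the baseline, in percent (assumed definition)."""
    return (ours - baseline) / baseline * 100.0


if __name__ == "__main__":
    for name, baseline, ours, reported in ROWS:
        computed = relative_gain_pct(baseline, ours)
        # Compare the recomputed gain with the table at one-decimal precision.
        matches = round(computed, 1) == reported
        print(f"{name:16s} computed={computed:5.2f}%  reported={reported:4.1f}%  match={matches}")
```

Under this reading, an absolute F1 difference of roughly 0.06 to 0.08 corresponds to the reported 7% to 11% gains, which matters when comparing these figures against papers that report absolute differences.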