Table 3 Ablation study.

System	Bpref	MAP	P@5	P@10	nDCG@10
Retrieval
SBERT	0.3594	0.1128	0.4640	0.4180	0.3658
TF-IDF	0.2567	0.0781	0.3320	0.3380	0.2567
BM25	0.4581	0.1313	0.2360	0.2300	0.2221
Retrieval (all above)	0.5146	0.2987	0.8680	0.8200	0.7254
Re-Ranking
Retrieval + QA	0.5205	0.3075	0.8720	0.8210	0.7298
Retrieval + AS	0.5246	0.3049	0.8680	0.8235	0.7312
Retrieval + QA + AS	0.5253	0.3089	0.8760	0.8260	0.7488

We remove components of the search engine one at a time to measure each one's effect on system performance. In the retrieval subsystem (top half), Siamese-BERT semantic retrieval (SBERT) and keyword-based retrieval (TF-IDF, BM25) each perform substantially worse on their own than the combined retriever (Retrieval, all above). In the re-ranker subsystem (bottom half), the Question–Answering (QA) and Abstractive Summarization (AS) modules each give marginal gains on most metrics, and using both together gives the highest score in every column.
Bold values indicate the top-scoring system for the given column’s metric.
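
For readers unfamiliar with the rank-based metrics in the table, the following is a minimal sketch of how P@k and nDCG@k can be computed for a single query. It is an illustration only, not the evaluation code used to produce the table: the function names and the example relevance list are hypothetical, the gain is assumed to be linear (rel / log2(rank + 1)), and the ideal ranking is taken from the judged list itself rather than from all relevant documents in the collection.

```python
import math

def precision_at_k(relevances, k):
    """Fraction of the top-k results that are relevant (relevance > 0)."""
    return sum(1 for r in relevances[:k] if r > 0) / k

def dcg_at_k(relevances, k):
    """Discounted cumulative gain with linear gain (an assumption here)."""
    # enumerate is 0-based, so rank i+1 gets discount log2(i + 2).
    return sum(r / math.log2(i + 2) for i, r in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the same judgments sorted best-first."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical relevance judgments for one query, in ranked order.
ranked = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
p5 = precision_at_k(ranked, 5)
p10 = precision_at_k(ranked, 10)
ndcg10 = ndcg_at_k(ranked, 10)
```

System-level scores such as those in the table would then be averages of these per-query values over all evaluation queries.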
