Table 4. Retrieval performance of the different methods at K = 5 and K = 10.

K	Method	Overall Recall	Overall MAP	Overall NDCG	Multiple Recall	Multiple MAP	Multiple NDCG	Single Recall	Single MAP	Single NDCG
5	Native RAG	54.6 ± 1.1	52.5 ± 0.9	62.9 ± 1.3	26.9 ± 1.5	42.5 ± 1.8	49.5 ± 2.1	70.6 ± 1.0	58.3 ± 0.8	70.6 ± 1.0
	Temporal Filter	49.0 ± 1.4	45.1 ± 1.2	56.8 ± 1.5	15.5 ± 1.9	23.2 ± 2.2	30.8 ± 2.5	68.5 ± 1.2	57.8 ± 1.1	70.9 ± 0.9
	Query Rewrite	55.7^† ± 1.0	53.3^† ± 0.8	64.0^† ± 1.1	29.0^† ± 1.4	44.2^† ± 1.6	51.6^† ± 1.9	71.3^† ± 0.9	58.6 ± 0.7	71.3^† ± 0.9
	Query Decomposition	61.9^† ± 0.9	56.4^† ± 0.7	71.8^† ± 1.0	45.2^† ± 1.2	52.4^† ± 1.4	72.1^† ± 1.5	71.6^† ± 0.9	58.6 ± 0.7	71.6^† ± 0.9
10	Native RAG	61.8 ± 1.0	53.5 ± 0.9	71.8 ± 1.1	35.6 ± 1.6	43.7 ± 1.7	62.6 ± 1.9	77.1 ± 0.9	59.1 ± 0.8	77.1 ± 0.9
	Temporal Filter	51.7 ± 1.3	43.7 ± 1.3	60.7 ± 1.6	17.5 ± 2.0	20.6 ± 2.4	35.5 ± 2.7	71.6 ± 1.1	57.2 ± 1.2	74.3 ± 1.0
	Query Rewrite	62.6^† ± 0.9	53.8 ± 0.8	72.3^† ± 1.0	36.7^† ± 1.5	44.3^† ± 1.6	63.2^† ± 1.8	77.7^† ± 0.8	59.3 ± 0.7	77.7^† ± 0.8
	Query Decomposition	68.2^† ± 0.8	57.6^† ± 0.7	78.9^† ± 0.9	53.9^† ± 1.3	53.1^† ± 1.5	83.2^† ± 1.4	76.5 ± 0.9	59.3 ± 0.7	76.5 ± 0.9

All metrics are percentages (%), reported as mean ± standard deviation. The best result in each column is bolded. ^† indicates a statistically significant improvement over the Native RAG baseline (p < 0.05).
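
For readers unfamiliar with the three metrics in Table 4, the following is a minimal sketch of how Recall, MAP, and NDCG at a cutoff K are commonly computed for a single query with binary relevance. It uses the standard textbook definitions, which may differ in detail from the ones used to produce the table (for example in how average precision is normalized). The function names, the document identifiers, and the assumption that a ranked list contains no duplicates are all illustrative and not taken from the source.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)


def average_precision_at_k(ranked, relevant, k):
    """Average of precision values at each rank where a relevant item occurs.

    Normalized by min(|relevant|, k), one common convention (an assumption here).
    """
    if not relevant:
        return 0.0
    hits = 0
    score = 0.0
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            hits += 1
            score += hits / rank  # precision at this rank
    return score / min(len(relevant), k)


def ndcg_at_k(ranked, relevant, k):
    """Binary-gain DCG over the top-k, divided by the DCG of an ideal ranking."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
        if doc in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical query: two relevant documents, retrieved at ranks 2 and 4.
ranked = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2"}
print(recall_at_k(ranked, relevant, 5))
print(average_precision_at_k(ranked, relevant, 5))
print(ndcg_at_k(ranked, relevant, 5))
```

MAP is then the mean of the per-query average precision, and the Recall and NDCG columns are likewise averages over queries. Multiplying by 100 gives percentages as in the table.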