Table 4 Fairness assessment of the recommendations that each LLM identified as fair among the results generated by VAErec and VAEgan on LastFM.
From: Fairness identification of large language models in recommendation
N | LLM | User number | Model | \(\chi ^2\)@100\(\downarrow\) (Gender) | \(\chi ^2\)@100\(\downarrow\) (Age) | K.T@100\(\uparrow\) (Gender) | K.T@100\(\uparrow\) (Age)
---|---|---|---|---|---|---|---
– | – | – | VAErec | 2636.1360 | 4544.0670 | 0.3768 | 0.0782
– | – | – | VAEgan | 1670.9327 | 3007.7690 | 0.5055 | 0.2610
5 | Chatglm3 | G(79.80%) | VAErec | 770.0154 | 768.5830 | 0.5026 | 0.4466
5 | Chatglm3 | A(38.62%) | VAEgan | 1439.1733 | 1427.4130 | 0.4735 | 0.2627
5 | Llama2 | G(0.49%) | VAErec | – | 647.3035 | −0.7463 | −0.3065
5 | Llama2 | A(3.15%) | VAEgan | – | 678.5224 | −0.8080 | −0.4086
10 | Chatglm3 | G(79.70%) | VAErec | 753.3401 | 611.2097 | 0.5091 | 0.4855
10 | Chatglm3 | A(42.17%) | VAEgan | 1292.3916 | 1974.0910 | 0.4651 | 0.1896
10 | Llama2 | G(0.79%) | VAErec | – | 565.7983 | −0.7960 | −0.6053
10 | Llama2 | A(9.95%) | VAEgan | – | 759.0048 | −0.7694 | −0.3955