Fig. 1: Visualization of our framework.

a In the empirical experiments, a distribution of labeled quantum data \(\mathcal{D}\) undergoes a randomization process, yielding a corrupted data distribution \(\hat{\mathcal{D}}\). A training set and a test set are drawn independently from each distribution. The training sets are then fed into an optimization algorithm, which identifies the best fit for each data set individually within a family of parameterized quantum circuits \(\mathcal{F}_Q\). This process produces two hypotheses: \(f_{\mathrm{original}}\) for the original data and \(f_{\mathrm{corrupted}}\) for the corrupted data. We empirically find that both hypotheses fit their training data perfectly, leading to small training errors. At the same time, \(f_{\mathrm{original}}\) achieves a small test error, indicating good learning performance, quantified by a small generalization gap, \(\mathrm{gen}(f_{\mathrm{original}}) = \mathrm{small}\). In contrast, the randomization process causes \(f_{\mathrm{corrupted}}\) to achieve a large test error, which results in a large generalization gap, \(\mathrm{gen}(f_{\mathrm{corrupted}}) = \mathrm{large}\). b Uniform generalization bounds, as studied in this corner of the QML literature, assign the same upper bound \(g_{\mathrm{unif}}\) to the entire function family, without accounting for the specific characteristics of each individual function. Combining two findings, (1) that we have identified a hypothesis with a large empirical generalization gap and (2) that uniform generalization bounds impose an identical upper bound on all hypotheses, we conclude that any uniform generalization bound from the literature must itself be “large”, i.e., all such bounds are loose at that training data size. A loose generalization bound does not exclude the possibility of good generalization; rather, it fails to explain or predict such successful behavior.
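As a minimal formal sketch of why these two observations render the bounds loose (the risk symbols \(R\), \(\hat{R}_S\), the loss \(\ell\), and the sample size \(N\) are illustrative notation, not taken from the figure): for a hypothesis \(f\) trained on a sample \(S=\{(x_i,y_i)\}_{i=1}^{N}\), the generalization gap is the difference between expected and empirical risk,
\[
\mathrm{gen}(f) \;=\; R(f) - \hat{R}_S(f), \qquad \hat{R}_S(f) = \frac{1}{N}\sum_{i=1}^{N}\ell\bigl(f(x_i),y_i\bigr), \qquad R(f) = \mathbb{E}_{(x,y)}\bigl[\ell\bigl(f(x),y\bigr)\bigr].
\]
A uniform bound controls this gap for every member of the family at once,
\[
\sup_{f\in\mathcal{F}_Q}\, \mathrm{gen}(f) \;\le\; g_{\mathrm{unif}}(\mathcal{F}_Q, N).
\]
Assuming the bound also holds under the corrupted distribution \(\hat{\mathcal{D}}\) (as distribution-agnostic uniform bounds do), it follows that \(g_{\mathrm{unif}}(\mathcal{F}_Q, N) \ge \mathrm{gen}(f_{\mathrm{corrupted}})\), which is empirically large; in particular, the bound cannot certify the small gap observed for \(f_{\mathrm{original}}\) at this training data size.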