Fig. 3: Effect of exponential concentration on training and generalization performance.

We consider a tensor product encoding for an engineered data set in which each input component is drawn uniformly from [0, 2π] and the true label is \(y_{\mathrm{true}}(\boldsymbol{x}) = \sum_{i=1}^{N_s} w_i \, \kappa^{\mathrm{FQ}}(\boldsymbol{x}_i, \boldsymbol{x})\), where each weight \(w_i\) is drawn uniformly from [0, 1]. We train on \(N_s = 150\) data points. In the main plot, the loss on a test dataset \(\mathcal{S}_{\mathrm{test}}\), relative to its initial value (without training), is plotted as a function of the number of training data points. In the inset, the absolute training error is plotted as a function of the number of training data points. Each kernel value is estimated with N = 1000 measurement shots, and the test set contains 20 data points. Training is performed without regularization (λ = 0). The experiment is repeated 10 times; the solid curves represent the averages of the respective losses and the shaded areas represent standard deviations.
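
For reference, the sketch below illustrates one way such an engineered data set could be generated. It is a minimal illustration, not the authors' implementation: the single-qubit rotation encoding (under which the tensor-product fidelity kernel factorizes as \(\kappa^{\mathrm{FQ}}(\boldsymbol{x}, \boldsymbol{x}') = \prod_j \cos^2((x_j - x'_j)/2)\)), the feature dimension d = 8, the random seed, and the binomial model of shot noise are all assumptions not fixed by the caption.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility (assumption)

def fq_kernel(x1, x2):
    """Exact fidelity quantum kernel for a tensor-product encoding.

    Assumes each feature is embedded by a single-qubit rotation, so the
    kernel factorizes as kappa^FQ(x, x') = prod_j cos^2((x_j - x'_j)/2).
    """
    return np.prod(np.cos((x1 - x2) / 2.0) ** 2)

def estimated_kernel(x1, x2, shots=1000):
    """Finite-shot estimate of the kernel (N = 1000 in the figure).

    Models the overlap measurement as a binomial trial whose success
    probability equals the exact kernel value.
    """
    return rng.binomial(shots, fq_kernel(x1, x2)) / shots

# Engineered data set: N_s inputs with d features drawn from [0, 2*pi];
# d (the number of features/qubits) is an assumption, not given in the caption.
N_s, d = 150, 8
X = rng.uniform(0.0, 2.0 * np.pi, size=(N_s, d))
w = rng.uniform(0.0, 1.0, size=N_s)

# Labels y_true(x_j) = sum_i w_i * kappa^FQ(x_i, x_j), built from the exact
# kernel via the Gram matrix K.
K = np.prod(np.cos((X[:, None, :] - X[None, :, :]) / 2.0) ** 2, axis=-1)
y_true = K @ w
```

In training, each Gram-matrix entry would be replaced by its finite-shot estimate (e.g., via `estimated_kernel`), which is the source of the concentration effect the figure examines.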