The authors introduce a method for estimating the intrinsic dimension of binary datasets, that is, the minimum number of independent coordinates needed to describe the data. The algorithm can quantify global correlations in datasets with hundreds of thousands of features using only a few thousand samples, as arises, for example, in neural network representations.
- Santiago Acevedo
- Alex Rodriguez
- Alessandro Laio
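
To make the notion of intrinsic dimension for binary data concrete, the following toy sketch builds a dataset whose many features are copies of a few independent bits and recovers that number from pairwise Hamming distances. This is not the authors' algorithm: it is a simple moment-based heuristic chosen for illustration, and the function names `toy_binary_dataset` and `moment_dimension` are hypothetical. It assumes unbiased bits and equal-sized groups of duplicated features, in which case the squared mean of the distances divided by their variance equals the number of independent bits.

```python
import numpy as np


def toy_binary_dataset(n_samples, n_features, n_independent, seed=0):
    """Binary data in which every feature copies one of a few independent bits."""
    rng = np.random.default_rng(seed)
    latent = rng.integers(0, 2, size=(n_samples, n_independent), dtype=np.uint8)
    # Assign each feature to a latent bit in round-robin order.
    source = np.arange(n_features) % n_independent
    return latent[:, source]


def moment_dimension(X, n_pairs=20000, seed=0):
    """Illustrative estimate: mean^2 / variance of pairwise Hamming distances.

    Assumption (not from the source): bits are unbiased and correlated
    features come in equal-sized groups of exact copies.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    i = rng.integers(0, n, size=n_pairs)
    j = rng.integers(0, n, size=n_pairs)
    keep = i != j  # drop pairs that compare a sample with itself
    dist = np.count_nonzero(X[i[keep]] != X[j[keep]], axis=1)
    return dist.mean() ** 2 / dist.var()


if __name__ == "__main__":
    X = toy_binary_dataset(n_samples=2000, n_features=1000, n_independent=10)
    print("features:", X.shape[1])
    print("estimated independent coordinates:", moment_dimension(X))
```

In this toy setting the data have 1000 features but only 10 independent coordinates, so the estimate should land near 10 rather than 1000. Real data violate the equal-group assumption, which is why a dedicated estimator such as the one the authors propose is needed.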