Extended Data Fig. 2: Properties of large-scale embeddings. | Nature

From: Nearest neighbours reveal fast and slow components of motor learning

a, Three auditory features computed on renditions of syllable b. This panel uses the same embedding as in Fig. 2a, but with different colours. b, Across-day change in vocalizations. This is a magnified cutout from the bottom left region of the dashed outline of Fig. 2a. The colours differ from Fig. 2a, and points from days 50–56 only are shown. c, Within-day change in vocalizations. Points from b are shown separately for three individual days and coloured according to production time within the day (early to late). Vocalizations change within a day: early vocalizations (dark green) are more similar to vocalizations from previous days (dark green points in b); late vocalizations (light blue) are more similar to vocalizations from future days (light blue in b). d, t-SNE visualizations for dense recordings from three birds (analogous to Fig. 2a). e–g, Illustration of a fictitious behaviour that undergoes distinct phases of abrupt change, no change and gradual change, and the identification of these phases on the basis of nearest-neighbour graphs. e, A low-dimensional representation of the behaviour. Each point corresponds to a behavioural rendition (for example, a syllable rendition) and is coloured according to production time. Similar renditions (for example, syllable renditions with similar spectrograms) appear near each other in this representation. The dotted ellipses mark three subsets of points corresponding to: (1) a phase of abrupt change; (2) a phase of no change; and (3) a phase of gradual change. f, Nearest-neighbour graphs for the three subsets of points in e. Points are replotted from e with different symbols, indicating whether their production times fall within the first half (squares) or second half (crosses) of the corresponding subset. Edges connect each point to its five nearest neighbours. The edge colour marks neighbouring pairs of points falling into the same (black) or different (red) halves.
Relative counts of within- and across-half edges differ according to the nature of the underlying behavioural change (histograms of edge counts). If an abrupt change in behaviour occurs between the first and second half, nearest neighbours of points in one half will all be points from the same half, and none from the other half (discontinuity). When behaviour is stationary, the neighbourhoods are maximally mixed: that is, every point has about an equal number of neighbours from the two halves. Phases of gradual change result in intermediate levels of mixing. g, Mixing matrix for the simulated data in e, analogous to Fig. 2e. Each location in the matrix corresponds to a pair of production times. Strong mixing (white) indicates a large number of nearest-neighbour edges across the two corresponding production times (as in f; stationary) and thus similar behaviour at the two times. Weak mixing (black) indicates a small number of such edges (as in f; discontinuity), and thus dissimilar behaviour. Note that such statistics on the composition of local neighbourhoods can be computed for any kind of behaviour and are invariant with respect to transformations of the data that preserve nearest neighbours, such as scaling, translation and rotation. These properties make nearest-neighbour approaches highly general.
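
The edge-counting idea in panel f can be illustrated with a minimal Python sketch. This is not the authors' implementation: the simulated 2D data, the drift sizes and the function names (`simulate`, `knn_indices`, `across_half_fraction`, `transform`) are assumptions made for illustration only. Only k = 5 and the split into first and second halves by production time come from the caption. The sketch counts the fraction of nearest-neighbour edges that cross the two halves, and applies a rotation, uniform scaling and translation to show that the neighbourhoods are unaffected.

```python
import math
import random


def simulate(kind, n=200, seed=0):
    """Hypothetical 2D renditions ordered by production time.

    kind: "abrupt" (jump at the midpoint), "gradual" (linear drift)
    or "stationary" (no change). All parameters are illustrative.
    """
    rng = random.Random(seed)
    points = []
    for t in range(n):
        if kind == "abrupt":
            centre = 0.0 if t < n // 2 else 50.0
        elif kind == "gradual":
            centre = 6.0 * t / n
        else:
            centre = 0.0
        points.append((centre + rng.gauss(0, 1), rng.gauss(0, 1)))
    return points


def knn_indices(points, k=5):
    """Brute-force k nearest neighbours (Euclidean) of every point."""
    neighbours = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours.append([j for _, j in dists[:k]])
    return neighbours


def across_half_fraction(points, k=5):
    """Fraction of neighbour edges linking the first and second half."""
    half = len(points) // 2
    across = total = 0
    for i, nbrs in enumerate(knn_indices(points, k)):
        for j in nbrs:
            total += 1
            # An edge is "across" if its endpoints lie in different halves.
            across += (i < half) != (j < half)
    return across / total


def transform(points, angle=0.7, scale=3.0, shift=(10.0, -4.0)):
    """Rotate, uniformly scale and translate the points."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (scale * (c * x - s * y) + shift[0], scale * (s * x + c * y) + shift[1])
        for x, y in points
    ]


if __name__ == "__main__":
    for kind in ("abrupt", "gradual", "stationary"):
        pts = simulate(kind)
        print(kind, round(across_half_fraction(pts), 3))
```

In this sketch, an across-half fraction near zero corresponds to the discontinuity case, a fraction near one half to the maximally mixed (stationary) case, and values in between to gradual change. Computing the same quantity for every pair of production-time windows, rather than for two halves, would give a matrix in the spirit of panel g, although the normalization used in the paper may differ.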
