Fig. 1: Overview of visiPAM.
From: Zero-shot visual reasoning through probabilistic analogical mapping

VisiPAM contains two core components: a vision module and a reasoning module. The vision module receives visual inputs in the form of either 2D images or point-cloud representations of 3D objects, and uses deep learning components to extract structured visual representations. These representations take the form of attributed graphs, in which both the nodes o1..N (corresponding to object parts) and the edges r1..N(N−1) (corresponding to spatial relations between parts) are assigned attributes. The reasoning module then uses Probabilistic Analogical Mapping (PAM) to identify a mapping M from the nodes of the source graph G to the nodes of the target graph G′, based on the similarity of mapped nodes and edges. Mappings are probabilistic, but subject to a soft isomorphism constraint (a preference for one-to-one correspondences).
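The mapping step described above can be illustrated with a minimal sketch. This is not the authors' PAM implementation; it assumes node attributes are given as feature vectors, scores candidate correspondences by cosine similarity, and uses Sinkhorn normalization (an assumed stand-in for PAM's inference) to produce a soft assignment matrix that favors one-to-one correspondences. The function name `soft_mapping` and the `temperature` parameter are hypothetical.

```python
import numpy as np

def soft_mapping(source_nodes, target_nodes, n_iters=50, temperature=0.1):
    """Return a soft assignment matrix M (rows: source nodes, cols: target nodes).

    Alternating row/column normalization (Sinkhorn iterations) pushes M
    toward a doubly stochastic matrix, implementing a soft preference
    for one-to-one correspondences between graph nodes.
    """
    # Cosine similarity between node attribute vectors.
    s = source_nodes / np.linalg.norm(source_nodes, axis=1, keepdims=True)
    t = target_nodes / np.linalg.norm(target_nodes, axis=1, keepdims=True)
    sim = s @ t.T
    # Exponentiate (low temperature sharpens the assignment), then
    # alternately normalize rows and columns.
    M = np.exp(sim / temperature)
    for _ in range(n_iters):
        M /= M.sum(axis=1, keepdims=True)  # each source node maps with weight ~1
        M /= M.sum(axis=0, keepdims=True)  # each target node is claimed with weight ~1
    return M

# Toy example: 3 source parts and 3 target parts with 4-d attributes,
# where the target parts are noisy copies of the source parts.
rng = np.random.default_rng(0)
src = rng.normal(size=(3, 4))
tgt = src + 0.05 * rng.normal(size=(3, 4))
M = soft_mapping(src, tgt)
```

Because each target part is a near-copy of one source part, the largest entry in each row of `M` falls on the matching part, while the doubly stochastic structure suppresses many-to-one mappings.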