Fig. 3: GPU memory boosting further accelerates DHS. | Nature Communications


From: A GPU-based computational framework that bridges neuron simulation and artificial intelligence


a GPU architecture and its memory hierarchy. Each GPU contains a massive number of processing units (stream processors), and the different types of memory differ in throughput. b Architecture of streaming multiprocessors (SMs). Each SM contains multiple stream processors, registers, and an L1 cache. c Applying DHS to two neurons, with four threads per neuron. During simulation, each thread executes on one stream processor. d Memory optimization strategy on the GPU. Top, thread assignment and data storage of DHS before (left) and after (right) memory boosting. Bottom, an example of a single triangularization step when simulating the two neurons in d. Processors send data requests to load the data for each thread from global memory. Without memory boosting (left), loading all the requested data takes seven transactions, plus extra transactions for intermediate results. With memory boosting (right), loading all the requested data takes only two transactions, and registers hold the intermediate results, which further improves memory throughput. e Run time of DHS (32 threads per cell), with and without memory boosting, on multiple layer 5 pyramidal models with spines. f Speedup from memory boosting on multiple layer 5 pyramidal models with spines. Memory boosting yields a 1.6-2× speedup.
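The transaction savings described in d come from memory coalescing: when the words requested by the threads of a warp fall into fewer aligned memory segments, the hardware serves them with fewer transactions. The toy counter below sketches this effect; the 128-byte segment size, 4-byte words, and 8-thread layout are illustrative assumptions, not the paper's implementation.

```python
# Toy model of GPU global-memory coalescing (illustrative assumptions:
# 128-byte transaction segments, 4-byte words, 8 threads as in the
# two-neuron example of panel d).
TRANSACTION_BYTES = 128
WORD_BYTES = 4

def transactions_needed(word_addresses):
    """Count the distinct aligned segments touched by one warp-wide load."""
    segments = {addr * WORD_BYTES // TRANSACTION_BYTES
                for addr in word_addresses}
    return len(segments)

# Scattered layout: each thread's data sits far apart (stride of 32 words),
# so every thread's request lands in a different 128-byte segment.
scattered = [tid * 32 for tid in range(8)]

# Boosted layout: data is reordered so threads read consecutive words.
contiguous = [tid for tid in range(8)]

print(transactions_needed(scattered))   # 8 separate transactions
print(transactions_needed(contiguous))  # 1 transaction
```

Reordering the per-thread data before the solve, as the memory-boosting step does, is what turns the scattered access pattern into the contiguous one.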
