Figure 5

Benchmarks of (a) the demagnetization field and (b) the exchange field for different system sizes N on an Intel(R) Xeon 6326 CPU @ 2.90 GHz with one NVIDIA A100 80GB GPU (CUDA Driver 11.8). Each field term is timed as the average over 10000 evaluations. Before the measurement begins, 1000 warm-up loops are run to ensure that the GPU has reached its maximum performance state. Single-precision arithmetic is used for comparison with mumax3.
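
The timing procedure described in the caption (warm-up loops followed by an averaged measurement) can be sketched as below. This is a minimal illustration, not the benchmark code behind the figure: the helper name `benchmark`, its parameters, and the optional `sync` callback are all hypothetical. On a GPU, kernel launches are typically asynchronous, so the sketch assumes that the caller passes a device-synchronization function as `sync`. Without one, the clock may be read before the queued work has finished.

```python
import time


def benchmark(fn, n_warmup=1000, n_eval=10000, sync=None):
    """Return the mean wall-clock time per call of fn, in seconds.

    fn   -- zero-argument callable that evaluates one field term (hypothetical)
    sync -- optional zero-argument callable that blocks until the device
            is idle (assumption: needed for asynchronous GPU back ends)
    """
    # Warm-up: untimed calls so clocks, caches, and any lazy
    # initialization settle before the measurement starts.
    for _ in range(n_warmup):
        fn()
    if sync is not None:
        sync()  # make sure warm-up work is finished before starting the clock

    start = time.perf_counter()
    for _ in range(n_eval):
        fn()
    if sync is not None:
        sync()  # wait for all timed work before stopping the clock
    elapsed = time.perf_counter() - start

    # Average over the timed evaluations only.
    return elapsed / n_eval
```

The defaults mirror the 1000 warm-up loops and 10000 evaluations stated in the caption. Timing the whole loop and dividing once, rather than timing each call separately, keeps the timer overhead out of the per-evaluation figure.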