Table 2 Algorithm: model-based DQN for materials design

From: Unlocking the black box beyond Bayesian global optimization for materials design using reinforcement learning

[1] Input: Design space X, initial dataset D0, GPR surrogate model M, DQN agent A, number of RL training episodes N, episode length L (number of design decisions/actions per episode; see footnote a), batch size Btrain for Q-network update, optimization type: minimization

[2] Output: Best material design x*

[3] Initialize GPR surrogate model M with D0, D = D0

[4] Initialize DQN with Q-network Q(s, a)

[5] Initialize replay buffer R

[6] while not terminated do

// Until max iterations or target performance met

[7] Train/Update GPR model M using D

[8] // Agent training stage: learn from surrogate model; train DQN for N episodes

[9] for episode = 1 to N do

// Training budget for DQN agent

[10] s = get_initial_state()

// Initialize with predefined or random state

[11] for step = 1 to L do

// Make L sequential design decisions/actions

[12] a = ε-greedy(Q(s,·), X)

// Select action within design space

[13] s’ = next_state(s, a)

// State transition from s to s'

[14] r = M(s) - M(s’)

// Get reward using GPR, assuming minimization

[15] Store (s, a, r, s’) in R

[16] Update Q(s, a) using a random batch of Btrain (s, a, r, s’) tuples from R

[17] s = s’

[18] end for

[19] end for

[20] // Design stage: propose and evaluate one new material design (see footnote b)

[21] s = get_initial_state()

// Initialize with predefined or random state

[22] for step = 1 to L do

// Make L sequential design decisions/actions

[23] a = ε-greedy(Q(s,·), X)

// Propose materials design action

[24] s’ = next_state(s, a)

// State transition from s to s'

[25] s = s’

[26] end for

[27] xnext = s

[28] ynext = f(xnext)

// Evaluate the new design

[29] D = D ∪ {(xnext, ynext)}

// Update dataset with experiment results

[30] end while

[31] return x* = argmin x∈D f(x)

  a. “Design decisions/actions” refer to the actions taken to select element compositions for one material design.
  b. The term “material design” refers to obtaining one specific alloy composition.
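
The control flow of the algorithm above can be sketched in a few dozen lines of standard-library Python. This is an illustrative sketch under several assumptions, not the authors' implementation: a tabular Q-function stands in for the DQN's Q-network, a mean-only RBF-kernel Gaussian process solved by Gaussian elimination stands in for the GPR surrogate M, and the objective `f`, the grid `LEVELS`, the three-point initial dataset, and all hyperparameters (`eps`, `lr`, `gamma`, `B`) are hypothetical. A state is a pair (step, design vector), and the action at each step sets one component of the design.

```python
import math
import random

LEVELS = (0.0, 0.25, 0.5, 0.75, 1.0)  # hypothetical discretised levels per component
L_STEPS = 3                           # episode length L (one decision per component)

def f(x):
    """Hypothetical stand-in for the expensive experiment (to be minimised)."""
    return sum((xi - ti) ** 2 for xi, ti in zip(x, (0.25, 0.75, 0.5)))

def rbf(a, b, length=0.5):
    """Squared-exponential kernel between two design vectors."""
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / (2 * length ** 2))

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            k = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= k * aug[c][j]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (aug[r][n] - sum(aug[r][j] * z[j] for j in range(r + 1, n))) / aug[r][r]
    return z

def fit_gpr(data, noise=1e-4):
    """Step [7]: return the GP posterior mean M(x) (mean-only sketch, fixed kernel)."""
    xs = [x for x, _ in data]
    mean_y = sum(y for _, y in data) / len(data)
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    alpha = solve(K, [y - mean_y for _, y in data])
    return lambda x: mean_y + sum(w * rbf(x, xi) for w, xi in zip(alpha, xs))

def next_state(s, a):
    """Steps [13]/[24]: set component `step` of the design to level `a`."""
    step, x = s
    return step + 1, x[:step] + (LEVELS[a],) + x[step + 1:]

def eps_greedy(Q, s, eps, rng):
    """Steps [12]/[23]: random action with probability eps, else greedy in Q."""
    if rng.random() < eps:
        return rng.randrange(len(LEVELS))
    return max(range(len(LEVELS)), key=lambda a: Q.get((s, a), 0.0))

def q_update(Q, s, a, r, s2, lr, gamma):
    """One tabular Q-learning update (stand-in for a Q-network gradient step)."""
    done = s2[0] == L_STEPS
    best = 0.0 if done else max(Q.get((s2, b), 0.0) for b in range(len(LEVELS)))
    q = Q.get((s, a), 0.0)
    Q[(s, a)] = q + lr * (r + gamma * best - q)

def run(iterations=8, N=100, B=8, eps=0.2, lr=0.5, gamma=0.95, seed=0):
    rng = random.Random(seed)
    s0 = (0, (0.0,) * L_STEPS)                        # predefined initial state
    D = {x: f(x) for x in [(0.0,) * 3, (1.0,) * 3, (0.5,) * 3]}  # hypothetical D0
    Q, R = {}, []                                     # steps [4] and [5]
    for _ in range(iterations):                       # step [6]
        M = fit_gpr(list(D.items()))                  # step [7]
        for _ in range(N):                            # steps [9]-[19]: training stage
            s = s0
            for _ in range(L_STEPS):
                a = eps_greedy(Q, s, eps, rng)
                s2 = next_state(s, a)
                r = M(s[1]) - M(s2[1])                # step [14]: reward, minimisation
                R.append((s, a, r, s2))
                for t in rng.sample(R, min(B, len(R))):   # step [16]
                    q_update(Q, *t, lr, gamma)
                s = s2
        s = s0                                        # steps [21]-[27]: design stage
        for _ in range(L_STEPS):
            s = next_state(s, eps_greedy(Q, s, eps, rng))
        D[s[1]] = f(s[1])                             # steps [28]-[29]
    x_best = min(D, key=D.get)                        # best observed design
    return x_best, D[x_best], D
```

Calling `run()` returns the best observed design, its measured value, and the accumulated dataset. Because the dataset is keyed by design vector, a repeated proposal overwrites its earlier entry rather than adding a duplicate row to the kernel matrix, which keeps the linear solve well conditioned in this toy setting.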