Table 1 Two topics (topic–word distributions) selected from 200 topics learned by LDA using sentences in our database

From: Semi-supervised machine-learning classification of materials synthesis procedures

Topic: T1 (ball-)milling

Words of highest probability: P(ball) = 0.065; P(milling) = 0.051; P(h) = 0.042; P(milled) = 0.032; P(powder) = 0.031; P(mill) = 0.027

Sample sentences:

“As-received ZrB2 powder was mixed with 2 wt% B4C powder (4.5 vol%) and 1 wt% carbon (2.5 vol%) in acetone by ball milling for 24 h using WC media.”44

“The Al powder was first ball milled in an atmosphere of supra-pure hydrogen for removing the small amount of oxide film on the surface.”45

Topic: T2 sintering

Words of highest probability: P(°C) = 0.139; P(h) = 0.104; P(air) = 0.038; P(calcined) = 0.035; P(dried) = 0.028; P(K) = 0.016

Sample sentences:

“The solid product obtained was filtered, dried at 110 °C and finally calcined in air at 550 °C for 6 h at a heating rate of 1 °C/min.”46

“Finally, the solid was calcined in air from RT to 500 °C at a heating rate of 2 °C min−1 and maintained for 4 h, which led to the formation of the MgO-Al2O3 support.”47

Each topic is represented by a multinomial probability distribution over words. By interpreting the keywords (the words of highest probability), we assign a human-comprehensible label to each topic. Sample sentences from four articles44,45,46,47 are used to illustrate the different topics.
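
As a minimal sketch of how keywords can be read off a topic–word distribution, the Python snippet below ranks the words of each topic by probability. It uses only the six probabilities listed per topic in Table 1; the full LDA distributions cover the entire vocabulary, and the names `topics` and `top_words` are illustrative assumptions rather than part of the original work.

```python
# Partial topic-word distributions: only the probabilities listed in Table 1.
# A full LDA topic spans the whole vocabulary and sums to 1; the remainder
# is omitted here, so these partial values do not sum to 1.
topics = {
    "T1": {"ball": 0.065, "milling": 0.051, "h": 0.042,
           "milled": 0.032, "powder": 0.031, "mill": 0.027},
    "T2": {"°C": 0.139, "h": 0.104, "air": 0.038,
           "calcined": 0.035, "dried": 0.028, "K": 0.016},
}


def top_words(distribution, k=3):
    """Return the k words with the highest probability, most probable first."""
    ranked = sorted(distribution.items(), key=lambda item: item[1], reverse=True)
    return [word for word, _ in ranked[:k]]


for name, distribution in topics.items():
    # A human reader would inspect these keywords to choose a topic label.
    print(name, top_words(distribution))
```

Choosing the label itself (for example "(ball-)milling" for T1) remains a manual interpretation step; the code only surfaces the keywords on which that interpretation is based.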