Fig. 6: Illustration of AutoPhrase (Shang et al., 2018).
From: Phrase-level pairwise topic modeling to uncover helpful peer responses to online suicidal crises

AutoPhrase generates phrase candidates by frequent n-gram mining and trains a random forest classifier using distant supervision (Mintz et al., 2009) from general knowledge bases (e.g., Freebase and Wikipedia). The classifier relies on (1) positive labels taken from the knowledge bases and negative labels obtained through robust sampling, and (2) a rich set of phrase-quality features. Applied to the corpus, it assigns high scores to good phrase candidates.
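
The first two stages described in the caption, candidate generation and distant labeling, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the AutoPhrase implementation: the function names `mine_frequent_ngrams` and `distant_labels`, the toy corpus, the toy knowledge base, and the uniform negative sampling are all hypothetical stand-ins. The random forest and the phrase-quality features are omitted.

```python
from collections import Counter
import random


def mine_frequent_ngrams(docs, max_n=3, min_count=2):
    """Count word n-grams up to max_n and keep those seen at least min_count times."""
    counts = Counter()
    for doc in docs:
        tokens = doc.lower().split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[" ".join(tokens[i:i + n])] += 1
    return {gram: c for gram, c in counts.items() if c >= min_count}


def distant_labels(candidates, knowledge_base, n_negative, seed=0):
    """Label candidates found in the knowledge base 1; sample the rest as noisy 0s."""
    positives = sorted(c for c in candidates if c in knowledge_base)
    pool = sorted(c for c in candidates if c not in knowledge_base)
    # Assumption: plain uniform sampling stands in for the robust sampling
    # mentioned in the caption; the pool may still contain good phrases.
    rng = random.Random(seed)
    negatives = rng.sample(pool, min(n_negative, len(pool)))
    labels = {p: 1 for p in positives}
    labels.update({n: 0 for n in negatives})
    return labels


# Toy corpus and toy knowledge base (hypothetical; the latter stands in for
# entries from a resource such as Wikipedia or Freebase).
docs = [
    "support vector machine is a classifier",
    "a support vector machine needs labeled data",
    "random forest is a classifier too",
    "a random forest needs labeled data",
]
kb = {"support vector machine", "random forest"}

candidates = mine_frequent_ngrams(docs, max_n=3, min_count=2)
labels = distant_labels(candidates, kb, n_negative=3)
```

In a fuller pipeline, the labeled candidates would be turned into feature vectors and passed to a classifier whose predicted probability serves as the phrase-quality score.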