Fig. 4: Demonstration of computing loss reduction resulting from one split in a decision tree.

This reduction value for a given feature is summed all over a single tree, then gets averaged over all trees. y, \(\hat{y}\), and x1 denote the target variable, predicted value, and selected feature for splitting, respectively.