Supplementary Figure 6: Detailed explanation of how ChIP-seq peaks were divided into training and testing data for each experiment.
From: Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning

The ChIP-seq performance results from Figure 3e are reproduced at left, with extra annotations for clarity. At right is the breakdown of ChIP-seq peaks used to train a model on each ChIP experiment. We train each method on peaks labeled A (“top 500 odd”), then test each method on peaks labeled B (“top 500 even”). DeepBind* is a special case in which we show that including the lower-ranked peaks labeled C (“all remaining peaks”) in the training set can significantly improve accuracy when scoring the top-ranked peaks labeled B.
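
The A/B/C partition described above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the peaks are already sorted from strongest to weakest, that ranks are 1-based, and that "top 500 odd" and "top 500 even" refer to the odd- and even-ranked peaks among the top 1,000. The function name `split_peaks`, the parameter `n_per_set`, and the placeholder peak identifiers are hypothetical.

```python
def split_peaks(ranked_peaks, n_per_set=500):
    """Split a ranked peak list into sets A, B and C.

    ranked_peaks: peaks sorted from strongest to weakest (assumed).
    Returns (A, B, C), where A holds the odd-ranked and B the even-ranked
    peaks among the top 2 * n_per_set, and C holds all remaining peaks.
    """
    top = ranked_peaks[: 2 * n_per_set]
    set_a = top[0::2]  # 1-based ranks 1, 3, 5, ... ("top 500 odd")
    set_b = top[1::2]  # 1-based ranks 2, 4, 6, ... ("top 500 even")
    set_c = ranked_peaks[2 * n_per_set:]  # "all remaining peaks"
    return set_a, set_b, set_c


# Placeholder data: 1,500 peak identifiers named by their 1-based rank.
peaks = [f"peak_{rank}" for rank in range(1, 1501)]
set_a, set_b, set_c = split_peaks(peaks)

# Standard setting: train on A, test on B.
standard_train = set_a
# DeepBind* setting: train on A plus C, still test on B.
deepbind_star_train = set_a + set_c
test_set = set_b
```

In both settings the test peaks (B) are disjoint from the training peaks, so adding C enlarges the training set without leaking test examples.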