Table 1 Checklist for fixing randomness.

| Category | Item | Description |
|---|---|---|
| Data | Same data in the same order | It is essential that the model (neural network) receives the same data in the same order, so that preprocessing is the only variable component (sketched below). |
| | Representative subset | Evaluating all combinations on the full dataset would be impractical, since testing even a single combination would take 5–6 days. Rather than selecting a random subset, we chose a subset that is representative of the entire dataset's characteristics, using an algorithm we developed for this purpose (Sect. 3.1.1; a generic illustration is sketched below). |
| Model | Fixed model | For all preprocessing combinations, we used the same 3D-CNN model, described in Sect. 3.3.1. |
| | Weights and biases initialization | We always initialized weights and biases to the same values, so that every training run starts from the same point (sketched below). |
| | Fix randomness inside layers such as Dropout | Some layers are stochastic; Dropout, for example, randomly drops connections to avoid overfitting. Setting a fixed seed on such layers makes their behavior reproducible (sketched below). |
| | Set a seed for TensorFlow, Python, and other libraries | Seed TensorFlow, Python, and any other libraries that contain randomization, to eliminate the effects of all remaining random operations (sketched below). |
| Cross-validation | Mapping between test patients and validation patients | We used Leave-One-Out Cross-Validation (Sect. 3.3.1), effectively 56-fold cross-validation (one fold per patient). In each fold, one patient was excluded as the test set and three patients as the validation set; all remaining patients formed the training set. To ensure a fair comparison between combinations, a fixed mapping between test patients and validation patients was created and used for all combinations: whenever Patient X was excluded as the test set in any preprocessing combination, the same three corresponding patients were chosen for the validation set (sketched below). |
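
The sketches below illustrate the checklist items in TensorFlow/Keras. They are minimal examples under stated assumptions, not the paper's actual implementation. First, for "Same data in the same order": a fixed shuffle seed combined with `reshuffle_each_iteration=False` keeps the input order identical across epochs and runs (the `volumes`/`labels` placeholders stand in for the real preprocessed scans):

```python
import tensorflow as tf

SEED = 42  # arbitrary fixed seed, reused across all runs

# Placeholders for the real preprocessed 3D scans and their labels.
volumes = tf.random.stateless_normal((8, 32, 32, 32, 1), seed=(SEED, 0))
labels = tf.zeros((8,), dtype=tf.int32)

dataset = (
    tf.data.Dataset.from_tensor_slices((volumes, labels))
    # Fixed seed + reshuffle_each_iteration=False: the same order
    # in every epoch and in every run.
    .shuffle(buffer_size=8, seed=SEED, reshuffle_each_iteration=False)
    .batch(2)
)
```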
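For "Representative subset", the paper's own selection algorithm is described in Sect. 3.1.1 and is not reproduced here; the following is only a generic stratified-sampling illustration of the idea of preserving the dataset's characteristics (here, label proportions) rather than sampling at random:

```python
import numpy as np

def representative_subset(labels, fraction, seed=42):
    """Stratified pick that preserves the full dataset's label proportions."""
    rng = np.random.default_rng(seed)
    chosen = []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)          # samples of this class
        n = max(1, round(fraction * idx.size))       # keep the class ratio
        chosen.extend(rng.choice(idx, size=n, replace=False))
    return np.sort(chosen)

# Toy usage: keep ~30% of the samples while preserving class balance.
subset = representative_subset(np.array([0, 0, 0, 1, 1, 0, 1, 0, 1, 1]), 0.3)
```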
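For "Weights and biases initialization", one way to get identical starting weights in Keras is to pass seeded initializers to every layer (a hypothetical Conv3D layer is shown; the paper's 3D-CNN architecture is in Sect. 3.3.1):

```python
import tensorflow as tf

# Seeded initializers make every freshly built model start from
# identical weights; biases here are simply fixed at zero.
conv = tf.keras.layers.Conv3D(
    filters=16,
    kernel_size=3,
    kernel_initializer=tf.keras.initializers.GlorotUniform(seed=42),
    bias_initializer=tf.keras.initializers.Zeros(),
)
```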
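For the Dropout item, Keras layers that use randomness accept a `seed` argument:

```python
import tensorflow as tf

# A seeded Dropout layer produces a reproducible sequence of drop
# patterns instead of a different one on every run.
dropout = tf.keras.layers.Dropout(rate=0.5, seed=42)
```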
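For the library-level seeds, a common pattern is to fix one seed everywhere; note that `PYTHONHASHSEED` generally only takes full effect if set before the Python process starts:

```python
import os
import random
import numpy as np
import tensorflow as tf

SEED = 42  # one fixed value shared by every library

os.environ["PYTHONHASHSEED"] = str(SEED)  # Python hash randomization
random.seed(SEED)                         # Python's built-in RNG
np.random.seed(SEED)                      # NumPy
tf.random.set_seed(SEED)                  # TensorFlow's global seed
```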
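Finally, the fixed test-to-validation mapping can be built once with a seeded RNG and reused across all preprocessing combinations (patient IDs below are illustrative):

```python
import random

patients = [f"patient_{i:02d}" for i in range(56)]  # illustrative IDs

# Build the test-patient -> validation-patients mapping once, with a
# fixed seed, and reuse it for every preprocessing combination.
rng = random.Random(42)
val_map = {p: rng.sample([q for q in patients if q != p], 3)
           for p in patients}

# In each fold: test = [p], validation = val_map[p], train = the rest.
```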