Fig. 1: Illustration of a 3-way split of sequential datasets. | Nature Communications

Fig. 1: Illustration of a 3-way split of sequential datasets.

From: Predicting microbial community structure and temporal dynamics by using graph neural network models

a The datasets are divided chronologically into three parts: the first is the training dataset, the second is the validation dataset, and the last is the test dataset. The training and validation datasets are used to train and optimize each neural network model until the value of the loss function no longer improves. The final model is then evaluated on the separate test dataset, and a numeric error is calculated between the real and predicted values. b Schematic of the overall model architecture. The historical values for each Amplicon Sequence Variant (ASV) (1) are first clustered, using one of several methods (2): a graph neural network is used to infer putative interactions between ASVs and cluster them accordingly. Clustering by known biological functions, by Improved Deep Embedded Clustering (IDEC), and by ranked abundances was also tested. Then, a temporal convolution network is trained separately on each cluster to extract temporal features (3). Finally, a chosen number of predicted consecutive sample points is obtained through the output layer (4).
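
The chronological split in panel a can be illustrated with a minimal sketch. The function name `chronological_split`, the 70/15/15 proportions, and the random toy data are assumptions made for illustration only; the caption does not state the split sizes or how the authors implemented the split. The point shown is that the samples are sliced in time order, without shuffling, so the test set always lies after the training and validation sets.

```python
import numpy as np


def chronological_split(samples, train_frac=0.7, val_frac=0.15):
    """Split time-ordered samples into train, validation, and test parts.

    The fractions are illustrative assumptions, not values from the paper.
    Rows are assumed to be sorted from oldest to newest; no shuffling is done.
    """
    if train_frac <= 0 or val_frac <= 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must be positive and sum to less than 1")
    n = len(samples)
    train_end = int(n * train_frac)            # end of the training part
    val_end = train_end + int(n * val_frac)    # end of the validation part
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]


# Hypothetical toy data: 100 time points (rows) for 5 ASVs (columns).
rng = np.random.default_rng(0)
abundances = rng.random((100, 5))

train, val, test = chronological_split(abundances)
print(train.shape, val.shape, test.shape)
```

Because the three parts are contiguous slices, concatenating them in order reproduces the original series, which is a quick check that no sample was dropped or reordered.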