Abstract
Evaluating cumulus-oocyte complex (COC) morphology is commonly used to assess oocyte quality. However, clear guidelines on interpreting COC morphology data are lacking as this evaluation method is subjective. In the present study, individual in vitro embryo production was used, allowing follow-up of blastocyst formation for each COC. Images of immature COCs were presented to embryologists and two artificial intelligence (AI) models: a deep neural network (DNN) and a random forest classifier (RF). The aims were to (1) determine the most relevant morphological characteristics in distinguishing qualitative COCs, (2) review human-made predictions, and (3) build predictive AI models. Our experiments identified cumulus size as a pivotal characteristic of COC quality, while embryologists assigned ooplasm morphology as most important. Inspection of COCs by the human eye showed significant limitations, as evidenced by its low predictive ability (balanced accuracy: 42.9%) and fair reliability. Our AI models outperformed the embryologists, yielding a balanced accuracy of 79.3% and 71.2% for DNN and RF, respectively. The first AI models that successfully predict developmental competence of immature bovine oocytes were created, outperforming embryologists and offering an objective perspective for COC morphology assessment. AI has emerged as a novel tool for oocyte appreciation, assisting decision-making in the embryology lab.
Introduction
In vitro embryo production (IVP), consisting of in vitro oocyte maturation, fertilization and embryo culture, is widely used as a treatment to overcome subfertility in humans and to increase the number of offspring from genetically valuable domestic animals. The application of IVP in livestock has globally increased in the last decades, as more than one million bovine IVP embryos were transferred to recipient cows in 20221. However, there is considerable room for improvement, as the average blastocyst rate after in vitro maturation, fertilization and culture lies around 30 to 40%2,3,4, while being 58% when oocytes are matured in vivo4. This limited success can mainly be attributed to the impaired quality of in vitro matured oocytes4.
Intrinsic oocyte quality is often appreciated by the morphological appearance of the cumulus-oocyte complex (COC)5,6,7,8,9. Puncturing ovarian follicles mostly results in a heterogeneous group of immature COCs manifesting a large morphological variety. Already in 1979, Leibfried and First10 were the first to demonstrate the association between COC morphology and the oocyte’s capacity to mature in vitro. Since then, COC morphology assessment has been used in most bovine embryology labs as an indicator of oocyte quality, and the relationship between COC morphology and quality has been studied in numerous publications5,6,7,8,9,11,12,13. The size of the oocyte12,13 and the cumulus morphology6,9 are often taken into account when COCs are evaluated. Oocytes that are completely covered by multiple cumulus layers have a higher probability of developing into a blastocyst, compared to oocytes with an exposed corona radiata or zona pellucida6,9. Also, the density of the cumulus cells is considered important as COCs with highly expanded cumulus have increased degeneration rates compared to COCs with more compact cumulus cells5,11. Nevertheless, it was demonstrated that COCs with a slight degree of expansion in the outer cumulus layers progressed faster throughout meiosis14 and had higher chances to reach the morula stage at day 5 of embryo culture, compared to compact COCs11. Bovine oocyte quality is also reflected by the color of the ooplasm7,8, as a dark ooplasm reveals lipid accumulation15 and results in increased fertilization rates and higher developmental potential compared to pale-colored oocytes7,8. Altogether, several morphological parameters of the COC are considered traits of oocyte quality, and their assessment can be performed by a simple evaluation using a stereomicroscope. Still, the results of morphology assessment are highly dependent on the subjective interpretation of embryologists16.
Alternative non-invasive evaluation methods for COC quality evaluation are brilliant cresyl blue staining17, polarized light microscopy16,18, polar body morphology19,20, timing of polar body extrusion21,22, di-electrophoretic migration23 and cumulus biopsy for gene expression analysis24,25,26,27. However, some of these techniques require specialized equipment and concordant know-how. As most of them are time-consuming, these techniques cannot be used to select the most qualitative COCs for in vitro processing immediately at the time of collection, contrary to morphology assessment.
Artificial intelligence (AI) has made its entrance into the embryology lab, as multiple studies have been applying machine learning (ML) models to support decision-making in semen analysis, sperm viability assays, ovarian stimulation protocols and embryo grading, amongst others (comprehensively reviewed by Güell, 202428 and Hanassab et al., 202429). Furthermore, ML could add considerably high value in the future of oocyte appreciation, as it could assist in making accurate predictions of the oocytes’ developmental competence. As such, several models have been proposed to predict fertilization and blastocyst development using static images of matured, denuded human oocytes30,31,32. Animal studies used ML to predict oocytes’ maturation potential in mice33 and developmental competence in cattle34, but no association with embryo development could be demonstrated in the bovine model34. These previous studies showed the important potential of AI in the embryology lab, although no model has managed yet to predict the developmental competence of immature COCs based on brightfield microscopy in humans or livestock35.
Most bovine oocytes are destined to undergo atresia and degenerate, as cows are mono-ovulatory animals36. Therefore, evaluating oocyte quality could be the key to enhancing IVP efficiency. In the present study, we performed IVP experiments in individual culture, questioned embryologists and employed AI models aiming to: (1) study which morphological characteristics of the immature COC are most prominently associated with developmental competence, (2) evaluate the accuracy and reliability of the human eye regarding morphology evaluation, and (3) develop ML models to predict COCs’ developmental competence.
Methods
No ethical approval was required for this experimental design since bovine ovaries were collected post-mortem in a commercial abattoir.
Experimental design
Immature bovine COCs were matured, fertilized and cultured individually in vitro for eight days. Images were taken of each COC before and after maturation, and linked to the developmental competence (blastocyst or not). Images of the immature COCs were presented to embryologists and laymen through a survey, polling for their ability to predict developmental competence. Using the same survey, the reliability of morphology assessment by the human eye was studied, by measuring inter- and intra-rater agreements.
The original dataset, obtained by IVP experiments, was also used to create a segmentation model and to train deep neural network (DNN) and random forest (RF) models, employing blastocyst development at day eight as ground truth. The models were tested by evaluating their predictive ability on a test dataset. In this way, a side-by-side comparison was made between the predictive ability of embryologists and ML models. Finally, the morphological parameters that were most decisive in the decision-making process of the RF model were extracted.
Individual in vitro embryo culture
Media and reagents
Physiological saline, tissue culture medium (TCM)-199 and gentamycin were purchased from Gibco (Life Technologies Europe, Ghent, Belgium). Paraffin oil was purchased from SAGE (CooperSurgical, Malov, Denmark). All other chemicals were obtained from Sigma-Aldrich (Overijse, Belgium) unless otherwise listed. All media were filtered before use with a 0.22 μm syringe filter (HE Healthcare-Whatman, Diegem, Belgium).
Individual in vitro embryo production
Individual IVP was performed as previously described by Raes et al.37. Briefly, bovine ovaries were collected at a local slaughterhouse and processed within 2 h. Ovaries were washed three times in warm physiological saline supplemented with kanamycin (25 mg/mL). Cumulus-oocyte complexes were aspirated from antral follicles (4–8 mm diameter) using an 18 G needle attached to a 10 mL syringe. Oocytes without cumulus cells and/or oocytes with a non-intact zona pellucida were excluded from further processing. All other COCs (n = 1095, 14 replicates) were selected for individual in vitro maturation and placed in a 20 µL droplet of maturation medium (i.e. modified bicarbonate buffered TCM-199 supplemented with 20 ng/mL epidermal growth factor and 50 µg/mL gentamycin). The droplets were prepared in groups of 17 per Petri dish (60 × 26 mm; Thermo Fisher Scientific, Waltham, MA USA) and covered with paraffin oil. In vitro maturation took place for 22 h at 38.5 °C in 5% CO2 in humidified air. Spermatozoa of a bull with known fertility were passed over a Percoll gradient (GE Healthcare Biosciences, Uppsala, Sweden) and added to IVF-Tyrode’s Albumin Lactate Pyruvate (TALP) medium supplemented with bovine serum albumin (BSA; Sigma A8806; 6 mg/mL) and heparin (20 µg/mL) up to a concentration of 1 × 10^6 spermatozoa/mL. Mature oocytes were then washed individually in IVF-TALP medium and co-incubated with spermatozoa in 20 µL droplets of IVF-TALP medium, covered with paraffin oil, for 21 h at 38.5 °C in 5% CO2 in humidified air. After fertilization, cumulus cells were removed by gentle pipetting (140 μm EZ-Tip, CooperSurgical, Malov, Denmark). Presumed zygotes were transferred individually to 20 µL droplets of synthetic oviductal fluid (SOF) medium supplemented with 0.4% BSA (Sigma A9647) and ITS (5 µg/mL insulin + 5 µg/mL transferrin + 5 ng/mL selenium), covered with paraffin oil, and incubated at 38.5 °C in 5% CO2, 5% O2 and 90% N2 up to day eight post-fertilization.
Collection of images and endpoint parameters
Images were taken from every COC at the beginning (immature oocyte, t = 0 h) and at the end (mature oocyte, t = 22 h) of the maturation period using a ToupCam camera connected to ToupView software (ToupTek, version 3.7.13270.20181102) on an inverted Olympus microscope. Each image visualized one COC, with the zona pellucida set as the plane of focus. All images were obtained under the same magnification (56X) and saved as PNG files at a resolution of 2592 × 1944 pixels in red, green and blue (RGB). Each immature COC was categorized based on the morphology of its cumulus cells and ooplasm (Fig. 1). The categories were extracted from the work of Wurth and Kruip38, who considered the density of the cumulus cells and the appearance of the ooplasm for COC categorization, and from Kakkassery et al.39 who categorized based on the number of cumulus cell layers and ooplasm granulation. Cumulus morphology categories were defined as follows: “1”: cumulus consists of more than 5 layers, the cells are compact and dense; “2”: cumulus consists of more than 5 layers, the cells are less compact and start to expand; “3”: cumulus consists of more than 5 layers, cells are expanded, “4”: cumulus consists of less than 5 layers and/or cells are not completely surrounding the oocyte. Ooplasm morphology categories were the following: “A”: ooplasm is homogeneously dark; “B”: ooplasm is dark and slightly granular; “C”: ooplasm is a heterogeneous mix of dark and pale areas. Categorization of COC morphology was performed by a single person, who had experience in bovine oocyte grading and IVP.
At 45 h post-fertilization, the cleavage rate was recorded as the percentage of fertilized oocytes that underwent at least one cleavage division. Both on day seven and day eight post-fertilization, the blastocyst rate was recorded as the percentage of fertilized oocytes that reached the blastocyst stage.
Categorization of cumulus and ooplasm morphology. Images were taken using a ToupCam camera connected to ToupView software (ToupTek, version 3.7.13270.20181102) on an inverted Olympus microscope. Categories were assigned to cumulus-oocyte complexes to distinguish for different parameters between cumulus (a–e) and ooplasm (f–h) morphology. Categories to define cumulus morphology were: “1”: cumulus consists of at least 5 layers, the cells are compact and dense (a); “2”: cumulus consists of at least 5 layers, the cells are less compact and start to expand (b); “3”: cumulus consists of at least 5 layers, cells are expanded (c), “4”: cumulus consists of less than 5 layers (d) and/or cells are not completely surrounding the oocyte (e). Categories to designate ooplasm morphology were: “A”: ooplasm is homogeneously dark (f); “B”: ooplasm is dark and slightly granular (g); “C”: ooplasm is a heterogeneous mix of dark and pale areas (h).
Survey
A link to the survey was sent by email to 20 institutions worldwide that practiced bovine IVP for commercial and/or research purposes and to laymen who had never practiced IVP. In total, 163 persons participated in the survey, of whom 45 completed the entire questionnaire. Only the responses of these 45 participants were considered for further analysis. Thirty-six participants from 11 IVF labs were experienced in working with bovine oocytes (further referred to as “experts”). The group of experts included seven participants with less than two years of experience, ten participants with 2–5 years, seven participants with 6–10 years and twelve participants with more than 10 years of experience. Among them, 27 persons studied bovine oocytes for research purposes, one person for clinical purposes only, six persons combined research and clinical work, and two experts reported being active for neither clinical nor research purposes. In addition, nine participants indicated that they had no experience with bovine oocytes (further referred to as “laymen”). As the survey was anonymous, no personal information (e.g. name of the participant or institution) was reported.
Prediction of COC development
Images of thirty oocytes were shown twice in random order and participants were asked whether or not they would select each oocyte for further in vitro processing, assuming that (1) it would have a 30% chance of success to develop into a blastocyst, (2) they had unlimited access to other oocytes, and (3) any oocyte that fails to develop into a transferable embryo results in a loss of both time and money. The images of COCs depicted in this questionnaire were chosen from the individual IVP experiment (see ‘Individual in vitro embryo production’), so that the ground truth (i.e. stage of embryonic development at day eight post-fertilization) was known.
Ranking of morphological parameters
Participants were asked which morphological parameters were considered the most important for the determination of COC quality. To do so, participants had to choose one answer out of the following list: cumulus cell morphology, ooplasm morphology, both are equally important, or other characteristics. In a second question, participants were asked to score five morphology parameters (i.e. number of layers of cumulus cells, density of the cumulus cells, color of the cumulus cells, color of the ooplasm, and homogeneity of the ooplasm) on a 5-point Likert scale with score 1: not important at all, score 5: very important.
Categorization of COC morphology
Participants were asked to categorize 30 COCs based on the morphology of the cumulus cells. Images of the COCs were shown twice and in random order. The categories from which participants could choose were the same as listed under ‘Collection of images and endpoint parameters’ and are depicted in Fig. 1. A link to exemplary images was provided. This question was repeated for ooplasm morphology parameters.
Artificial intelligence
Image segmentation and quantification
The method used was designed as a combination of image segmentation (for COC parameter quantification) and ML-based prediction (from 14 quantified oocyte parameters). The parameters quantified were:
- Minimum, maximum, average and standard deviation of gray pixel values in the oocyte for each separate RGB channel. These parameters describe the variation of pixel intensities in the ooplasm and are directly correlated with contrast and sharpness.
- Minimum, maximum, average and standard deviation of gray pixel values in the cumulus cells and zona pellucida for each separate RGB channel. These parameters describe the variation of pixel intensities in the cumulus cells and zona pellucida, and are directly correlated with contrast and sharpness.
- Area, minimum radius and maximum radius of the oocyte. These parameters describe the shape and the size of the oocyte (Supplementary Figure S1). The minimum radius is the smallest found radius that connects the center of mass point and the border of the oocyte. The maximum radius is the largest found radius that connects the center of mass point and the border of the oocyte. Area is the surface of the oocyte mask (number of pixels times the size of the pixel).
- Minimum, maximum, average and standard deviation of the distance from the border of the oocyte to the border of the cumulus cells, zona pellucida included. These parameters describe the shape and the size of the cumulus cells relative to the oocyte (Supplementary Figure S1).
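The color and shape parameters listed above can be illustrated with a minimal Python sketch. It is not the segmentation pipeline used in the study: the function names, the nested-list image representation, the 4-neighbour border rule and the `pixel_size` argument are all assumptions made for illustration, and the population standard deviation is an arbitrary choice.

```python
import math
from statistics import mean, pstdev


def channel_stats(values):
    """Return (min, max, mean, standard deviation) of pixel intensities."""
    return min(values), max(values), mean(values), pstdev(values)


def region_color_features(image, mask):
    """Collect 4 statistics per RGB channel (12 numbers) inside a mask.

    image: rows of (r, g, b) tuples; mask: rows of booleans, same shape.
    """
    features = []
    for channel in range(3):
        values = [
            image[y][x][channel]
            for y, row in enumerate(mask)
            for x, inside in enumerate(row)
            if inside
        ]
        features.extend(channel_stats(values))
    return features


def shape_features(mask, pixel_size=1.0):
    """Return (area, minimum radius, maximum radius) of a binary mask.

    Radii run from the center of mass to each border pixel. A border pixel
    is assumed to be a mask pixel with at least one 4-neighbour outside.
    """
    height, width = len(mask), len(mask[0])
    points = [(x, y) for y in range(height) for x in range(width) if mask[y][x]]
    area = len(points) * pixel_size ** 2
    cx = mean(p[0] for p in points)
    cy = mean(p[1] for p in points)

    def is_border(x, y):
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if not (0 <= nx < width and 0 <= ny < height) or not mask[ny][nx]:
                return True
        return False

    radii = [
        math.hypot(x - cx, y - cy) * pixel_size
        for x, y in points
        if is_border(x, y)
    ]
    return area, min(radii), max(radii)
```

Calling `region_color_features` once with an oocyte mask and once with a cumulus mask would give the two groups of color parameters, while `shape_features` covers the area and radius parameters of the oocyte.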
In order to perform segmentation and to maintain all the information from the images, we decomposed the color images into their comprising RGB channels. We opted to use the extracted blue channel image to perform segmentation since it best represented the surface area (due to shallow penetration of the blue part of the light spectrum when compared to the green and red parts of the spectrum).
To allow automatic segmentation of the oocyte and cumulus cells, histogram analysis was performed by applying the method described by Babin and colleagues40. The result of the final segmentation can be seen in Supplementary Figure S2.
Validation and model selection
Deep neural network
A Modified National Institute of Standards and Technology (MNIST) type DNN was designed that takes quantified COC parameters as input and has 32 and 21 perceptrons in the two fully connected layers. The DNN was trained to classify input features as “reaches the blastocyst stage at day eight post-fertilization” (class 1) or “will not reach the blastocyst stage on day eight” (class 0) based on the results of the individual IVP experiment (see ‘Individual in vitro embryo production’).
Data were balanced to a 60/40 ratio of class 0 to class 1 cases for the data set of 687 segmented images. A total of 460 samples were used for training and 227 for in-training validation. The test data set consisted of the data used in the survey (30 cases in total). In addition to the network architecture, we experimented with multiple training batch sizes, numbers of epochs and optimizers, and settled on the Adam optimizer41 with a binary cross-entropy loss function and a batch size of 8, trained for 620 epochs.
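To make the described architecture concrete, the following Python sketch builds a network with two fully connected hidden layers of 32 and 21 units and a single output, and evaluates the binary cross-entropy loss for one prediction. Several details are assumptions rather than facts from the study: the input size of 32 (borrowed from the number of features stated for the RF), the ReLU hidden activations, the sigmoid output, and the random weight initialization. Training (Adam, batch size 8) is deliberately left out.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def init_layer(n_in, n_out, rng):
    """Random uniform weights and zero biases (initialization is assumed)."""
    limit = math.sqrt(6.0 / (n_in + n_out))
    weights = [[rng.uniform(-limit, limit) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_network(n_features=32, hidden=(32, 21), seed=0):
    """Layers: n_features -> 32 -> 21 -> 1 (input size is an assumption)."""
    rng = random.Random(seed)
    sizes = [n_features, *hidden, 1]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def predict_proba(network, features):
    """Probability of class 1 (blastocyst at day eight) for one COC."""
    activations = features
    for weights, biases in network[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = network[-1]
    return sigmoid(dense(activations, weights, biases)[0])


def binary_cross_entropy(y_true, p, eps=1e-12):
    """Loss for one sample with label y_true in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))
```

With untrained weights the output carries no meaning; the sketch only shows how the layer sizes and the loss function fit together.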
Random forest classifier
Besides a DNN, an RF was created to predict the oocytes’ developmental competence. This classifier was created for 32 input features, using the log2 of the number of features as the number of random features considered in the learning process, with 100 decision trees for training. Data were balanced to a 59/41 ratio of class 0 to class 1 cases for the data set of 687 segmented images. The test data set consisted of the data used in the survey (30 cases in total).
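As a small worked example of these settings, the Python sketch below computes how many candidate features a log2 rule yields for 32 inputs and combines the votes of an ensemble by majority. It assumes that "log2" means the base-2 logarithm of the feature count rounded down, and that ties are resolved towards class 0; the helper names are hypothetical and the tree-growing step itself is omitted.

```python
import math
import random


def n_split_features(n_features):
    """Number of randomly chosen candidate features per split (log2 rule)."""
    return max(1, int(math.log2(n_features)))


def sample_split_features(n_features, rng):
    """Draw the candidate feature indices for one split, without replacement."""
    return sorted(rng.sample(range(n_features), n_split_features(n_features)))


def forest_vote(tree_predictions):
    """Majority vote over per-tree class predictions (0 or 1).

    Ties are resolved towards class 0, which is an assumption.
    """
    return 1 if sum(tree_predictions) * 2 > len(tree_predictions) else 0
```

For 32 input features the rule gives 5 candidate features per split. In scikit-learn, comparable settings would be `RandomForestClassifier(n_estimators=100, max_features="log2")`, although the source does not state which library was used.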
Statistical analysis
Statistical analyses regarding IVP and the survey were performed in R (version 4.2.1) and RStudio (2022.07.1 Build 554).
The effect of oocyte morphology on the developmental parameters (cleavage and blastocyst rates) was tested using a generalized linear mixed model fit by maximum likelihood. The replicate was set as a random effect and the categories of cumulus- and ooplasm morphology were set as fixed effects. Tukey’s post hoc test was used to assess the differences between the morphological categories. Results are expressed as least square means (LSM) and standard errors (SE). The significance level was set at p < 0.05.
A confusion matrix was composed to validate the prediction of oocyte development by participants and the ML models. From these confusion matrices, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy and balanced accuracy (i.e. the arithmetic mean of sensitivity and specificity) were calculated. Accuracy was used as the most relevant parameter to discuss the balanced datasets (COCs presented to DNN and RF), while the balanced accuracy was considered most relevant for unbalanced data (COCs presented to experts and laymen in the survey). A receiver operating characteristic (ROC) curve was created to analyze the performance of the participants and the ML models in predicting oocyte development. Likert scale data were analyzed by calculating the median and interquartile ranges. The way participants categorized the COCs’ morphology was examined using inter- and intra-rater reliability scores. Inter-rater reliability was assessed using Fleiss’ kappa coefficient. Intra-rater agreement was evaluated using the unweighted Cohen’s kappa coefficient. Kappa values (k) were interpreted as proposed by Landis and Koch42 (Table 1). The association between years of experience and intra-rater agreement was examined using Pearson’s correlation coefficient.
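The metrics derived from a confusion matrix can be written out as a short Python sketch. The function names are illustrative, and the labels in the usage note are toy values, not data from this study.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def ratio(numerator, denominator):
    """Division that returns NaN instead of failing on an empty denominator."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(tp, fp, tn, fn):
    """Performance metrics computed from the four confusion-matrix cells."""
    sensitivity = ratio(tp, tp + fn)   # share of true blastocysts detected
    specificity = ratio(tn, tn + fp)   # share of non-blastocysts rejected
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        # Arithmetic mean of sensitivity and specificity.
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }
```

For example, with toy labels `y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]` and `y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]`, the accuracy is 0.80 while the balanced accuracy is about 0.76, which illustrates why the two measures diverge on unbalanced data.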
Results
Individual in vitro embryo culture
Cumulus morphology is more important than ooplasm for blastocyst development
A total of 1095 COCs were used for individual IVP experiments. Labels were assigned to categorize cumulus- and ooplasm morphology by the same experienced researcher in the IVP lab. As for cumulus morphology, 597 COCs were assigned to cat. 1, 156 to cat. 2, 113 to cat. 3 and 229 to cat. 4. As for ooplasm morphology, 546 COCs were labeled as cat. A, 424 as cat. B and 125 as cat. C (Table 2).
Cumulus-oocyte complexes from cumulus cat. 1 resulted in a higher blastocyst yield at day eight post-fertilization compared to the other categories (p ≤ 0.0233; Table 2). This was also obvious at day seven post-fertilization, with substantially higher blastocyst rates for COCs belonging to cat. 1 compared to cat. 2 and 4 (p ≤ 0.0238) as shown in Table 2. Cumulus morphology did not influence the cleavage rates (p ≥ 0.2961). Ooplasm morphology did not affect cleavage or blastocyst rates (p ≥ 0.1832).
Survey
Prediction of development by experts and laymen shows limited accuracy
Images of immature COCs were presented to the participants of the survey (experts and laymen) along with the question whether each COC would develop into a blastocyst. Confusion matrices are shown in Fig. 2a and b and performance metrics are reported in Table 3. The average accuracy and balanced accuracy of the experts were lower than those of the laymen. Also, PPV, NPV and sensitivity were lower in the expert group than in the layman group, while specificity was higher for the experts than for the laymen.
Confusion matrices of experts (a), laymen (b), deep neural network (c) and random forest classifier (d) based on the same test samples (n = 30). Development score 1 = the cumulus-oocyte complex developed into a blastocyst, development score 0 = the cumulus-oocyte complex did not develop into a blastocyst at day eight post-fertilization.
Cumulus and ooplasm morphology are considered equally important by experts
Experts considered the importance of morphological parameters in judging the COCs’ developmental competence, as depicted in Fig. 3a and b. The majority of the experts weighed both cumulus cell and ooplasm morphology as equally important. Cumulus cell morphology as such was considered more important than ooplasm morphology. A minority of the experts selected “other”, where the shape and dimension of the oocyte and integrity of the zona pellucida were specified. Comprehensively, homogeneity of ooplasm resulted in the highest median Likert scale score (5, [3.25–5.00]), followed by the density of the cumulus cells (4, [3.25–5.00]), number of cumulus layers (4, [3.00–5.00]) and color of the ooplasm (4, [3.00–5.00]). The color of the cumulus cells was designated as least important (3, [2.00–4.00]) (Fig. 3b).
The importance of morphological parameters for evaluating oocyte quality according to experts (a,b) and artificial intelligence (c). (a) A global distinction between oocyte and cumulus cell morphology was made by experts. (b) Likert scale data of comprehensive morphology parameters are shown in boxplots, indicating median (center line) and interquartile ranges (boxes). Only the responses of the experts (n = 36) were considered. (c) A weight was given to 32 morphological characteristics of cumulus-oocyte complexes according to their importance in the random forest decision process. These morphological characteristics were then summarized into four categories: color of the ooplasm (n = 12), color of the cumulus (n = 12), size of the oocyte (n = 3) and size of the cumulus (n = 5). Boxplots show the median of each category (center line) and the corresponding interquartile ranges (boxes).
Assessing morphology results in a fair reliability and moderate repeatability
Participants were asked to assign a category for cumulus cell- and ooplasm morphology to the COCs presented in the survey. A fair overall inter-rater agreement was reported for both cumulus and ooplasm (k = 0.383 and k = 0.285 respectively).
When the participants were asked to rate the same oocytes for a second time, the mean intra-rater agreement was moderate for both cumulus and ooplasm (k = 0.595 ± 0.157 and 0.570 ± 0.151 respectively). Individual k-values for cumulus assessment ranged from 0.234 to 0.906. For ooplasm assessment, the individual k-values ranged from 0.181 to 0.895. There was no association between the years of experience and the level of intra-rater agreement (Pearson’s correlation r = 0.23 (p = 0.1337) and r = 0.09 (p = 0.57) for cumulus and ooplasm assessment respectively).
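As an illustration of how such agreement values are obtained and labeled, the Python sketch below computes the unweighted Cohen's kappa for two rating rounds and maps a kappa value to the Landis and Koch bands. The function names are hypothetical and the ratings in the usage note are toy values, not survey responses.

```python
from collections import Counter


def cohen_kappa(first_round, second_round):
    """Unweighted Cohen's kappa between two equally long lists of labels."""
    n = len(first_round)
    observed = sum(a == b for a, b in zip(first_round, second_round)) / n
    counts_1, counts_2 = Counter(first_round), Counter(second_round)
    labels = counts_1.keys() | counts_2.keys()
    # Agreement expected by chance, from the marginal label frequencies.
    expected = sum(counts_1[k] * counts_2[k] for k in labels) / n ** 2
    if expected == 1.0:
        return 1.0  # both rounds used a single identical label
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa):
    """Verbal interpretation of kappa following Landis and Koch."""
    if kappa < 0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"
```

With toy ratings `["1", "1", "2", "2"]` and `["1", "1", "2", "1"]`, the observed agreement is 0.75 and the chance agreement 0.50, giving a kappa of 0.5. Applied to the values reported here, `landis_koch(0.383)` returns "fair" and `landis_koch(0.595)` returns "moderate".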
Artificial intelligence
Selection of machine learning models with the best accuracy
The best-performing model was chosen as the one achieving the highest accuracy on the in-training validation set (also known as the in-training test set, which comprised 227 samples), while also achieving the lowest loss on that set (Fig. 4). Specifically, training of the model was stopped at the epoch with the lowest loss function value for the in-training validation data set. The shape of the training and validation loss curves shows that the model could generalize on the data. Despite training on hundreds of images, the accuracy and loss curves are noisy, indicating that the classification of COC morphology is not a trivial matter and could benefit from more data.
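The selection rule described above, keeping the model from the epoch with the lowest in-training validation loss, can be sketched in a few lines of Python. The function name and the loss values in the usage note are hypothetical, and resolving ties in favour of the earliest epoch is an assumption.

```python
def best_epoch(validation_losses):
    """Return the 1-based epoch with the lowest validation loss.

    If several epochs share the minimum, the earliest one is returned.
    """
    index = min(range(len(validation_losses)),
                key=validation_losses.__getitem__)
    return index + 1
```

For a hypothetical loss history `[0.70, 0.62, 0.58, 0.61, 0.66]`, the rule selects epoch 3; with noisy curves such as those described here, the chosen epoch can shift considerably between runs.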
Accuracy and loss over 600 epochs for training (n = 460) and in-training validation (test) data (n = 227). Data were noisy, indicating hard learning cases. The loss curve stabilizes after a few hundred epochs, suggesting that the model has managed to generalize (learn) (b). This cannot be deduced from the accuracy curve (which is the reason for observing the loss) (a).
Deep neural network and random forest classifier predict blastocyst development with high accuracy
Images of immature COCs were presented to both ML models (DNN and RF), using the results of the individual IVP experiment (see ‘Individual in vitro embryo production’) as ground truth. The best-scoring ML models demonstrated an average balanced accuracy of 79.3 and 71.2% (DNN and RF, respectively). The overall (unbalanced) accuracy was 80.0 and 73.0% for DNN and RF, respectively. Positive and negative predictive values, sensitivity and specificity were also higher in the DNN than in the RF model. Confusion matrices are shown in Fig. 2c and d and performance metrics are reported in Table 3.
The performance of the ML models (DNN and RF models corresponding to the best achieved accuracy) and humans (averages of experts and laymen) were analyzed using ROC curves (Supplementary Figure S3). The area under the ROC curve (AUC) was higher for the ML models (DNN: 79.4% [95% CI: 76.5–82.3%] and RF: 71.4% [95% CI: 67.9–74.9%]) compared to the humans (experts: 42.9% [95% CI: 40.7–45.2%], laymen: 44.3% [95% CI: 39.7–48.9%]).
Size of the cumulus-oocyte complex is most decisive according to random forest classification
Morphological COC characteristics were broken down into 32 parameters, which were related to color of the ooplasm (n = 12), color of the cumulus (n = 12), size of the oocyte (n = 3) and size of the cumulus (n = 5). A weight was assigned to these parameters according to their importance in the RF decision process (Supplementary Table S1). The size of the oocyte had the highest median weight (0.0370 [0.0345–0.0425]), followed by the size of the cumulus (0.030 [0.030–0.031]) and the color of the ooplasm (0.0275 [0.0250–0.0355], Fig. 3c). The color of the cumulus had the lowest median weight (0.0270 [0.0230–0.0338], Fig. 3c). The decision tree that provided the best prediction result is depicted in Supplementary Figure S4.
Discussion
In the present study, a unique dataset was created culturing 1095 COCs individually in vitro, allowing follow-up from immature COC to an eventual embryo at day eight post-fertilization. This dataset allowed us to (1) explore the accuracy of predictions for blastocyst development made by humans (both experts and laymen) and AI; and (2) study the influence of different morphological parameters on oocyte quality in terms of blastocyst development. Two ML models were developed, RF and DNN, that predicted blastocyst development with higher accuracy than experts did. In addition, this study polled which characteristics of the COC were vital for the evaluation of developmental competence, showing different appreciations between observations from IVP experiments in individual culture (number of cumulus layers and cumulus density), experts (homogeneity of ooplasm) and our RF model (oocyte size).
The ML models developed in this study predicted whether an immature COC has the capacity to develop into a blastocyst or not. The DNN predicted blastocyst development with a balanced accuracy of 79.3% and an AUC of 0.794, and scored slightly better than the RF model (balanced accuracy of 71.2% and AUC of 0.714). When the same dataset was presented to the models and the experts, the DNN model outperformed the experts by 36.4 percentage points in balanced accuracy, and the RF model outperformed them by 28.3 percentage points (experts’ balanced accuracy: 42.9% and AUC: 0.429). However, the best-performing models were selected, while the performance of the experts was the average result of all experts. Nevertheless, the performance of the experts in the present study (balanced accuracy of 42.9%, from images of immature COCs in an unbalanced dataset) is comparable to the previous findings of Nayot and colleagues, who reported that experts predict blastocyst development with an accuracy of 52.2% (from images of mature oocytes in a balanced dataset)30. Against our expectations, experience in working with bovine oocytes did not improve performance in predicting developmental competence, as laymen performed slightly better than the experts on all metrics included in this study, except for specificity (48.5 and 46.0% for experts and laymen, respectively). It should be noted that the group of laymen (n = 9) was smaller than the group of experts (n = 36), making its average performance more prone to extreme values. But even within the group of experts, performance did not increase with the number of years of experience. The relatively low performance of the experts compared to ML models (both in our study and in others30) may be due to subtle deviations that are not visible to the human eye at a glance through the microscope. Yet these deviations, for example a difference of merely a few micrometers in the size of the oocyte, can be detected by the ML models. That the ML models surpass human raters highlights the potential added value of AI in the embryology lab.
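To make the metric concrete, the following minimal Python sketch computes balanced accuracy (the mean of sensitivity and specificity) and the percentage-point gaps between the balanced accuracies reported above. The confusion-matrix counts and the function name are hypothetical and serve only as an illustration; they are not data from this study.

```python
def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of sensitivity and specificity, returned as a fraction."""
    sensitivity = tp / (tp + fn)  # share of true blastocysts that were predicted
    specificity = tn / (tn + fp)  # share of non-blastocysts that were predicted
    return (sensitivity + specificity) / 2


# Hypothetical counts for an unbalanced set (30 blastocysts, 70 failures).
tp, fn, tn, fp = 20, 10, 50, 20
ba = balanced_accuracy(tp, fn, tn, fp)
plain_acc = (tp + tn) / (tp + fn + tn + fp)

# A rater who always answers "no blastocyst" scores 0.5 on balanced accuracy,
# regardless of how unbalanced the dataset is.
always_negative = balanced_accuracy(tp=0, fn=30, tn=70, fp=0)

# Balanced accuracies (%) as reported in the text above.
reported = {"DNN": 79.3, "RF": 71.2, "experts": 42.9}
gain = {m: round(reported[m] - reported["experts"], 1) for m in ("DNN", "RF")}
print(gain)  # {'DNN': 36.4, 'RF': 28.3}
```

Because balanced accuracy weights both classes equally, 50% corresponds to an uninformative rater, which puts the experts’ 42.9% in context.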
Our ML models outperform similar models created in human embryology studies30,31,32,43. The VIOLET oocyte assessment model30, based on a convolutional neural network, predicts human blastocyst development with an accuracy of 62.8%. A similar model developed by the same group31 resulted in an AUC of 0.64 for predicting blastocyst development, which could be enhanced to 0.67 when prediction-making was combined with automatic segmentation. Likewise, a neural classifier created by González and colleagues43 predicted blastocyst development with an accuracy of 60% and an AUC of 0.62. In addition, the AI model developed by Hall and colleagues32 predicted blastocyst development with an AUC of 0.77. This performance was surpassed by our DNN model, which demonstrated an AUC of 0.79. A study in mice used in vitro maturation time-lapse data as input for a mathematical classification tool (feed-forward artificial neural network) and predicted with 91.03% accuracy whether the oocyte was developmentally competent44. The aforementioned human and mouse models used images of oocytes without cumulus as input and are therefore not suitable for application in bovine IVP, as the cumulus should surround the oocyte until in vitro fertilization is finished in routine bovine IVP practice. Moreover, IVP in humans is generally performed in subfertile couples, with several clinical factors other than oocyte morphology, such as age, affecting results. In cattle, IVP donor animals are mostly young and fertile, and selection is based on genetic potential. Also in cattle, an ML model was created using images of expanding COCs during in vitro maturation, aiming to predict the timing of nuclear maturation34. However, no significant association was found between the predicted nuclear maturation and embryo development34. The ML models created in the present study are the first to predict the developmental competence of immature oocytes surrounded by their cumulus, with an AUC higher than that of similar models in human medicine.
For both human assessment and AI, the NPV was greater than the PPV, meaning that prediction of failure (likely associated with aberrant COC morphology) was better than prediction of success. Even a COC with good morphology may fail to become a blastocyst, indicating the effect of extrinsic factors. Evidently, factors other than oocyte quality, such as semen quality and culture conditions, are involved in the success of IVP. In this study, we wanted to focus on oocyte quality itself, aiming to learn more about the relationship between oocyte morphology and blastocyst formation. To do so, we attempted to limit possible confounding factors by working as systematically as possible. However, this approach is also an important limitation of the study, and future research should focus on culturing oocytes in different conditions. The inclusion of more diverse COC images, obtained from multiple external laboratories, is necessary to confirm our results and to validate our models in the future.
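As an illustration of how the two predictive values are derived from a confusion matrix, the sketch below uses hypothetical counts (1,000 COCs, 30% of which form a blastocyst, with sensitivity and specificity both assumed to be 0.75). These numbers are assumptions chosen for readability, not results of this study. The sketch also shows that, when blastocysts are the minority outcome, NPV can exceed PPV even if sensitivity and specificity are equal.

```python
def predictive_values(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float]:
    """Return (PPV, NPV) from confusion-matrix counts."""
    ppv = tp / (tp + fp)  # predicted blastocysts that really became blastocysts
    npv = tn / (tn + fn)  # predicted failures that really failed
    return ppv, npv


# Hypothetical cohort: 300 blastocysts and 700 failures (30% prevalence),
# classified with sensitivity = specificity = 0.75.
tp, fn = 225, 75    # 300 true blastocysts
tn, fp = 525, 175   # 700 true failures

ppv, npv = predictive_values(tp, fp, tn, fn)
print(f"PPV = {ppv:.4f}, NPV = {npv:.4f}")
```

Under these assumed counts the NPV is clearly higher than the PPV, so part of such a gap can stem from outcome prevalence alone, in addition to the morphological explanation given above.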
Different laboratories, and even different staff members, employ different criteria to select oocytes for further in vitro processing. We studied which characteristics were most vital to oocyte quality in terms of blastocyst development. This was performed on three levels: (1) by evaluating the results of individual IVP; (2) through a survey among experts; (3) by extracting the most important features in the decision-making process from the RF model.
The results from our individual IVP experiments demonstrate the importance of a compact cumulus and a sufficient number of cumulus layers, as the COCs from cat. 1 (“cumulus consists of at least 5 layers, the cells are compact and dense”) had significantly higher blastocyst rates at day eight post-fertilization than the other groups. Cumulus-oocyte complexes with slight expansion (cat. 2), full expansion (cat. 3) or fewer than 5 layers of cumulus cells or incomplete cumulus (cat. 4) had lower blastocyst rates, but still managed to progress to the blastocyst stage. No significant differences in development were noted between these three groups in our study. Yet, Blondin and Sirard reported a difference in embryo development between the aforementioned groups, as oocytes with slightly expanded cumulus had a higher number of embryos reaching the morula stage compared to oocytes with a fully expanded cumulus or with one or no cumulus layer in individual culture11. According to the experts, the density of the cumulus and the number of cumulus layers were considered the second most important feature. This is supported by the general perception that a healthy oocyte is an oocyte surrounded by multiple layers of compact cumulus cells45. Similarly, the importance of cumulus density and the number of cumulus layers was extracted from our RF model, as the size of the cumulus was ranked as the second most important feature contributing to oocyte quality. While the number of cumulus layers is associated with developmental competence, this parameter may be affected by the oocyte collection method. In clinical practice, oocytes are collected by transvaginal ultrasound-guided follicle aspiration. This technique requires long aspiration lines and high vacuum pressure, causing a higher loss of cumulus cells than post mortem aspiration with a needle and syringe, as performed in this study. Further research is necessary to examine this aspect in clinical practice.
To explore the weight of the various morphological characteristics, we asked experts about their perception of a high-quality oocyte. The majority of experts reported that cumulus and ooplasm are equally important and ranked homogeneity of the ooplasm as the most important feature. However, the significance of ooplasm homogeneity and granularity is often debated in the literature. For example, oocytes with heterogeneous ooplasm had higher cleavage rates compared to oocytes with homogeneous ooplasm, which was attributed to a lower incidence of polyspermy in the heterogeneous ooplasm group46. The same study reported no differences in blastocyst rates between oocytes with a homogeneous and heterogeneous ooplasm46. Conversely, Bilodeau-Goeseels and colleagues demonstrated that oocytes with homogeneous and granulated ooplasm had no significant difference in cleavage rate, although blastocyst formation was reduced in the granulated ooplasm group45. A more recent study showed that granularity of the ooplasm did not affect embryo development, fetal development, or calving rate47. This is consistent with our results from the IVP experiments, as ooplasm morphology had no significant effect on cleavage or blastocyst rates. Similarly, our RF model ranked the color of the ooplasm (i.e. the general feature under which granularity falls) third out of four features contributing to oocyte developmental competence.
The size of the oocyte was identified by our RF model as having the highest predictive power regarding blastocyst development. Bovine oocytes with a diameter of 110–120 μm have the highest potential to reach nuclear maturation12, while the highest developmental competence was obtained in oocytes with a diameter of ≥ 120 μm48,49,50,51. As the importance of oocyte size was unequivocally confirmed in earlier studies12,48,49,50,51, this characteristic was not considered in the IVP experiments, nor was it included in the survey.
According to the results of both our IVP experiments and the analysis of the RF model, the cumulus is more important than the ooplasm. In contrast, the experts in the survey pointed to ooplasm homogeneity as the most pivotal characteristic for determining oocyte quality. In the literature, too, opinions vary between studies. This is probably because different studies apply different criteria to categorize COCs and because most studies consider the cumulus and ooplasm together, making it difficult to compare results. The debate may also be driven by the subjective interpretation of the various morphological features, as evidenced in the present study. The reliability of morphology assessment by humans was examined by checking how the participants of the survey interpreted different morphological features of the cumulus and ooplasm. Inter-rater agreement for both cumulus and ooplasm assessment was fair overall, which exposes the limited reliability of morphology assessment. Intra-rater agreement was moderate for both cumulus and ooplasm assessment: the raters were only moderately consistent in their own morphology evaluation, demonstrating rather inadequate repeatability. No association was found between the level of intra-rater agreement (k) and the years of experience, emphasizing the complexity of cumulus and ooplasm assessment. In addition, morphology assessment is extremely vulnerable to observer bias, as evidenced by the wide range of individual kappa scores.
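For readers less familiar with agreement statistics, the following minimal Python sketch computes a kappa coefficient for two raters and maps it to the verbal categories of Landis and Koch (“fair”, “moderate”, and so on). It assumes a pairwise Cohen’s kappa, which may differ from the exact kappa variant used in this study, and the ratings shown are hypothetical category scores (cat. 1 to 4) for ten COCs.

```python
from collections import Counter


def cohen_kappa(rater_a: list, rater_b: list) -> float:
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of items.")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability that both raters pick the same category.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:  # both raters used one identical category throughout
        return 1.0
    return (observed - expected) / (1 - expected)


def landis_koch(kappa: float) -> str:
    """Verbal label for a kappa value following Landis and Koch (1977)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical cumulus categories assigned by two raters to the same ten COCs.
rater_a = [1, 1, 2, 2, 3, 3, 4, 4, 1, 2]
rater_b = [1, 2, 2, 3, 3, 4, 4, 1, 1, 1]

kappa = cohen_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.3f} ({landis_koch(kappa)})")
```

The same function can be applied to two scoring rounds by a single rater to obtain an intra-rater kappa.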
Conclusion
Improving the efficiency of IVP and selecting the best embryos prior to transfer starts with a proper assessment of oocyte quality. We followed the embryonic development of > 1,000 immature bovine COCs individually up to eight days post-fertilization. With this dataset we created an ML model that predicts an oocyte’s potential to develop into a blastocyst with a balanced accuracy 36.4 percentage points higher than that of embryologists. Moreover, we demonstrated that cumulus density and the number of cumulus layers contribute more to developmental competence than granulation of the ooplasm. We also scrutinized the subjective nature of morphology assessment by visual inspection, as our results reveal limited validity of COC morphology assessment by humans.
Data availability
The datasets used and analyzed during the current study are available from the corresponding author upon reasonable request.
Abbreviations
- AI: Artificial intelligence
- AUC: Area under the ROC curve
- BSA: Bovine serum albumin
- COC: Cumulus-oocyte complex
- dpi: Days post-insemination
- DNN: Deep neural network
- hpi: Hours post-insemination
- IVP: In vitro embryo production
- k: Kappa coefficient
- LSM: Least square means
- ML: Machine learning
- NPV: Negative predictive value
- PPV: Positive predictive value
- RF: Random forest classifier
- ROC: Receiver operator characteristics
- SE: Standard errors
- SOF: Synthetic oviductal fluid
- TALP: Tyrode’s albumin lactate pyruvate
- TCM: Tissue culture medium
References
Viana, J. H. M. 2022 Statistics of embryo production and transfer in domestic farm animals. Embryo Technol. Newsl. 41(4) (2023).
Wrenzycki, C. Parameters to identify good quality oocytes and embryos in cattle. Reprod. Fertil. Dev. 34(2), 190–202 (2021).
Demetrio, D. G. B. et al. How can we improve embryo production and pregnancy outcomes of Holstein embryos produced in vitro? (12 years of practical results at a California dairy farm). Anim. Reprod. 17(3) (2020).
Rizos, D., Ward, F., Duffy, P., Boland, M. P. & Lonergan, P. Consequences of bovine oocyte maturation, fertilization or early embryo development in vitro versus in vivo: implications for blastocyst yield and blastocyst quality. Mol. Reprod. Dev. 61(2), 234–248 (2002).
de Loos, F., van Vliet, C., van Maurik, P. & Kruip, T. A. M. Morphology of immature bovine oocytes. Gamete Res. 24(2), 197–204. https://doi.org/10.1002/mrd.1120240207 (1989).
Hazeleger, N. L., Hill, D. J., Stubbing, R. B. & Walton, J. S. Relationship of morphology and follicular fluid environment of bovine oocytes to their developmental potential in vitro. Theriogenology 43(2), 509–522 (1995).
Boni, R., Cuomo, A. & Tosti, E. Developmental potential in bovine oocytes is related to cumulus-oocyte complex grade, calcium current activity, and calcium stores. Biol. Reprod. 66(3), 836–842 (2002).
Nagano, M., Katagiri, S. & Takahashi, Y. Relationship between bovine oocyte morphology and in vitro developmental potential. Zygote 14(1), 53–61 (2006).
Santos, P., Chaveiro, A., Simões, N. & Moreira Da Silva, F. Bovine oocyte quality in relation to ultrastructural characteristics of Zona pellucida, polyspermic penetration and developmental competence. Reprod. Domest. Anim. 43(6), 685–689. https://doi.org/10.1111/j.1439-0531.2007.00970.x (2008).
Leibfried, L. & First, N. L. Characterization of bovine follicular oocytes and their ability to mature in vitro. J. Anim. Sci. 48(1), 76–86. https://doi.org/10.2527/jas1979.48176x (1979).
Blondin, P. & Sirard, M. A. Oocyte and follicular morphology as determining characteristics for developmental competence in bovine oocytes. Mol. Reprod. Dev. 41(1), 54–62. https://doi.org/10.1002/mrd.1080410109 (1995).
Fair, T., Hyttel, P. & Greve, T. Bovine oocyte diameter in relation to maturational competence and transcriptional activity. Mol. Reprod. Dev. 42(4), 437–442 (1995).
Maside, C. et al. Oocyte morphometric assessment and gene expression profiling of oocytes and cumulus cells as biomarkers of oocyte competence in sheep. https://doi.org/10.3390/ani11102818 (2021).
De Wit, A. A. C., Wurth, Y. A. & Kruip, T. A. M. Effect of ovarian phase and follicle quality on morphology and developmental capacity of the bovine cumulus-oocyte complex. J. Anim. Sci. 78(5), 1277–1283 (2000).
Leroy, J. L. M. R., Genicot, G., Donnay, I. & Van Soom, A. Evaluation of the lipid content in bovine oocytes and embryos with nile red: a practical approach. Reprod. Domest. Anim. 40(1), 76–78. https://doi.org/10.1111/j.1439-0531.2004.00556.x (2005).
Koester, M. et al. Evaluation of bovine Zona pellucida characteristics in polarized light as a prognostic marker for embryonic developmental potential. Reproduction 141(6), 779–787 (2011).
Alm, H. et al. Bovine blastocyst development rate in vitro is influenced by selection of oocytes by brillant Cresyl blue staining before IVM as indicator for glucose-6-phosphate dehydrogenase activity. Theriogenology 63(8), 2194–2205 (2005).
Tomari, H. et al. Meiotic spindle size is a strong indicator of human oocyte quality. Reprod. Med. Biol. 17(3), 268 (2018).
Hu, J. et al. First Polar body morphology affects potential development of Porcine parthenogenetic embryo in vitro. Zygote 23(4), 615–621 (2015).
Zhou, W., Fu, L., Sha, W., Chu, D. & Li, Y. Relationship of polar bodies morphology to embryo quality and pregnancy outcome. Zygote 24(3), 401–407 (2016).
Nandi, S., Ravindranatha, B. M., Gupta, P. S. P. & Sarma, P. V. Timing of sequential changes in cumulus cells and first Polar body extrusion during in vitro maturation of Buffalo oocytes. Theriogenology 57(3), 1151–1159 (2002).
Dominko, T. & First, N. L. Timing of meiotic progression in bovine oocytes and its effect on early embryo development. 47, 456–467 (1997).
Dessie, S. W. et al. Dielectrophoretic behavior of in vitro-derived bovine metaphase II oocytes and zygotes and its relation to in vitro embryonic developmental competence and mRNA expression pattern. Reproduction 133(5), 931–946 (2007).
Martínez-Moro, Á. et al. RNA-sequencing reveals genes linked with oocyte developmental potential in bovine cumulus cells. Mol. Reprod. Dev. 89(9), 399–412 (2022).
Dieci, C. et al. Differences in cumulus cell gene expression indicate the benefit of a pre-maturation step to improve in-vitro bovine embryo production. Mol. Hum. Reprod. 22(12), 882–897 (2016).
Scarica, C. et al. An integrated investigation of oocyte developmental competence: expression of key genes in human cumulus cells, morphokinetics of early divisions, blastulation, and euploidy. J. Assist. Reprod. Genet. 36, 875–887. https://doi.org/10.1007/s10815-019-01410-3 (2019).
Zhou, T. P. et al. Expression of target genes in cumulus cells derived from human oocytes with and without blastocyst formation. Reproductive Dev. Med. 3(2), 84–88 (2019).
Güell, E. Criteria for implementing artificial intelligence systems in reproductive medicine. Clin. Exp. Reprod. Med. 51(1), 1–12 (2024).
Hanassab, S. et al. The prospect of artificial intelligence to personalize assisted reproductive technology. npj Digit. Med. https://doi.org/10.1038/s41746-024-01006-x (2024).
Nayot, D., Meriano, J., Casper, R. & Alex, K. An oocyte assessment tool using machine learning: Predicting blastocyst development based on a single image of an oocyte. Hum. Reprod. 35, i129–30 (2020).
Fjeldstad, J. et al. An artificial intelligence tool predicts blastocyst development from static images of fresh mature oocytes. Reprod. Biomed. Online. 48(6), 103842 (2024).
Hall, J. M. M. et al. Use of federated learning on distributed data to develop an artificial intelligence for predicting usable blastocyst formation from pre-ICSI oocyte images. Reprod. Biomed. Online, 104403 (2024).
Letort, G. et al. An interpretable and versatile machine learning approach for oocyte phenotyping. J. Cell. Sci. https://doi.org/10.1242/jcs.260281 (2022).
Chia-Tang Ho, T., Kawate, N. & Koyama, K. Predicting nuclear maturation speed of oocytes from Japanese black beef heifers through non-invasive observations during IVM: an attempt using machine learning algorithms. https://doi.org/10.1016/j.theriogenology.2023.07.007 (2023).
McLennan, H. J., Saini, A., Dunning, K. R. & Thompson, J. G. Oocyte and embryo evaluation by AI and multi-spectral auto-fluorescence imaging: livestock embryology needs to catch-up to clinical practice. Theriogenology 150, 255–262 (2020).
Bortoliero Costa, C., Fair, T. & Seneda, M. M. Environment of the ovulatory follicle: modifications and use of biotechnologies to enhance oocyte competence and increase fertility in cattle. Animal. 17 (2023).
Raes, A. et al. Manual versus deep learning measurements to evaluate cumulus expansion of bovine oocytes and its relationship with embryo development in vitro. Comput. Biol. Med. 168, 107785 (2024).
Wurth, Y. & Kruip, T. A. M. Bovine embryo production in vitro after selection of the follicles and oocyte. In Proceedings of the 12th International Congress of Animal Reproduction (ICAR); August 23–27, The Hague, The Netherlands, 387–389 (1992).
Kakkassery, M. P., Vijayakumaran, V. & Sreekumaran, T. Effect of cumulus oocyte complex morphology on in vitro maturation of bovine oocytes. J. Veterinary Anim. Sci. (2010).
Babin, D. et al. Robust segmentation methods with an application to aortic pulse wave velocity calculation. Comput. Med. Imaging Graph. 38(3), 179–189 (2014).
Kingma, D. P. & Ba, J. Adam: A Method for Stochastic Optimization. (2014).
Landis, J. R. & Koch, G. G. The measurement of observer agreement for categorical data. Biometrics 33(1), 159 (1977).
Sánchez González, D., Flores-Saiffe, A., Valencia-Murillo, R., Mendizabal-Ruiz, G. & Chavez-Badiola, A. Machine learning predicting oocyte’s fertilization and blastocyst potential based on morphological features. In Abstracts of the 37th Annual Meeting of the ESHRE, 26 June to 1 July 2021, i246–7 (2021).
Cavalera, F. et al. A neural Network-Based identification of developmentally competent or incompetent mouse fully-grown oocytes. J. Vis. Exp. 2018(133) (2018).
Bilodeau-Goeseels, S. & Panich, P. Effects of oocyte quality on development and transcriptional activity in early bovine embryos. Anim. Reprod. Sci. 71(3–4), 143–155 (2002).
Nagano, M., Takahashi, Y. & Katagiri, S. In vitro fertilization and cortical granule distribution of bovine oocytes having heterogeneous ooplasm with dark clusters.
Rosa, P. M. D. S., Guedes, P. H. E., Garcia, J. M. & Oliveira, C. S. Cytoplasmic granules in bovine oocytes do not affect embryonic or fetal development. Zygote 32(1) (2024).
Otoi, T., Yamamoto, K., Koyama, N., Tachikawa, S. & Suzuki, T. Bovine oocyte diameter in relation to developmental competence. Theriogenology 48(5), 769–774 (1997).
Rahman, M. B. et al. Oocyte quality determines bovine embryo development after fertilisation with hydrogen peroxide-stressed spermatozoa. Reprod. Fertil. Dev. 24(4), 608–618 (2012).
Anguita, B., Vandaele, L., Mateusen, B., Maes, D. & Van Soom, A. Developmental competence of bovine oocytes is not related to apoptosis incidence in oocytes, cumulus cells and blastocysts. Theriogenology 67(3), 537–549 (2007).
Vandaele, L., Mateusen, B., Maes, D. G. D., de Kruif, A. & Van Soom, A. Temporal detection of caspase-3 and -7 in bovine in vitro produced embryos of different developmental capacity. Reproduction 133(4), 709–718 (2007).
Acknowledgements
The authors thank Dr. Bert Damiaans for carefully reading the manuscript.
Funding
AR is supported by the Special Research Fund (BOF) of Ghent University, project number 01D12519. OBP is supported by the Ad Astra Fellowship of the University College Dublin. This research was supported by the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No 860960 and by Bijzonder Onderzoeksfonds GOA (Geconcerteerde onderzoeksacties) 2018000504 (GOA030-18 BOF).
Author information
Contributions
Conceptualization: A.R., D.B., O.B.P. and A.V.S. Collection of data: A.R. Design of AI models and analysis: D.B. Statistical analysis: A.R., D.B. and O.B.P. Interpretation of results and writing of the manuscript: A.R. and D.B. Review and editing: K.S., O.B.P., A.V.S. and G.O. Supervision: K.S., A.V.S. and G.O. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Raes, A., Babin, D., Pascottini, O.B. et al. Artificial intelligence outperforms humans in morphology-based oocyte selection in cattle. Sci Rep 15, 21829 (2025). https://doi.org/10.1038/s41598-025-09019-6
DOI: https://doi.org/10.1038/s41598-025-09019-6