9 December, 2025
Artificial intelligence in the IVF laboratory

Artificial intelligence (AI) is making significant inroads into healthcare, and assisted reproduction is no exception. In recent years, numerous software platforms have emerged that analyse images and data from oocytes or embryos, and in this article, we will explain how they can contribute to decision-making in the in vitro fertilisation laboratory.
An AI model serves as a ‘trained expert’ inside the computer: it learns by reviewing many real cases from the past (medical records, images, results, etc.) and extracts patterns to estimate probabilities in a new case. For example, using the patient’s age and a photo of an embryo, it can calculate the probability of achieving a pregnancy. It is not magic, nor does it predict the future: it provides estimates, not certainties.

And those estimates depend on the data used to train the model (which hospitals, which patients, the quality of the records, etc.) as well as the purpose for which it was designed. That is why not all models are the same, and different models may generate different results for the same patient, oocyte or embryo.
These platforms do not replace the assessment of the clinical team. That is why it is important to know how to interpret what AI tells us: a nice number or label is not enough. We need to understand how often that model is correct, where it makes mistakes, whether the probabilities it shows are realistic, and whether it works equally well in different centres and with different patients.
In this article, we will explain, in simple terms, how to interpret these results and when it is sensible to trust them. The idea is to use AI as a supporting tool, complementing the judgement of embryologists, doctors and patient preferences, in order to make informed and safe decisions.
How are prediction and classification models evaluated?
To determine whether a model is useful and safe, we measure its performance using various parameters (metrics). Although there are multiple platforms available on the market, not all models are equally reliable. A model that scores poorly on these metrics will give unreliable predictions of the outcome of interest (in this case, the probability of pregnancy).
The most important evaluation metrics are:

Accuracy
The percentage of total correct predictions. This is a useful value, but it can be misleading if there is an imbalance in the outcomes. For example, if most embryos do not implant, correctly predicting ‘no’ many times increases the accuracy, without the model actually being good.
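The imbalance problem described above can be seen in a few lines of Python. This is only an illustrative sketch: the 30/70 split between implanting and non-implanting embryos is an assumed figure, not clinical data, and the `accuracy` helper is a hypothetical name.

```python
# Made-up outcomes for 100 transferred embryos: 1 = implanted, 0 = did not.
# The 30/70 split is an assumption chosen only for illustration.
actual = [1] * 30 + [0] * 70

# A useless "model" that always answers 'no implantation'.
always_no = [0] * len(actual)

def accuracy(actual, predicted):
    """Fraction of predictions that match the real outcome."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)

print(f"accuracy = {accuracy(actual, always_no):.2f}")  # accuracy = 0.70
```

The model never identifies a single viable embryo, yet it is 'right' 70% of the time, which is why accuracy should not be read on its own.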

Sensitivity (recall or true positive rate)
Of all the embryos that would actually implant, how many does the model classify as ‘good’? If sensitivity is low, we ‘miss’ potentially viable embryos (false negatives).

Specificity (true negative rate)
Of all the actual negative cases, how many does it correctly discard? In the case of embryos that would not implant, how many does it identify as ‘low probability’? If the specificity is low, embryos that would not implant may ‘slip through’ (false positives).
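As a rough sketch, sensitivity and specificity can both be computed from four counts: true positives, false negatives, true negatives and false positives. The ten embryos below are invented for illustration, and `confusion_counts` is a hypothetical helper name.

```python
# Made-up outcomes for 10 embryos: 1 = implanted, 0 = did not implant.
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
# Hypothetical model labels: 1 = 'good', 0 = 'low probability'.
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

def confusion_counts(actual, predicted):
    """Return (tp, fn, tn, fp) for binary labels coded as 1 and 0."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # viable, labelled good
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # viable, missed
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # non-viable, discarded
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # non-viable, slipped through
    return tp, fn, tn, fp

tp, fn, tn, fp = confusion_counts(actual, predicted)
sensitivity = tp / (tp + fn)  # share of implanting embryos labelled 'good'
specificity = tn / (tn + fp)  # share of non-implanting embryos labelled 'low probability'

print(f"sensitivity = {sensitivity:.2f}")  # sensitivity = 0.75
print(f"specificity = {specificity:.2f}")  # specificity = 0.67
```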

Precision (or positive predictive value)
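Of the cases that the model labels as ‘good’, how many are actually so? This matters most when the positive class is rare: false positives can then outnumber true positives even if specificity looks good, so precision helps to avoid false expectations.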

F1-score
A single number that balances sensitivity and precision (it is their harmonic mean). This is useful when we aim to accurately detect good embryos while also avoiding the labelling of unsuitable embryos as good. It ranges from 0 to 1, and the closer the score is to 1, the better the model.
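Precision and the F1-score can be worked out from three counts. The sketch below uses invented counts (3 true positives, 2 false positives, 1 false negative) and the standard definition of F1 as the harmonic mean of precision and sensitivity. The variable names are illustrative only.

```python
# Invented counts for illustration only.
tp = 3  # embryos labelled 'good' that did implant
fp = 2  # embryos labelled 'good' that did not implant
fn = 1  # embryos that implanted but were labelled 'low probability'

precision = tp / (tp + fp)    # of those labelled 'good', how many really were
sensitivity = tp / (tp + fn)  # of those that implanted, how many were labelled 'good'

# F1 is the harmonic mean of precision and sensitivity.
f1 = 2 * precision * sensitivity / (precision + sensitivity)

print(f"precision   = {precision:.2f}")    # precision   = 0.60
print(f"sensitivity = {sensitivity:.2f}")  # sensitivity = 0.75
print(f"F1-score    = {f1:.2f}")           # F1-score    = 0.67
```

Because the harmonic mean is pulled towards the lower of the two values, a model cannot reach a high F1-score by excelling at only one of them.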

AUC (area under the ROC curve)
Measures the overall ability to distinguish between good and less good embryos across all possible decision thresholds. An AUC of 0.5 is no better than chance, and the closer the AUC is to 1, the better. This value is useful for comparing models.
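One way to build intuition for the AUC is that it equals the probability that a randomly chosen embryo that implanted receives a higher score than a randomly chosen embryo that did not. The sketch below computes it by comparing every such pair. The scores are invented, and `auc_from_scores` is a hypothetical helper rather than any platform's actual method.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Invented model scores (0 to 1) for illustration only.
implanted     = [0.9, 0.7, 0.4]       # embryos that did implant
not_implanted = [0.8, 0.3, 0.2, 0.1]  # embryos that did not

print(f"AUC = {auc_from_scores(implanted, not_implanted):.2f}")  # AUC = 0.83
```

Here 10 of the 12 possible pairs are ranked correctly, giving an AUC of about 0.83. No threshold has to be chosen, which is what makes the AUC convenient for comparing models.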

In addition to the previously mentioned metrics, there are other equally important parameters, such as explainability (being able to understand why the model suggests something) and external validation, which tests whether the model works in different medical centres and with different patient populations.
Figure: A quality model should be able to correctly classify the majority of embryos (groups of dotted lines).
Why is it important to evaluate a model thoroughly before using it in the laboratory?
Artificial intelligence can be an extraordinary ally, but only if used with prudence and expertise. Evaluating a model thoroughly ensures that its predictions are reliable and that they truly add value to the work of the team of doctors and embryologists. Technology is advancing rapidly, and this is welcome news: it offers us new tools to better understand embryos and make more informed decisions. However, the ultimate goal remains the same as always: to help create life with safety, care, and hope.
At HC Fertility, we are carefully evaluating the various artificial intelligence tools available, with plans to incorporate them into our laboratory operations in the near future. Our goal is to incorporate these tools only when we are confident that they provide real benefits and complement the assessments of our embryologists, without replacing their expertise or the human touch that characterises our approach. Because every decision in assisted reproduction carries immense emotional and human value.
Ariela Mata
Consultant in Clinical Embryology

