Elsevier

Expert Systems with Applications

Volume 39, Issue 16, 15 November 2012, Pages 12564-12573

Predicting seminal quality with artificial intelligence methods

https://doi.org/10.1016/j.eswa.2012.05.028

Abstract

Fertility rates have decreased dramatically over the last two decades, especially in men. Environmental factors and life habits have been reported to affect semen quality. Artificial intelligence techniques are an emerging methodology for decision support systems in medicine.
In this paper we compare three artificial intelligence techniques (decision trees, Multilayer Perceptron and Support Vector Machines) to evaluate their performance in predicting seminal quality from data on environmental factors and lifestyle.
To do so, we collect data through a normalized questionnaire from young healthy volunteers and then use the results of a semen analysis to assess the prediction accuracy of the three classification methods mentioned above.
The results show that Multilayer Perceptron and Support Vector Machines achieve the highest accuracy, with prediction accuracy values of 86% for some of the seminal parameters. In contrast, decision trees provide a visual and illustrative approach that can compensate for their slightly lower accuracy.
In conclusion, artificial intelligence methods are a useful tool for predicting the seminal profile of an individual from environmental factors and life habits. Of the methods studied, Multilayer Perceptron and Support Vector Machines are the most accurate. Therefore these tools, together with the visual aid that decision trees offer, are the suggested methods to include in the evaluation of the infertile patient.

Highlights

  • Male fertility has decreased in part due to environmental factors and life habits.
  • Laboratory analysis is the usual, although expensive, procedure to assess semen quality.
  • We compare three AI methods as an alternative to predict male fertility.
  • We obtain a prediction accuracy of 86% from environmental factors and lifestyle.
  • The efficiency and clarity of these three methods suggest their clinical use.

Introduction

Since the publication of a meta-analysis led by Elisabeth Carlsen (Carlsen, Giwercman, Keiding, & Skakkebaek, 1992), there has been a debate about a possible decline in semen quality. Numerous studies show a decrease in semen parameters during the past two decades (Auger et al., 1995, Splingart et al., 2011, Swan et al., 1997, Swan et al., 2000), affecting male fertility potential. Among the factors considered to explain this decline are an increase in the incidence of male reproductive diseases (Irvine, 2000, Splingart et al., 2011), the effect of environmental or occupational factors (Giwercman and Giwercman, 2011, Wong et al., 2003), and certain lifestyles (Agarwal et al., 2008, Martini, 2004).
To evaluate the male partner, clinicians use the data obtained from semen analysis (Kolettis, 2003) and compare the results with the corresponding reference values established by the World Health Organization (WHO, 1999). Semen analysis is a good predictor of male fertility potential (Bonde et al., 1998, Guzick et al., 2001, Slama et al., 2002, Zinaman et al., 2000), and is also necessary to evaluate candidates to become semen donors (Barratt et al., 1998, Carrell et al., 2002, Ecochard et al., 1999, Society British Andrology, 1999). The testicular function of an individual is highly variable (Keel, 2006), so it is recommended to interpret the results taking into account factors (e.g., fever, toxic exposure) that can modify the semen parameters (Rowe & Comhaire, 2000).
In this paper, assuming the influence of environmental factors and life habits in semen quality, we compare the prediction accuracy of three different artificial intelligence (AI) methods, Multilayer Perceptron (MLP), Support Vector Machines (SVM) and decision trees (DT), to determine the best Decision Support Systems (DSS) that can help in the evaluation of male fertility potential.
Current advances in the field of AI have led to the emergence of expert systems and DSSs for economics, linguistics, management science, mathematical modelling, psychology, etc.
There are good classifiers in AI, such as artificial neural networks (ANN), DT (Polat & Güneş, 2009b), SVM (Conforti & Guido, 2010) and hybrid methods that combine ANNs and fuzzy logic into fuzzy neural networks (FNN) (Kahramanli & Allahverdi, 2008), which are widely used to aid medical diagnosis through the construction of decision support systems.
Moreover, the use of AI has also become widely accepted in medical applications. These methods offer advantages such as: (a) ease of optimisation, resulting in cost-effective and flexible non-linear modelling of large data sets; (b) accuracy for predictive inference, with potential to support clinical decision making; (c) easier knowledge dissemination, since these models can provide explanations, for instance through rule extraction or sensitivity analysis (Lisboa & Taktak, 2006).
The amount of information used to predict male fertility potential makes AI methods particularly useful, not only for improving accuracy but also for selecting the best features, since the number of features to deal with is very often large (Gil and Johnsson, 2010a, Gil and Johnsson, 2010b, Gil and Johnsson, 2011, Gil et al., 2009, Gil et al., 2011, Polat and Güneş, 2009a, Subashini et al., 2009). This leads to knowledge discovery in databases, or data mining: the process of extracting patterns from large data sets.
The main objective of this paper is to compare a number of classification methods applied to male fertility data sets. The approach chosen to address the problem is the use of different AI methods, in particular DT, MLP and SVM. A comparative study of these will give us insight into the merits of the different methods when used on this problem.
The remainder of the paper is organized as follows. First, we define the materials and methods of the study (study population and the variables of the questionnaire). We then give a brief description of the AI methods used in this paper: MLP, SVM and DT. Next, in the results section, we describe the design of our proposal and the experiments carried out, together with the available data and a detailed explanation of the different values in our database. We then describe the subsequent testing carried out to analyse the results. Finally, we draw the relevant conclusions.

Section snippets

Study population

The study was performed with young healthy volunteers among the students of the University of Alicante. One hundred volunteers between 18 and 36 years old participated in the study. After being informed, they were asked to provide a semen sample after 3 to 6 days of sexual abstinence, and a semen analysis was performed according to World Health Organization (WHO) guidance. Those with previously known reproductive alterations (e.g., varicocele) were excluded from the statistical analysis.

Variables of the questionnaire

On the day of the

Semen parameters

The semen analysis procedures are standardized by a WHO publication (WHO, 1999). This publication establishes the reference value for the different semen parameters and the nomenclature (Eliasson et al., 1970) that describes the deviations from these reference values.
This nomenclature includes:
  • Normozoospermia: all values are over the lower limit.
  • Asthenozoospermia: percentage of progressively motile spermatozoa below the lower reference limit.
  • Oligozoospermia: total number of spermatozoa below

Performance

Frequently, the complete data set is divided into two subsets: the training set and the test set. The training set is used to determine the system parameters, and the test set is used to evaluate the diagnosis accuracy and the network generalization. Cross-validation has been widely used to assess the generalization of a network. The cross-validation estimate of accuracy is determined by the overall number of correct classifications divided by the total number of examples in the dataset. Acc
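The cross-validation estimate of accuracy described above can be sketched in plain Python. This is only an illustrative sketch, not the authors' implementation: the function names (`k_fold_indices`, `fit_majority`, `cross_val_accuracy`), the choice of 10 folds, and the toy labels are assumptions introduced here, and the majority-class predictor is a hypothetical stand-in for a real classifier such as DT, MLP or SVM.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_majority(X_train, y_train):
    """Hypothetical stand-in classifier: always predicts the most
    frequent training label (ties broken alphabetically)."""
    most_common = max(sorted(set(y_train)), key=y_train.count)
    return lambda x: most_common


def cross_val_accuracy(X, y, fit, k=10, seed=0):
    """Correct classifications over all held-out folds, divided by
    the total number of examples in the data set."""
    folds = k_fold_indices(len(y), k, seed)
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        # Fit on the training folds only, then score the held-out fold.
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct += sum(predict(X[i]) == y[i] for i in test_idx)
    return correct / len(y)


# Toy labels for illustration only; these are NOT the study's data.
y = ["normal"] * 88 + ["altered"] * 12
X = [[i] for i in range(len(y))]
accuracy = cross_val_accuracy(X, y, fit_majority, k=10)
print(f"Cross-validated accuracy: {accuracy:.2f}")
```

Passing the classifier as a `fit` function that returns a `predict` function keeps the evaluation loop independent of the model, so the same loop could score each of the three methods on identical folds.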

Discussion

In this paper, we have evaluated the performance of three AI methods, DT, MLP and SVM, and their application in the prediction of male fertility potential. Based on the data obtained from 100 volunteers between 18 and 36 years old, we show the relationship of life habits and environmental factors with semen parameters.
The results have established that all these techniques achieve a high accuracy regarding the different measurement parameters. Specificity and positive predictive value present lower

Acknowledgment

We want to express our acknowledgement to the Ministry of Science and Innovation (Ministerio de Ciencia e Innovación - MICINN) through the “José Castillejo” program from Government of Spain as well as the Vicerrectorado de Innovación, University of Alicante, Spain (Vigrob-137) and to the Swedish Research Council through the Swedish Linnaeus project Cognition, Communication and Learning (CCL) as funders of the work exhibited in this paper.

1 These authors equally contributed to this work.