Lung Cancer Detection by an Electronic Nose


Part 1: project profile

Project name

Lung Cancer Detection by an Electronic Nose

Project description

The electronic nose is an instrument able to detect and recognize odors, that is, the volatile substances present in the atmosphere or emitted by the analyzed substance. The device reacts to a gaseous substance by producing signals that can be analyzed to classify the input. It is composed of a sensor array and a pattern classification process based on machine learning techniques. Each sensor reacts in a different way to the analyzed substance, providing multidimensional data that can be considered a unique olfactory blueprint of that substance. In this project, we use an electronic nose based on an array of six Metal Oxide Semiconductor (MOS) sensors to recognize the presence of lung cancer in subjects' breath, diagnosing the disease with a non-invasive and low-cost method.

In a first pilot study, we evaluated the feasibility and accuracy of lung cancer diagnosis by classifying the olfactory signal associated with subjects' exhalations. Results were satisfactory and promising: we achieved an average accuracy of 92.6%, a sensitivity of 95.3% and a specificity of 90.5%. We analyzed the breath of 101 individuals: 58 control subjects and 43 subjects suffering from different types of lung cancer (primary and not) at different stages. To find the components that best discriminate between the two classes ‘healthy’ and ‘sick’, and to reduce the dimensionality of the problem, we extracted the most significant features and projected them into a lower dimensional space using Non Parametric Linear Discriminant Analysis. Finally, we used these features as input to several supervised pattern classification algorithms: different k-nearest neighbors (k-NN) approaches (classic, modified and fuzzy k-NN), linear and quadratic discriminant classifiers, and a feed-forward artificial neural network (ANN). All observed results were validated using cross-validation.
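As a rough illustration of how a classic k-NN classifier can be validated with cross-validation, here is a minimal sketch in plain Python. It is not the project's implementation: the leave-one-out scheme, the Euclidean distance, the value k=3, the function names and the two-dimensional feature vectors are all assumptions made for the example, and the data are invented.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = sorted(train, key=lambda item: squared_distance(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_predictions(samples, k=3):
    """Predict each sample's label using all the other samples as training set."""
    predictions = []
    for i, (features, _) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        predictions.append(knn_predict(train, features, k))
    return predictions


# Hypothetical, made-up feature vectors (NOT real electronic nose data).
toy = [
    ((0.10, 0.20), "healthy"), ((0.20, 0.10), "healthy"),
    ((0.15, 0.25), "healthy"), ((0.25, 0.20), "healthy"),
    ((0.90, 0.80), "sick"), ((0.80, 0.90), "sick"),
    ((0.85, 0.75), "sick"), ((0.95, 0.90), "sick"),
]

predicted = leave_one_out_predictions(toy, k=3)
accuracy = sum(p == label for p, (_, label) in zip(predicted, toy)) / len(toy)
print(accuracy)  # 1.0 on this well-separated toy set
```

Leaving each sample out in turn means that no sample is ever classified by a model that has already seen it, which is the point of validating with held-out data.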

These results led us to begin a new study, in order to confirm them and to evaluate their repeatability. We analyzed 104 breath samples from 52 subjects: 22 healthy subjects and 30 subjects with primary lung cancer at different stages. Samples were acquired by inviting subjects to breathe into a Nalophan bag, which was later fed into the electronic nose. To find the statistical model that best discriminates between the two classes ‘healthy’ and ‘lung cancer’, and to reduce the dimensionality of the problem, we implemented a genetic algorithm (GA) that searched for the best combination of feature selection, feature projection and classifier. For feature selection, we considered methods based on exponential, sequential and randomized algorithms. Principal Component Analysis (PCA), Fisher’s Linear Discriminant Analysis (LDA) and Non Parametric Linear Discriminant Analysis (NPLDA) were considered to project features into a lower dimensional space. Classification was performed with several supervised pattern classification algorithms: different k-nearest neighbors (k-NN) approaches (classic, modified and fuzzy k-NN), linear and quadratic discriminant classifiers, and a feed-forward artificial neural network (ANN). The best solution found by the genetic algorithm was the projection of the selected subset of features onto a single component using Fisher’s LDA, followed by classification with the k-NN method. A Student’s t-test between all pairs of considered models showed no significant differences, suggesting that all the computational intelligence methods we applied provide satisfactory results.
The observed results, all validated using cross-validation, were very satisfactory: an average accuracy of 96.2%, an average sensitivity of 93.3% and an average specificity of 100%, with very narrow confidence intervals. These results confirm those of the previous pilot study, in which we achieved an average accuracy of 92.6%, a sensitivity of 95.3% and a specificity of 90.5% (on 58 control subjects and 43 lung cancer subjects). We also investigated the possibility of early diagnosis by building a model that discriminates samples from subjects with stage I primary lung cancer from samples of healthy subjects. In this analysis too, results were excellent: an average accuracy of 92.85%, an average sensitivity of 75.5% and an average specificity of 97.72%.
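To make the three reported measures concrete, the sketch below computes them from confusion-matrix counts, taking ‘lung cancer’ as the positive class. The counts are an assumption, not data from the study: they suppose two breath samples per subject (60 lung cancer samples and 44 healthy samples out of 104) and four misclassified lung cancer samples. This is one reading that happens to reproduce the figures of the second study; since the reported values are averages over cross-validation, the actual per-fold counts may differ. The function name `rates` is hypothetical.

```python
def rates(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: lung cancer samples classified correctly/incorrectly.
    tn/fp: healthy samples classified correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # all correct decisions
        "sensitivity": tp / (tp + fn),      # fraction of sick samples detected
        "specificity": tn / (tn + fp),      # fraction of healthy samples cleared
    }


# Assumed counts (not stated in the source): 60 lung cancer samples with 4
# misses, 44 healthy samples with no false alarms.
second_study = rates(tp=56, fn=4, tn=44, fp=0)
for name, value in second_study.items():
    print(f"{name}: {value:.1%}")
# accuracy: 96.2%
# sensitivity: 93.3%
# specificity: 100.0%
```

Under these assumed counts, every error is a missed lung cancer sample, which is why specificity reaches 100% while sensitivity does not.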

The research demonstrates that an instrument such as the electronic nose, combined with the appropriate artificial intelligence techniques, is a promising alternative to current lung cancer diagnostic techniques: the predictive errors obtained are lower than those achieved by present diagnostic methods, and the cost of the analysis in money, time and resources is lower. Moreover, the instrument is completely non-invasive. The introduction of this technology could have important social and business effects: its low price and small dimensions allow large-scale distribution, giving the opportunity to perform non-invasive, cheap and quick early diagnosis and mass screening.

Dates

Start date: 2007/01/01

End date: --

Website(s)

No website is available at the moment.

People involved

Project head(s)

A. Bonarini - Andrea Bonarini

M. Matteucci - Matteo Matteucci

PhD Students

R. Blatt - Rossella Blatt

Students currently working on the project

Michele Valsecchi - Michele Valsecchi

Claudio Trameri - Claudio Trameri

Mauro Verdirosa - Mauro Verdirosa

Students who worked on the project in the past

External personnel:

Dott. Ugo Pastorino (Istituto dei Tumori - Milano)

Dott. Elisa Calabrò (Istituto dei Tumori - Milano)

Laboratory work and risk analysis

Laboratory work for this project will be mainly performed at the Istituto Nazionale dei Tumori di Milano, where the breath of both sick and healthy subjects will be acquired. This kind of work involves no potential risks.

Part 2: project description

Link to project documents and files

Results obtained from this work have been presented at different conferences:

  • Prestigious Applications of Intelligent Systems (PAIS 2008), Patras, Greece
The 5th Prestigious Applications of Intelligent Systems (PAIS 2008) is a sub-conference of the 18th European Conference on Artificial Intelligence (ECAI 2008), which will be held at the University of Patras, Greece, from July 21st to 25th.
File:PAIS.pdf
  • International Joint Conference on Neural Networks (IJCNN 2007), Orlando, FL, USA
Lung Cancer Identification by an Electronic Nose based on array of MOS Sensors, Blatt Rossella, Bonarini Andrea, Calabrò Elisa, Della Torre Matteo, Matteucci Matteo, Pastorino Ugo. Proceedings of the 2007 International Joint Conference on Neural Networks (IJCNN 2007), Orlando, FL, USA: File:IJCNNfinal.pdf
Short presentation of the Lung Cancer Identification by an Electronic Nose based on an array of MOS Sensors paper: File:LungCancerIdentificationIJCNN2007.pdf
  • International Workshop on Fuzzy Logic and Applications (WILF 2007), Ruta di Camogli, Genova, Italy
Fuzzy k-NN Lung Cancer Identification by an Electronic Nose, Blatt Rossella, Bonarini Andrea, Calabrò Elisa, Della Torre Matteo, Matteucci Matteo, Pastorino Ugo. Proceedings of the 7th International Workshop on Fuzzy Logic and Applications, WILF 2007, Lecture Notes in Computer Science (LNAI), LNAI 4578, pages 261-268, Springer. Camogli (GE), Italy, July 2007.