Difference between revisions of "Lung Cancer Detection by an Electronic Nose"


Latest revision as of 19:26, 26 October 2009

Part 1: project profile

Project name

Lung Cancer Detection by an Electronic Nose

Project description

The electronic nose is an instrument able to detect and recognize odors, that is, the volatile substances present in the atmosphere or emitted by the analyzed substance. The device reacts to a gaseous substance by producing signals that can be analyzed to classify the input. It is composed of a sensor array and a pattern classification process based on machine learning techniques. Each sensor reacts in a different way to the analyzed substance, providing multidimensional data that can be considered a unique olfactory blueprint of that substance. In this project, we use an electronic nose based on an array of six Metal Oxide Semiconductor (MOS) sensors to recognize the presence of lung cancer in subjects' breath, diagnosing the disease with a non-invasive and low-cost method.

In a first pilot study, we evaluated the feasibility and accuracy of lung cancer diagnosis by classifying the olfactory signal associated with the subjects' exhalations. The results were very promising: we achieved an average accuracy of 92.6%, a sensitivity of 95.3% and a specificity of 90.5%. In particular, we analyzed the breath of 101 individuals, of whom 58 were control subjects and 43 suffered from different types of lung cancer (primary and not) at different stages. In order to find the components that best discriminate between the two classes ‘healthy’ and ‘sick’, and to reduce the dimensionality of the problem, we extracted the most significant features and projected them into a lower dimensional space using Non Parametric Linear Discriminant Analysis. Finally, we used these features as input to several supervised pattern classification algorithms, based on different k-nearest neighbors (k-NN) approaches (classic, modified and fuzzy k-NN), on linear and quadratic discriminant classifiers and on a feed-forward artificial neural network (ANN). All the observed results were validated using cross-validation.
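The three figures quoted above (accuracy, sensitivity, specificity) follow from the counts of correctly and incorrectly classified subjects. The snippet below is a minimal sketch of that arithmetic; the function name `diagnostic_metrics` and the example counts are hypothetical and are not the study's actual confusion matrix.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp / fn: lung cancer subjects classified as sick / as healthy.
    tn / fp: control subjects classified as healthy / as sick.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("each class needs at least one sample")
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total      # fraction of all subjects classified correctly
    sensitivity = tp / (tp + fn)      # fraction of sick subjects detected
    specificity = tn / (tn + fp)      # fraction of healthy subjects cleared
    return accuracy, sensitivity, specificity


# Hypothetical counts, for illustration only (not taken from the study).
acc, sens, spec = diagnostic_metrics(tp=18, fn=2, tn=27, fp=3)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

In a screening setting, sensitivity and specificity are usually read together with accuracy, because accuracy alone can look good when one class is much larger than the other.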

These results pushed us to begin a new study, in order to confirm them and to evaluate their repeatability. We analyzed 104 breath samples from 52 subjects: 22 healthy subjects and 30 subjects with primary lung cancer at different stages. Samples were acquired by inviting subjects to breathe into a nalophan bag, whose content was later fed into the electronic nose. In order to find the best statistical model able to discriminate between ‘healthy’ and ‘lung cancer’ subjects, and to reduce the dimensionality of the problem, we implemented a genetic algorithm (GA) that searched for the best combination of feature selection, feature projection and classifier. For feature selection, we considered methods based on exponential, sequential and randomized algorithms. Principal Component Analysis (PCA), Fisher’s Linear Discriminant Analysis (LDA) and Non Parametric Linear Discriminant Analysis (NPLDA) were considered to project features into a lower dimensional space. Classification was performed with several supervised pattern classification algorithms, based on different k-nearest neighbors (k-NN) approaches (classic, modified and fuzzy k-NN), on linear and quadratic discriminant classifiers and on a feed-forward artificial neural network (ANN). The best solution found by the genetic algorithm was the projection of the selected subset of features onto a single component using Fisher’s Linear Discriminant Analysis (LDA), followed by classification with the k-nearest neighbors (k-NN) method. A Student’s t-test between all pairs of considered models revealed no significant differences, suggesting that all the computational intelligence methods we applied provided satisfactory results.
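The comparison between two models with a Student's t-test can be sketched as follows. This is only an illustration: it assumes the paired variant of the test, applied to per-fold accuracies of two models evaluated on the same cross-validation folds, which the text does not specify. The function name `paired_t_statistic` and the accuracy values are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for a paired Student's t-test."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long score lists with at least 2 entries")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    spread = stdev(diffs)  # sample standard deviation of the differences
    if spread == 0:
        raise ValueError("differences have zero variance; t is undefined")
    n = len(diffs)
    t = mean(diffs) / (spread / math.sqrt(n))
    return t, n - 1


# Hypothetical per-fold accuracies of two models (not data from the study).
model_a = [0.90, 0.95, 1.00, 0.90, 0.95]
model_b = [0.95, 0.90, 0.95, 0.95, 0.90]
t, dof = paired_t_statistic(model_a, model_b)

# Two-sided critical value of Student's t at the 5% level for 4 degrees of freedom.
T_CRITICAL_DOF4 = 2.776
print(f"t={t:.3f}, dof={dof}, significant={abs(t) > T_CRITICAL_DOF4}")
```

When the absolute value of t stays below the critical value, the difference between the two models is not statistically significant at that level, which is the situation the text reports for all pairs of models.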
The observed results, all validated using cross-validation, were very satisfactory: an average accuracy of 96.2%, an average sensitivity of 93.3% and an average specificity of 100%, with very small confidence intervals. These results confirmed the previous pilot study, where we achieved an average accuracy of 92.6%, a sensitivity of 95.3% and a specificity of 90.5% (on 58 control subjects and 43 lung cancer subjects). We also investigated the possibility of performing early diagnosis by building a model able to distinguish samples from subjects with primary lung cancer at stage I from samples from healthy subjects. In this analysis too the results were excellent: an average accuracy of 92.85%, an average sensitivity of 75.5% and an average specificity of 97.72%.
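Cross-validated k-NN classification on a single projected component can be sketched as below. This is not the project's code: it assumes leave-one-out as the cross-validation scheme (the text does not say which scheme was used), takes the one-dimensional projected values as already computed, and uses synthetic values. The names `knn_predict` and `leave_one_out_accuracy` are hypothetical.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify a 1-D value by majority vote among its k nearest training samples."""
    neighbours = sorted(train, key=lambda item: abs(item[0] - query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(samples, k=3):
    """Hold out each sample in turn, train on the rest, and average the hits."""
    correct = 0
    for i, (value, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]  # every sample except the held-out one
        if knn_predict(train, value, k) == label:
            correct += 1
    return correct / len(samples)


# Synthetic projected values (hypothetical, not measurements from the study).
samples = [
    (-2.1, "healthy"), (-1.8, "healthy"), (-1.5, "healthy"), (-0.2, "healthy"),
    (0.4, "lung cancer"), (1.2, "lung cancer"), (1.9, "lung cancer"), (2.3, "lung cancer"),
]
print(f"leave-one-out accuracy: {leave_one_out_accuracy(samples, k=3):.3f}")
```

An odd k avoids tied votes in a two-class problem. Each sample is classified by a model that never saw it, so the estimate is not inflated by testing on training data.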

The research demonstrates that an instrument such as the electronic nose, combined with appropriate artificial intelligence techniques, is a promising alternative to current lung cancer diagnostic techniques: the obtained predictive errors are lower than those achieved by present diagnostic methods, and the cost of the analysis in money, time and resources is lower. Moreover, the instrument is completely non-invasive. The introduction of this technology will lead to important social and business effects: its low price and small dimensions allow large-scale distribution, giving the opportunity to perform non-invasive, cheap, quick and massive early diagnosis and screening.

Dates

Start date: 2007/01/01

End date: --

Website(s)

No website is available at the moment.

People involved

Project head(s)

A. Bonarini - Andrea Bonarini

M. Matteucci - Matteo Matteucci

PhD Students

R. Blatt - Rossella Blatt

Students currently working on the project

Michele Valsecchi - Michele Valsecchi

Claudio Trameri - Claudio Trameri

Mauro Verdirosa - Mauro Verdirosa

Students who worked on the project in the past

External personnel:

Dott. Ugo Pastorino (Istituto dei Tumori - Milano)

Dott. Elisa Calabrò (Istituto dei Tumori - Milano)

Laboratory work and risk analysis

Laboratory work for this project will be mainly performed at the Istituto Nazionale dei Tumori di Milano, where the breath of both sick and healthy subjects will be acquired. This kind of work involves no potential risks.

Part 2: project description

Link to project documents and files

Results obtained from this work have been presented at different conferences:

  • Prestigious Applications of Intelligent Systems (PAIS 2008), Patras, Greece
The 5th Prestigious Applications of Intelligent Systems (PAIS 2008) is a sub-conference of the 18th European Conference on Artificial Intelligence (ECAI 2008), held at the University of Patras, Greece, from July 21st to 25th.
File:PAIS.pdf
  • International Joint Conference on Neural Networks (IJCNN 2007), Orlando, FL, USA
Lung Cancer Identification by an Electronic Nose based on array of MOS Sensors, Blatt Rossella, Bonarini Andrea, Calabrò Elisa, Della Torre Matteo, Matteucci Matteo, Pastorino Ugo. Proceedings of the 2007 International Joint Conference on Neural Networks (IJCNN 2007), Orlando, FL, USA: File:IJCNNfinal.pdf
Short presentation of the Lung Cancer Identification by an Electronic Nose based on an array of MOS Sensors paper: File:LungCancerIdentificationIJCNN2007.pdf
  • International Workshop on Fuzzy Logic and Applications (WILF 2007), Ruta di Camogli, Genova, Italy
Fuzzy k-NN Lung Cancer Identification by an Electronic Nose, Blatt Rossella, Bonarini Andrea, Calabrò Elisa, Della Torre Matteo, Matteucci Matteo, Pastorino Ugo. Proceedings of the 7th International Workshop on Fuzzy Logic and Applications, WILF 2007, Lecture Notes in Artificial Intelligence (LNAI) 4578, pages 261-268, Springer. Camogli (GE), Italy, July 2007.