Lung Cancer Detection by an Electronic Nose

From AIRWiki

Latest revision as of 18:26, 26 October 2009

Part 1: project profile

Project name

Lung Cancer Detection by an Electronic Nose

Project description

The electronic nose is an instrument able to detect and recognize odors, that is, the volatile substances present in the atmosphere or emitted by the analyzed substance. The device reacts to a gaseous substance by producing signals that can be analyzed to classify the input. It is composed of a sensor array and a pattern classification process based on machine learning techniques. Each sensor reacts in a different way to the analyzed substance, providing multidimensional data that can be considered a unique olfactory blueprint of that substance. In this project, we have been using an electronic nose based on an array of six Metal Oxide Semiconductor (MOS) sensors to recognize the presence of lung cancer from subjects' breath, diagnosing the disease with a non-invasive and low-cost method.

In a first pilot study, we evaluated the feasibility and accuracy of lung cancer diagnosis by classifying the olfactory signal associated with subjects' exhalations. The results were very promising: we achieved an average accuracy of 92.6%, a sensitivity of 95.3% and a specificity of 90.5%. We analyzed the breath of 101 individuals: 58 control subjects and 43 subjects suffering from different types of lung cancer (primary and not) at different stages. To find the components that best discriminate between the two classes ‘healthy’ and ‘sick’, and to reduce the dimensionality of the problem, we extracted the most significant features and projected them into a lower-dimensional space using Non Parametric Linear Discriminant Analysis (NPLDA). Finally, we used these features as input to several supervised pattern classification algorithms: different k-nearest neighbors (k-NN) approaches (classic, modified and fuzzy k-NN), linear and quadratic discriminant classifiers, and a feed-forward artificial neural network (ANN). All the observed results were validated using cross-validation.
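As a reminder of how the three figures quoted above relate to each other, the following minimal sketch computes accuracy, sensitivity and specificity from the counts of a confusion matrix. The function name and the example counts are hypothetical and chosen only for illustration; they are not the counts of the pilot study, whose figures are averages over cross-validation runs.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity from confusion counts.

    tp: lung cancer samples classified as lung cancer
    fn: lung cancer samples classified as healthy
    tn: healthy samples classified as healthy
    fp: healthy samples classified as lung cancer
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,     # fraction of all samples classified correctly
        "sensitivity": tp / (tp + fn),     # fraction of sick subjects that are detected
        "specificity": tn / (tn + fp),     # fraction of healthy subjects that are cleared
    }


# Hypothetical counts, for illustration only (20 sick and 30 healthy samples).
example = diagnostic_metrics(tp=18, fn=2, tn=27, fp=3)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are reported separately because, in a screening setting, a missed cancer (false negative) and a false alarm (false positive) have very different consequences, and accuracy alone hides the balance between the two.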

These results pushed us to begin a new study, in order to confirm them and to evaluate their repeatability. We analyzed 104 breath samples from 52 subjects: 22 healthy subjects and 30 subjects with primary lung cancer at different stages. Samples were acquired by inviting subjects to breathe into a nalophan bag, whose content was later fed into the electronic nose. To find the best statistical model able to discriminate between the two classes ‘healthy’ and ‘lung cancer’, and to reduce the dimensionality of the problem, we implemented a genetic algorithm (GA) that searched for the best combination of feature selection, feature projection and classifier. For feature selection, we considered methods based on exponential, sequential and randomized algorithms. Principal Component Analysis (PCA), Fisher’s Linear Discriminant Analysis (LDA) and Non Parametric Linear Discriminant Analysis (NPLDA) were considered to project the features into a lower-dimensional space. Classification was performed with several supervised pattern classification algorithms: different k-nearest neighbors (k-NN) approaches (classic, modified and fuzzy k-NN), linear and quadratic discriminant classifiers, and a feed-forward artificial neural network (ANN). The best solution found by the genetic algorithm was the projection of the selected subset of features onto a single component using Fisher’s LDA, followed by classification with the k-NN method. A Student’s t-test between all pairs of considered models showed no significant differences, suggesting that all the computational intelligence methods we applied provided satisfactory results.
The observed results, all validated using cross-validation, were very satisfactory: an average accuracy of 96.2%, an average sensitivity of 93.3% and an average specificity of 100%, with very small confidence intervals. These results confirmed the previous pilot study, where we achieved an average accuracy of 92.6%, a sensitivity of 95.3% and a specificity of 90.5% (on 58 control subjects and 43 lung cancer subjects). We also investigated the possibility of performing early diagnosis, building a model able to distinguish samples belonging to subjects with primary lung cancer at stage I from those of healthy subjects. The results of this analysis were also very good, with an average accuracy of 92.85%, an average sensitivity of 75.5% and an average specificity of 97.72%.
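The best solution described above (projection onto a single component with Fisher's LDA, followed by k-NN classification, validated by cross-validation) can be sketched as follows. This is only an illustration under stated assumptions, not the code used in the study: the data are invented toy samples with two features each, k = 3 is an arbitrary choice, and leave-one-out is assumed as the cross-validation scheme because the text does not say which one was used. All function names are hypothetical.

```python
from collections import Counter


def mean_vector(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def fisher_direction(x0, x1):
    """Fisher LDA direction w = Sw^-1 (m1 - m0) for two classes of 2-D samples."""
    m0, m1 = mean_vector(x0), mean_vector(x1)
    # Within-class scatter matrix Sw (2x2), summed over both classes.
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, m in ((x0, m0), (x1, m1)):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # Closed-form inverse of the 2x2 matrix applied to the mean difference.
    return [
        (s[1][1] * diff[0] - s[0][1] * diff[1]) / det,
        (-s[1][0] * diff[0] + s[0][0] * diff[1]) / det,
    ]


def project(w, row):
    """Project a 2-D sample onto the single LDA component."""
    return w[0] * row[0] + w[1] * row[1]


def knn_predict(train_z, train_y, z, k=3):
    """Classic k-NN majority vote on the 1-D projected values."""
    nearest = sorted(zip(train_z, train_y), key=lambda p: abs(p[0] - z))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def leave_one_out_accuracy(samples, labels, k=3):
    """Leave-one-out cross-validation; LDA is refitted inside every fold."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        x0 = [x for x, y in zip(train_x, train_y) if y == 0]
        x1 = [x for x, y in zip(train_x, train_y) if y == 1]
        w = fisher_direction(x0, x1)
        train_z = [project(w, x) for x in train_x]
        pred = knn_predict(train_z, train_y, project(w, samples[i]), k)
        correct += pred == labels[i]
    return correct / len(samples)


# Invented toy data: label 0 = 'healthy', label 1 = 'lung cancer'.
healthy = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [1.1, 2.1], [0.9, 1.9]]
cancer = [[3.0, 4.0], [3.2, 3.8], [2.8, 4.2], [3.1, 4.1], [2.9, 3.9]]
samples = healthy + cancer
labels = [0] * len(healthy) + [1] * len(cancer)

print("LDA direction:", fisher_direction(healthy, cancer))
print("Leave-one-out accuracy:", leave_one_out_accuracy(samples, labels))
```

Note that the projection is recomputed inside each fold, using only the training samples of that fold. Fitting the projection once on all data before cross-validating would let information from the held-out sample leak into the model and would give optimistic estimates.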

The research demonstrates that an instrument such as the electronic nose, combined with appropriate artificial intelligence techniques, is a promising alternative to current lung cancer diagnostic techniques: the predictive errors obtained are lower than those achieved by present diagnostic methods, and the cost of the analysis in money, time and resources is lower. Moreover, the instrument is completely non-invasive. The introduction of this technology will lead to very important social and business effects: its low price and small size allow large-scale distribution, giving the opportunity to perform non-invasive, cheap, quick and massive early diagnosis and screening.

Dates

Start date: 2007/01/01

End date: --

Website(s)

No website is available at the moment.

People involved

Project head(s)

A. Bonarini - Andrea Bonarini

M. Matteucci - Matteo Matteucci

PhD Students

R. Blatt - Rossella Blatt

Students currently working on the project

Michele Valsecchi

Claudio Trameri

Mauro Verdirosa

Students who worked on the project in the past
External personnel:

Dott. Ugo Pastorino (Istituto dei Tumori - Milano)

Dott. Elisa Calabrò (Istituto dei Tumori - Milano)

Laboratory work and risk analysis

Laboratory work for this project will mainly be performed at the Istituto Nazionale dei Tumori di Milano, where the breath of both sick and healthy subjects will be acquired. This kind of work involves no potential risks.

Part 2: project description

Link to project documents and files

Results obtained from this work have been presented at different conferences:

  • Prestigious Applications of Intelligent Systems (PAIS 2008), Patras, Greece
The 5th Prestigious Applications of Intelligent Systems (PAIS 2008) is a sub-conference of the 18th European Conference on Artificial Intelligence (ECAI 2008), held at the University of Patras, Greece, from July 21st to 25th, 2008.
File:PAIS.pdf
  • International Joint Conference on Neural Networks (IJCNN 2007), Orlando, FL, USA
Lung Cancer Identification by an Electronic Nose based on an array of MOS Sensors, Blatt Rossella, Bonarini Andrea, Calabrò Elisa, Della Torre Matteo, Matteucci Matteo, Pastorino Ugo. Proceedings of the 2007 International Joint Conference on Neural Networks (IJCNN 2007), Orlando, FL, USA: File:IJCNNfinal.pdf
Short presentation of the Lung Cancer Identification by an Electronic Nose based on an array of MOS Sensors paper: File:LungCancerIdentificationIJCNN2007.pdf
  • International Workshop on Fuzzy Logic and Applications (WILF 2007), Ruta di Camogli, Genova, Italy
Fuzzy k-NN Lung Cancer Identification by an Electronic Nose, Blatt Rossella, Bonarini Andrea, Calabrò Elisa, Della Torre Matteo, Matteucci Matteo, Pastorino Ugo. Proceedings of the 7th International Workshop on Fuzzy Logic and Applications, WILF 2007, Lecture Notes in Computer Science (LNAI), LNAI 4578, pages 261-268, Springer. Camogli (GE), Italy, July 2007.