<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "journalpublishing3.dtd">
<article xml:lang="en" article-type="research-article" xmlns:xlink="http://www.w3.org/1999/xlink">
<?release-delay 0|0?>
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">OL</journal-id>
<journal-title-group>
<journal-title>Oncology Letters</journal-title>
</journal-title-group>
<issn pub-type="ppub">1792-1074</issn>
<issn pub-type="epub">1792-1082</issn>
<publisher>
<publisher-name>D.A. Spandidos</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3892/ol.2018.9761</article-id>
<article-id pub-id-type="publisher-id">OL-0-0-9761</article-id>
<article-categories>
<subj-group>
<subject>Articles</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Diagnosis of mesothelioma with deep learning</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author"><name><surname>Hu</surname><given-names>Xue</given-names></name>
<xref rid="af1-ol-0-0-9761" ref-type="aff"/></contrib>
<contrib contrib-type="author"><name><surname>Yu</surname><given-names>Zebo</given-names></name>
<xref rid="af1-ol-0-0-9761" ref-type="aff"/>
<xref rid="c1-ol-0-0-9761" ref-type="corresp"/></contrib>
</contrib-group>
<aff id="af1-ol-0-0-9761">Department of Blood Transfusion, The First Affiliated Hospital of Chongqing Medical University, Chongqing 400016, P.R. China</aff>
<author-notes>
<corresp id="c1-ol-0-0-9761"><italic>Correspondence to</italic>: Professor Zebo Yu, Department of Blood Transfusion, The First Affiliated Hospital of Chongqing Medical University, 1 Youyi Road, Yuzhong, Chongqing 400016, P.R. China, E-mail: <email>zeboyuchongqing@163.com</email></corresp>
</author-notes>
<pub-date pub-type="ppub">
<month>02</month>
<year>2019</year></pub-date>
<pub-date pub-type="epub">
<day>26</day>
<month>11</month>
<year>2018</year></pub-date>
<volume>17</volume>
<issue>2</issue>
<fpage>1483</fpage>
<lpage>1490</lpage>
<history>
<date date-type="received"><day>26</day><month>03</month><year>2018</year></date>
<date date-type="accepted"><day>03</day><month>10</month><year>2018</year></date>
</history>
<permissions>
<copyright-statement>Copyright: &#x00A9; Hu et al.</copyright-statement>
<copyright-year>2019</copyright-year>
<license license-type="open-access">
<license-p>This is an open access article distributed under the terms of the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by-nc-nd/4.0/">Creative Commons Attribution-NonCommercial-NoDerivs License</ext-link>, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.</license-p></license>
</permissions>
<abstract>
<p>Malignant mesothelioma (MM) is a rare but aggressive cancer. The definitive diagnosis of MM is critical for effective treatment and has important medicolegal significance. However, the definitive diagnosis of MM is challenging due to its composite epithelial/mesenchymal pattern. The aim of the current study was to develop a deep learning method to automatically diagnose MM. A retrospective analysis of 324 participants with or without MM was performed. Significant features were selected using a genetic algorithm (GA) or a ReliefF algorithm performed in MATLAB software. Subsequently, the current study constructed and trained several models based on a backpropagation (BP) algorithm, extreme learning machine algorithm and stacked sparse autoencoder (SSAE) to diagnose MM. A confusion matrix, F-measure and a receiver operating characteristic (ROC) curve were used to evaluate the performance of each model. A total of 34 potential variables were analyzed, while the GA and ReliefF algorithms selected 19 and 5 effective features, respectively. The selected features were used as the inputs of the three models. SSAE and GA&#x002B;SSAE demonstrated the highest performance in terms of classification accuracy, specificity, F-measure and the area under the ROC curve. Overall, the GA&#x002B;SSAE model was the preferred model since it required a shorter CPU time and fewer variables. Therefore, the SSAE with GA feature selection was selected as the most accurate model for the diagnosis of MM. The deep learning methods developed based on the GA&#x002B;SSAE model may assist physicians with the diagnosis of MM.</p>
</abstract>
<kwd-group>
<kwd>deep learning</kwd>
<kwd>stacked sparse autoencoder</kwd>
<kwd>malignant mesothelioma</kwd>
<kwd>diagnosis</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec sec-type="intro">
<title>Introduction</title>
<p>Malignant mesothelioma (MM) is a rare but aggressive cancer (<xref rid="b1-ol-0-0-9761" ref-type="bibr">1</xref>). The prognosis of patients with MM is poor since the majority of patients are diagnosed at an advanced stage and MM is resistant to current treatment options, including chemotherapy, surgery, radiotherapy and immunotherapy (<xref rid="b2-ol-0-0-9761" ref-type="bibr">2</xref>). The estimated median overall survival for advanced MM is 1 year following diagnosis (<xref rid="b3-ol-0-0-9761" ref-type="bibr">3</xref>). MM has a strong association with exposure to asbestos, a mineral extensively used worldwide in the 1970-80s. Although the use of asbestos has been prohibited in the 21st century, the incidence rate of MM has continued to increase worldwide due to the long latency period of MM (<xref rid="b2-ol-0-0-9761" ref-type="bibr">2</xref>,<xref rid="b4-ol-0-0-9761" ref-type="bibr">4</xref>).</p>
<p>Diagnosis of MM primarily relies on histopathological examination supported by clinical and radiological evidence (<xref rid="b5-ol-0-0-9761" ref-type="bibr">5</xref>). The definitive diagnosis of MM is a crucial step prior to appropriate treatment and has important medicolegal significance due to diagnosis-related compensation issues (<xref rid="b6-ol-0-0-9761" ref-type="bibr">6</xref>). However, the definitive diagnosis of MM can be complex, particularly during early stages. This is due to significant variation between cases and the presence of traits that mimic other cancers (particularly adenocarcinoma) or benign/reactive processes (<xref rid="b6-ol-0-0-9761" ref-type="bibr">6</xref>). Furthermore, given the low frequency of MM, it is commonly misdiagnosed or not identified due to a lack of experienced pathologists (<xref rid="b1-ol-0-0-9761" ref-type="bibr">1</xref>,<xref rid="b6-ol-0-0-9761" ref-type="bibr">6</xref>).</p>
<p>In the last two decades, the use of diagnosis support systems (DSSs) has gradually increased (<xref rid="b7-ol-0-0-9761" ref-type="bibr">7</xref>&#x2013;<xref rid="b9-ol-0-0-9761" ref-type="bibr">9</xref>). A DSS for MM may enable pathologists to rapidly examine medical data in considerable detail. More importantly, it may reduce the variability that occurs with different pathologists. Previous studies regarding computer-aided diagnosis of MM mainly focused on developing automatic image processing approaches, including methods that can automatically detect and quantitatively assess pleural thickening in thoracic computed tomography images (<xref rid="b10-ol-0-0-9761" ref-type="bibr">10</xref>&#x2013;<xref rid="b12-ol-0-0-9761" ref-type="bibr">12</xref>). However, pleural thickening does not exclusively signify an asbestos-related disease (<xref rid="b3-ol-0-0-9761" ref-type="bibr">3</xref>), therefore previous methods possess limitations for identifying MM. To increase the accuracy of the DSS, it is necessary to effectively combine multi-feature data, including clinical, laboratory and radiological characteristics of MM. Er <italic>et al</italic> (<xref rid="b7-ol-0-0-9761" ref-type="bibr">7</xref>) collected multi-feature data to diagnose MM. The authors achieved 96.30&#x0025; classification accuracy by using a probabilistic neural network (PNN). However, multi-feature data may not contribute equally to the identification of MM. Feature selection methods may be utilized to remove the redundant or irrelevant features from the original feature set (<xref rid="b9-ol-0-0-9761" ref-type="bibr">9</xref>); this may aid the diagnosis model to focus on the most discriminative features in order to achieve a higher accuracy and decrease the learning time.</p>
<p>Disease diagnosis is a major classification issue; therefore, the classifier is critical to the DSS. Previously, a number of machine learning methods have been used as classifiers to assist in the diagnosis of disease, including the support vector machine (<xref rid="b13-ol-0-0-9761" ref-type="bibr">13</xref>,<xref rid="b14-ol-0-0-9761" ref-type="bibr">14</xref>), extreme learning machine (ELM) (<xref rid="b15-ol-0-0-9761" ref-type="bibr">15</xref>) and deep learning (DL; also termed hierarchical learning) (<xref rid="b16-ol-0-0-9761" ref-type="bibr">16</xref>,<xref rid="b17-ol-0-0-9761" ref-type="bibr">17</xref>). DL enables computational models consisting of multiple processing layers to learn representations of data with multiple levels of abstraction (<xref rid="b18-ol-0-0-9761" ref-type="bibr">18</xref>). DL is a family of computational methods that includes the restricted Boltzmann machine (<xref rid="b19-ol-0-0-9761" ref-type="bibr">19</xref>), stacked autoencoder (<xref rid="b20-ol-0-0-9761" ref-type="bibr">20</xref>), deep belief networks (<xref rid="b21-ol-0-0-9761" ref-type="bibr">21</xref>) and convolutional neural networks (<xref rid="b22-ol-0-0-9761" ref-type="bibr">22</xref>). These algorithms have markedly improved speech recognition, object detection and visual object recognition (<xref rid="b16-ol-0-0-9761" ref-type="bibr">16</xref>,<xref rid="b23-ol-0-0-9761" ref-type="bibr">23</xref>).</p>
<p>An autoencoder is a type of neural network comprising three layers: An input layer, a hidden layer and an output layer (<xref rid="b20-ol-0-0-9761" ref-type="bibr">20</xref>). <xref rid="f1-ol-0-0-9761" ref-type="fig">Fig. 1</xref> presents the architecture of a basic autoencoder with &#x2018;encoder&#x2019; and &#x2018;decoder&#x2019; networks. An extension of an autoencoder, the sparse autoencoder (SAE) introduces a sparsity constraint on the hidden layer (<xref rid="b20-ol-0-0-9761" ref-type="bibr">20</xref>). Its algorithm steps are as follows: The given dataset, <italic>X</italic>={<italic>x</italic>(1), <italic>x</italic>(2),&#x2026;, <italic>x</italic>(<italic>i</italic>),&#x2026;, <italic>x</italic>(<italic>N</italic>)}, <italic>x</italic>(<italic>i</italic>) &#x2208;<italic>R</italic><sup>M</sup>, is mapped to the hidden layer with a nonlinear activation function:</p>
<disp-formula>
<alternatives>
<mml:math id="umml1" display="block"><mml:mrow><mml:mi>Z</mml:mi><mml:mspace width=".16em" /><mml:mo>=</mml:mo><mml:mspace width=".16em" /><mml:mi>f</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mspace width=".16em" /><mml:mi>X</mml:mi><mml:mspace width=".16em" /><mml:mo>&#x002B;</mml:mo><mml:mspace width=".16em" /><mml:msub><mml:mi>b</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo stretchy="false">)</mml:mo><mml:mspace width=".16em" /><mml:mo>=</mml:mo><mml:mspace width=".16em" /><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mn>1</mml:mn><mml:mspace width=".16em" /><mml:mo>&#x002B;</mml:mo><mml:mspace width=".16em" /><mml:mo>exp</mml:mo><mml:mo stretchy="false">[</mml:mo><mml:mo>-</mml:mo><mml:mspace width=".16em" /><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mspace width=".16em" /><mml:mi>X</mml:mi><mml:mspace width=".16em" /><mml:mo>&#x002B;</mml:mo><mml:mspace width=".16em" /><mml:msub><mml:mi>b</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">]</mml:mo></mml:mrow></mml:mfrac></mml:mrow></mml:math>
<graphic xlink:href="ol-17-02-1483-g00.tif"/>
</alternatives>
</disp-formula>
<p>Then, the resulting hidden representation <italic>Z</italic> is mapped back to a reconstructed vector <italic>h</italic><sub>w,b</sub>(<italic>x</italic>) in the input space.</p>
<disp-formula>
<alternatives>
<mml:math id="umml2" display="block"><mml:mrow><mml:msub><mml:mi>h</mml:mi><mml:mrow><mml:mi>w</mml:mi><mml:mo>,</mml:mo><mml:mi>b</mml:mi></mml:mrow></mml:msub><mml:mspace width=".16em" /><mml:mo stretchy="false">(</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mspace width=".16em" /><mml:mo>=</mml:mo><mml:mspace width=".16em" /><mml:mi>f</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mspace width=".16em" /><mml:mi>Z</mml:mi><mml:mspace width=".16em" /><mml:mo>&#x002B;</mml:mo><mml:mspace width=".16em" /><mml:msub><mml:mi>b</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo stretchy="false">)</mml:mo><mml:mspace width=".16em" /><mml:mo>=</mml:mo><mml:mspace width=".16em" /><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mn>1</mml:mn><mml:mspace width=".16em" /><mml:mo>&#x002B;</mml:mo><mml:mspace width=".16em" /><mml:mo>exp</mml:mo><mml:mo stretchy="false">[</mml:mo><mml:mo>-</mml:mo><mml:mspace width=".16em" /><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>W</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mspace width=".16em" /><mml:mi>Z</mml:mi><mml:mspace width=".16em" /><mml:mo>&#x002B;</mml:mo><mml:mspace width=".16em" /><mml:msub><mml:mi>b</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">]</mml:mo></mml:mrow></mml:mfrac></mml:mrow></mml:math>
<graphic xlink:href="ol-17-02-1483-g01.tif"/>
</alternatives>
</disp-formula>
<p>In the aforementioned formulas, <italic>X</italic> represents the feature expression of the original data, <italic>R</italic> represents real numbers; <italic>N</italic> represents the number of data samples and <italic>M</italic> represents the length of each data sample. <italic>W</italic><sub>1</sub> represents the weight matrix and <italic>b</italic><sub>1</sub> represents the bias of the encoder. <italic>f</italic> represents the encoder activation function. Subsequently, the parameter set, <italic>W</italic><sub>1</sub>, <italic>b</italic><sub>1</sub>, <italic>W</italic><sub>2</sub>, <italic>b</italic><sub>2</sub>, are optimized by minimizing error between the input and reconstructed data. The cost function of the SAE can be written as:</p>
<disp-formula>
<alternatives>
<mml:math id="umml3" display="block"><mml:mrow><mml:mi>J</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>W</mml:mi><mml:mo>,</mml:mo><mml:mspace width=".16em" /><mml:mi>b</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mspace width=".16em" /><mml:mo>=</mml:mo><mml:mspace width=".16em" /><mml:mfrac><mml:mn>1</mml:mn><mml:mi>N</mml:mi></mml:mfrac><mml:mspace width=".16em" /><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mn>2</mml:mn></mml:mfrac><mml:msup><mml:mrow><mml:mrow><mml:mo>&#x2016;</mml:mo><mml:mrow><mml:msub><mml:mi>h</mml:mi><mml:mrow><mml:mi>w</mml:mi><mml:mo>,</mml:mo><mml:mi>b</mml:mi></mml:mrow></mml:msub><mml:mspace width=".16em" /><mml:mo stretchy="false">(</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>i</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mspace width=".16em" /><mml:mo>-</mml:mo><mml:mspace width=".16em" /><mml:mi>x</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>i</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x2016;</mml:mo></mml:mrow></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mspace width=".16em" /><mml:mo>&#x002B;</mml:mo><mml:mspace width=".16em" /><mml:mfrac><mml:mi>&#x03BB;</mml:mi><mml:mn>2</mml:mn></mml:mfrac></mml:mrow><mml:mspace width=".16em" 
/><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mn>2</mml:mn></mml:munderover><mml:mrow><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mi>S</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow></mml:munderover><mml:mrow><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mi>S</mml:mi><mml:mrow><mml:mi>l</mml:mi><mml:mo>&#x002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:munderover><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msubsup><mml:mi>W</mml:mi><mml:mrow><mml:mi>j</mml:mi><mml:mi>i</mml:mi></mml:mrow><mml:mi>l</mml:mi></mml:msubsup></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:mrow><mml:mspace width=".16em" /><mml:mo>&#x002B;</mml:mo><mml:mspace width=".16em" /><mml:mi>&#x03B2;</mml:mi><mml:munderover><mml:mo>&#x2211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mi>S</mml:mi><mml:mn>2</mml:mn></mml:msub></mml:mrow></mml:munderover><mml:mrow><mml:mi>K</mml:mi><mml:mi>L</mml:mi><mml:mspace width=".16em" /><mml:mo stretchy="false">(</mml:mo><mml:mi>&#x03C1;</mml:mi><mml:mspace width=".16em" /><mml:mrow><mml:mo>&#x2016;</mml:mo><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>&#x03C1;</mml:mi><mml:mo>&#x02C6;</mml:mo></mml:mover><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:math>
<graphic xlink:href="ol-17-02-1483-g02.tif"/>
</alternatives>
</disp-formula>
<p>In this formula, <italic>N</italic> represents the number of data samples, <italic>h</italic><sub>w,b</sub><italic>(x(i))</italic> represents the output feature vector, <italic>x(i)</italic> represents the input feature vector, &#x03BB; represents the weight decay parameter, <italic>S</italic><sub>l</sub> represents the number of neurons in layer <italic>l</italic>, &#x03B2; represents the weight of the sparsity penalty term, <italic>W</italic><sup>l</sup><sub>ji</sub> represents the weight on the connection between neuron <italic>j</italic> in layer <italic>l</italic>&#x002B;1 and neuron <italic>i</italic> in layer <italic>l</italic>, <italic>&#x03C1;</italic> represents the sparsity parameter defining the sparsity level and <italic>&#x03C1;&#x0302;<sub>j</sub></italic> represents the average activation of hidden neuron <italic>j</italic>. <inline-formula><mml:math id="umml5" display="block"><mml:mrow><mml:mi>K</mml:mi><mml:mi>L</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>&#x03C1;</mml:mi><mml:mrow><mml:mo>&#x2016;</mml:mo><mml:mrow><mml:mover accent="true"><mml:mi>&#x03C1;</mml:mi><mml:mo>&#x02C6;</mml:mo></mml:mover><mml:mi mathvariant="normal">i</mml:mi></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> represents the Kullback-Leibler divergence between &#x03C1; and <inline-formula><mml:math id="umml6" display="block"><mml:mrow><mml:mover accent="true"><mml:mi>&#x03C1;</mml:mi><mml:mo>&#x02C6;</mml:mo></mml:mover><mml:mi mathvariant="normal">i</mml:mi></mml:mrow></mml:math></inline-formula>.</p>
<p>By minimizing <italic>J(W,b)</italic> the optimal parameters of <italic>W</italic> and <italic>b</italic> are updated in the process of coding. The parameters <italic>W</italic> and <italic>b</italic> in each iteration can be updated as:</p>
<disp-formula>
<alternatives>
<mml:math id="umml4" display="block"><mml:mrow><mml:msubsup><mml:mi>W</mml:mi><mml:mrow><mml:mi>j</mml:mi><mml:mi>i</mml:mi></mml:mrow><mml:mi>l</mml:mi></mml:msubsup><mml:mspace width=".16em" /><mml:mo>=</mml:mo><mml:mspace width=".16em" /><mml:msubsup><mml:mi>W</mml:mi><mml:mrow><mml:mi>j</mml:mi><mml:mi>i</mml:mi></mml:mrow><mml:mi>l</mml:mi></mml:msubsup><mml:mspace width=".16em" /><mml:mo>-</mml:mo><mml:mspace width=".16em" /><mml:mi>&#x03B5;</mml:mi><mml:mspace width=".16em" /><mml:mfrac><mml:mo>&#x2202;</mml:mo><mml:mrow><mml:mo>&#x2202;</mml:mo><mml:msubsup><mml:mi>W</mml:mi><mml:mrow><mml:mi>j</mml:mi><mml:mi>i</mml:mi></mml:mrow><mml:mi>l</mml:mi></mml:msubsup></mml:mrow></mml:mfrac><mml:mspace width=".16em" /><mml:mi>J</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>W</mml:mi><mml:mo>,</mml:mo><mml:mspace width=".16em" /><mml:mi>b</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mspace width=".16em" /><mml:mspace width=".16em" /><mml:mtext>and</mml:mtext><mml:mspace width=".16em" /><mml:mspace width=".16em" /><mml:msup><mml:mi>b</mml:mi><mml:mi>l</mml:mi></mml:msup><mml:mspace width=".16em" /><mml:mo>=</mml:mo><mml:mspace width=".16em" /><mml:msup><mml:mi>b</mml:mi><mml:mi>l</mml:mi></mml:msup><mml:mspace width=".16em" /><mml:mo>-</mml:mo><mml:mi>&#x03B5;</mml:mi><mml:mfrac><mml:mo>&#x2202;</mml:mo><mml:mrow><mml:mo>&#x2202;</mml:mo><mml:msup><mml:mi>b</mml:mi><mml:mi>l</mml:mi></mml:msup></mml:mrow></mml:mfrac><mml:mspace width=".16em" /><mml:mi>J</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>W</mml:mi><mml:mo>,</mml:mo><mml:mspace width=".16em" /><mml:mi>b</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math>
<graphic xlink:href="ol-17-02-1483-g03.tif"/>
</alternatives>
</disp-formula>
<p>In these formulas, the parameter &#x03B5; represents the learning rate; <italic>l</italic> represents the <italic>l</italic><sup>th</sup> layer of the network; <italic>i</italic> and <italic>j</italic> denote the <italic>i</italic><sup>th</sup> and <italic>j</italic><sup>th</sup> neurons of two neighboring layers, respectively. <italic>J(W,b)</italic> represents the cost function of the SAE and &#x2202; represents the partial derivative operator.</p>
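As an illustration of the preceding formulas, the following minimal NumPy sketch implements the SAE cost function, comprising the reconstruction error, the weight decay term and the KL sparsity penalty, together with the gradient descent updates for <italic>W</italic> and <italic>b</italic>. The toy data, layer sizes and hyperparameter values are hypothetical and chosen only for demonstration; this is not the MATLAB implementation used in the study.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Toy dataset: N samples of length M (hypothetical values, for illustration only).
N, M, H = 100, 8, 4  # samples, input length, hidden neurons
X = rng.random((N, M))

# Parameters W1, b1 (encoder) and W2, b2 (decoder), as in the formulas above.
W1 = rng.normal(scale=0.1, size=(H, M)); b1 = np.zeros((H, 1))
W2 = rng.normal(scale=0.1, size=(M, H)); b2 = np.zeros((M, 1))

# Weight decay, sparsity weight, sparsity target and learning rate (hypothetical).
lam, beta, rho, eps = 1e-3, 3.0, 0.05, 0.1

def cost_and_grads(X):
    Xt = X.T                                   # M x N
    Z = sigmoid(W1 @ Xt + b1)                  # hidden representation Z
    Hx = sigmoid(W2 @ Z + b2)                  # reconstruction h_{w,b}(x)
    rho_hat = Z.mean(axis=1, keepdims=True)    # average activation of each hidden neuron
    recon = 0.5 * np.sum((Hx - Xt) ** 2) / N
    decay = (lam / 2) * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
    kl = np.sum(rho * np.log(rho / rho_hat)
                + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    J = recon + decay + beta * kl
    # Backpropagated gradients of J(W, b).
    d_out = (Hx - Xt) * Hx * (1 - Hx) / N
    d_sparse = beta * (-rho / rho_hat + (1 - rho) / (1 - rho_hat)) / N
    d_hid = (W2.T @ d_out + d_sparse) * Z * (1 - Z)
    return (J, d_hid @ Xt.T + lam * W1, d_hid.sum(axis=1, keepdims=True),
            d_out @ Z.T + lam * W2, d_out.sum(axis=1, keepdims=True))

J0 = cost_and_grads(X)[0]
for _ in range(200):  # the gradient descent updates from the formulas above
    J, gW1, gb1, gW2, gb2 = cost_and_grads(X)
    W1 -= eps * gW1; b1 -= eps * gb1
    W2 -= eps * gW2; b2 -= eps * gb2
J1 = cost_and_grads(X)[0]
```

After 200 updates the cost J decreases from its initial value, confirming that the encoder and decoder parameters are jointly driven toward a sparse reconstruction of the input.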
<p>Stacked sparse autoencoder (SSAE) is a newly developed DL algorithm and a feed-forward neural network consisting of multiple layers of basic autoencoders. SSAE has successfully been employed to identify visual features in computer vision (<xref rid="b24-ol-0-0-9761" ref-type="bibr">24</xref>,<xref rid="b25-ol-0-0-9761" ref-type="bibr">25</xref>), predict protein solvent accessibility and contact number (<xref rid="b26-ol-0-0-9761" ref-type="bibr">26</xref>), predict protein-protein interactions (<xref rid="b27-ol-0-0-9761" ref-type="bibr">27</xref>) and construct an electronic nose system (<xref rid="b28-ol-0-0-9761" ref-type="bibr">28</xref>). Similar to other DL algorithms, SSAE can efficiently use a deep or layered architecture to identify potential representations from original clinical features, thereby enhancing the classification accuracy (<xref rid="b20-ol-0-0-9761" ref-type="bibr">20</xref>).</p>
<p>However, the performance of classification algorithms may differ from one dataset to another (<xref rid="b29-ol-0-0-9761" ref-type="bibr">29</xref>). Different classification algorithms may be used to develop and compare several models, allowing the best solution to be selected based on the dataset (<xref rid="b29-ol-0-0-9761" ref-type="bibr">29</xref>). In the current study, the genetic algorithm (GA) and ReliefF methods were used to select highly discriminative features, and three commonly used machine learning algorithms, backpropagation (BP), ELM and SSAE, were selected as the classification algorithms. The performance of these models was compared using evaluation metrics, including classification accuracy, specificity, F-measure and the area under the receiver operating characteristic curve (AUC). The model with the best performance may be considered as a classifier to be used when developing a DSS to diagnose MM.</p>
</sec>
<sec sec-type="materials|methods">
<title>Materials and methods</title>
<sec>
<title/>
<sec>
<title>Data source</title>
<p>To facilitate comparisons with previous studies, the current study used a dataset obtained from the University of California (Irvine, CA, USA) machine learning database (<xref rid="b7-ol-0-0-9761" ref-type="bibr">7</xref>). The original dataset includes 324 participants. For each participant, 34 physiological variables were recorded, as presented in <xref rid="tI-ol-0-0-9761" ref-type="table">Table I</xref>.</p>
</sec>
<sec>
<title>Data preprocessing and feature selection</title>
<p>The original dataset was classified into two groups by experienced pathologists: 97 patients with MM and 227 healthy participants. For each participant, 34 variables were recorded and no data were missing. A multiplicity of features may lead to over-training of a model. Therefore, a frequently used preprocessing method is feature selection, in which irrelevant, weakly relevant or less discriminative features are removed. Feature selection can increase the accuracy of the resulting model (<xref rid="b30-ol-0-0-9761" ref-type="bibr">30</xref>,<xref rid="b31-ol-0-0-9761" ref-type="bibr">31</xref>). Previously, a variety of algorithms have been developed to perform feature selection (<xref rid="b32-ol-0-0-9761" ref-type="bibr">32</xref>,<xref rid="b33-ol-0-0-9761" ref-type="bibr">33</xref>). The current study used the GA (<xref rid="b34-ol-0-0-9761" ref-type="bibr">34</xref>) and the ReliefF algorithm (<xref rid="b30-ol-0-0-9761" ref-type="bibr">30</xref>) with MATLAB software (R2016b, 9.1.0.441655; MathWorks, Natick, MA, USA). The parameters of the GA were set as follows: Number of generations=100, population size=20, encoding length=34 and crossover rate=2. The parameters of ReliefF were set as follows: Number of iterations=323 and number of close samples=95. The important features selected by GA and ReliefF were used as inputs to train the diagnosis models. For training these diagnosis models, the dataset was randomly divided into training (70&#x0025;) and testing (30&#x0025;) data.</p>
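The weighting principle behind ReliefF can be sketched as follows: each feature is rewarded when it differs between a sample and its nearest neighbors of the opposite class and penalized when it differs between the sample and its nearest neighbors of the same class. The sketch below is a simplified binary-class version with hypothetical synthetic data, not the MATLAB implementation used in the study.

```python
import numpy as np

rng = np.random.default_rng(0)

def relieff_weights(X, y, n_iter=100, k=5):
    """Minimal binary-class ReliefF sketch: higher weight = more discriminative."""
    X = np.asarray(X, dtype=float)
    # Scale each feature to [0, 1] so per-feature differences are comparable.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span
    n, m = Xs.shape
    w = np.zeros(m)
    for _ in range(n_iter):
        i = rng.integers(n)
        d = np.abs(Xs - Xs[i]).sum(axis=1)     # Manhattan distance to sample i
        d[i] = np.inf                           # exclude the sample itself
        same = y == y[i]
        hits = np.argsort(np.where(same, d, np.inf))[:k]    # nearest same-class
        misses = np.argsort(np.where(~same, d, np.inf))[:k]  # nearest other-class
        # Penalize differences on hits, reward differences on misses.
        w -= np.abs(Xs[hits] - Xs[i]).mean(axis=0) / n_iter
        w += np.abs(Xs[misses] - Xs[i]).mean(axis=0) / n_iter
    return w

# Synthetic check: feature 0 separates the classes, feature 1 is pure noise.
y = np.array([0] * 50 + [1] * 50)
X = np.column_stack([y + 0.05 * rng.normal(size=100), rng.normal(size=100)])
w = relieff_weights(X, y)
```

On this synthetic data the discriminative feature receives a large positive weight while the noise feature stays near zero, mirroring how the study retained only features with a weight value above a threshold.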
</sec>
<sec>
<title>Construction of the diagnosis models</title>
<p>Previous studies have demonstrated that the most effective model is not always easily identified. Therefore, testing various diagnosis models is required to select the most effective one. In the current study, a number of diagnosis models, including BP, ELM and SSAE, were developed using MATLAB software. Selected features were used as the inputs for these recognition models.</p>
<p>The algorithm principle for BP can be reviewed in a previous study (<xref rid="b35-ol-0-0-9761" ref-type="bibr">35</xref>). In the current study, the parameters of BP were set as follows: Size of hidden layers=50, transfer function of hidden layers=&#x2018;tansig&#x2019;, transfer function of output layer=&#x2018;purelin&#x2019;, training function=&#x2018;trainlm&#x2019;, epoch =1,000, goal of training accuracy=0.1 and the learning rate=0.1. Other associated parameters were based on the default values of MATLAB.</p>
<p>In addition, an ELM algorithm was selected to build a pattern recognition model. ELM is a learning algorithm, which is developed on the basis of a single-hidden layer feed-forward neural network (<xref rid="b15-ol-0-0-9761" ref-type="bibr">15</xref>). ELM can achieve a faster learning speed and improved generalization capability compared with other pattern recognition models by distributing the input weights and hidden layer biases randomly and determining the output weights through a Moore-Penrose generalized inverse operation of the weight matrices in hidden layers (<xref rid="b36-ol-0-0-9761" ref-type="bibr">36</xref>). Further information regarding ELM can be reviewed in a previous study (<xref rid="b37-ol-0-0-9761" ref-type="bibr">37</xref>). In the current study, the number of hidden nodes was set to 30 and the sigmoidal function was adopted as the transfer function.</p>
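The closed-form ELM training described above can be sketched in a few lines of NumPy: input weights and hidden biases are drawn at random, and only the output weights are computed, via the Moore-Penrose pseudoinverse. The synthetic two-class data, network size and decision threshold below are hypothetical and serve only to illustrate the principle, not the study's MATLAB implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_train(X, y, n_hidden=30):
    # Input weights and hidden biases are random and are never trained.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # sigmoidal hidden layer
    beta = np.linalg.pinv(H) @ y             # output weights via pseudoinverse
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return (H @ beta > 0.5).astype(int)

# Synthetic two-class data (not the mesothelioma dataset).
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
W, b, beta = elm_train(X[:140], y[:140])       # 70% training split
acc = (elm_predict(X[140:], W, b, beta) == y[140:]).mean()
```

Because no iterative weight tuning is required, training reduces to a single matrix pseudoinverse, which is the source of the fast learning speed noted above.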
<p>The SSAE is a feed-forward neural network consisting of multiple layers of basic autoencoders (<xref rid="b28-ol-0-0-9761" ref-type="bibr">28</xref>). To reduce the dimensions and extract high-level abstract features from the input without labels, two autoencoders were stacked to generate an SSAE in the current study. When using an SSAE, the hidden-layer features calculated by the first autoencoder are used as the inputs for the second autoencoder, as presented in <xref rid="f2-ol-0-0-9761" ref-type="fig">Fig. 2</xref>. When the expected error rate is achieved by the first autoencoder, the high-level abstract features are extracted by the second autoencoder (<xref rid="b38-ol-0-0-9761" ref-type="bibr">38</xref>,<xref rid="b39-ol-0-0-9761" ref-type="bibr">39</xref>). In the current study, a supervised model, the Softmax classifier (SMC), was connected to the end of the trained SSAE to perform the classification task. To train the SMC, the high-level abstract features extracted by the SSAE were used as the inputs for the softmax layer. Following training, the SMC was stacked with the SSAE to establish a deep neural network (<xref rid="f2-ol-0-0-9761" ref-type="fig">Fig. 2</xref>). Further information regarding the SMC can be reviewed in previous studies (<xref rid="b38-ol-0-0-9761" ref-type="bibr">38</xref>,<xref rid="b40-ol-0-0-9761" ref-type="bibr">40</xref>).</p>
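The final supervised step can be illustrated with a minimal softmax classifier trained by gradient descent on the cross-entropy loss. The random feature matrix below is a hypothetical stand-in for the high-level features extracted by the SSAE, used only to show the training mechanics.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(a):
    e = np.exp(a - a.max(axis=1, keepdims=True))  # shift for numerical stability
    return e / e.sum(axis=1, keepdims=True)

def train_smc(F, y, n_classes=2, lr=0.5, epochs=500):
    """Softmax classifier (SMC) on extracted features F, trained by gradient descent."""
    n, d = F.shape
    W = np.zeros((d, n_classes)); b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]                      # one-hot targets
    for _ in range(epochs):
        P = softmax(F @ W + b)
        G = (P - Y) / n                           # gradient of the cross-entropy loss
        W -= lr * F.T @ G
        b -= lr * G.sum(axis=0)
    return W, b

# Hypothetical 'high-level abstract features' and labels for 120 participants.
F = rng.normal(size=(120, 10))
y = (F[:, 0] > 0).astype(int)
W, b = train_smc(F, y)
acc = (softmax(F @ W + b).argmax(axis=1) == y).mean()
```

In the stacked network, `F` would be the output of the second autoencoder's hidden layer, so the SMC operates on a compressed representation rather than the raw clinical variables.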
<p>The parameters of the first autoencoder were set as follows: When using the raw data as the input, the size of the hidden representation=30; when using the data generated by dimension-reduction processing as the input, the size of the hidden representation=20; weight regularization for the loss function=0.001, sparsity regularization for the loss function=4 and sparsity proportion=0.05. The parameters of the second autoencoder were set as follows: Size of the hidden representation=10, weight regularization for the loss function=0.001, sparsity regularization for the loss function=4 and sparsity proportion=10. In addition, the current study selected the linear transfer function.</p>
</sec>
<sec>
<title>Performance measure</title>
<p>In order to evaluate and select the most accurate diagnostic model, confusion matrices were adopted to calculate the sensitivity, specificity, precision and accuracy of each diagnostic model. Since contradictions occasionally occurred between the sensitivity and precision, the F-measure was used to weight and average the two values: F=(2 &#x00D7; precision &#x00D7; sensitivity)/(precision&#x002B;sensitivity).</p>
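For instance, given hypothetical confusion-matrix counts (not taken from the study's results), the four measures and the F-measure follow directly from their definitions:

```python
# Hypothetical counts: true positives, false negatives, false positives, true negatives.
tp, fn, fp, tn = 28, 2, 1, 66

sensitivity = tp / (tp + fn)                     # fraction of MM cases detected
specificity = tn / (tn + fp)                     # fraction of healthy cases cleared
precision = tp / (tp + fp)                       # fraction of positive calls that are MM
accuracy = (tp + tn) / (tp + fn + fp + tn)
f_measure = (2 * precision * sensitivity) / (precision + sensitivity)
```

The F-measure is the harmonic mean of precision and sensitivity, so it penalizes a model that trades one sharply against the other.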
<p>In addition, a receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) were used to evaluate the performance of the diagnostic models. ROC analysis evaluated the capacity of the diagnostic models to distinguish between MM and healthy participants. All calculations were based on standard equations published in previous studies (<xref rid="b31-ol-0-0-9761" ref-type="bibr">31</xref>,<xref rid="b41-ol-0-0-9761" ref-type="bibr">41</xref>). Furthermore, the central processing unit (CPU) time was recorded and compared between the SSAE and GA&#x002B;SSAE diagnostic models to indicate the computational complexities of each model.</p>
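The AUC can also be computed without plotting the full ROC curve, as the probability that a randomly chosen patient with MM receives a higher model score than a randomly chosen healthy participant (the Mann-Whitney formulation). The scores below are hypothetical and serve only to illustrate the calculation.

```python
import numpy as np

def auc_score(y_true, scores):
    """AUC via the Mann-Whitney statistic; tied scores count as 0.5."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    equal = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * equal) / (len(pos) * len(neg))

# Hypothetical model scores for six participants (1 = MM, 0 = healthy).
auc = auc_score([1, 1, 1, 0, 0, 0], [0.9, 0.8, 0.4, 0.6, 0.3, 0.1])
```

An AUC of 1.0 indicates perfect separation of patients with MM from healthy participants, while 0.5 indicates no discriminatory power.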
</sec>
</sec>
</sec>
<sec sec-type="results">
<title>Results</title>
<sec>
<title/>
<sec>
<title>Identifying variables that are important for MM diagnosis by feature selection</title>
<p>The original dataset contained 34 variables for each participant. GA and ReliefF were performed to select the most significant features for diagnosing MM. A total of 19 relevant features were selected as the feature subset based on GA, while only five relevant features with a weight value &#x003E;0.10 were selected as the feature subset based on the ReliefF algorithm. The feature subset based on GA included: Age, gender, city, duration of asbestos exposure, diagnosis method, cytology, ache on chest, weakness, smoking habit, white blood cell level, hemoglobin level, blood lactose dehydrogenase level, albumin level, glucose level, pleural lactose dehydrogenase level, pleural protein level, pleural glucose level, cell count and pleural level of acidity.</p>
</sec>
<sec>
<title>Comparing the diagnostic models</title>
<p>The current study compared the performance of the diagnosis models using four evaluation measures, as illustrated in <xref rid="f3-ol-0-0-9761" ref-type="fig">Figs. 3</xref>&#x2013;<xref rid="f6-ol-0-0-9761" ref-type="fig">6</xref>. Overall, SSAE and GA&#x002B;SSAE demonstrated a higher performance compared with the other diagnosis models evaluated.</p>
<p>When all 34 variables of the original data set were used to diagnose MM, the SSAE model demonstrated the highest accuracy (100.00&#x0025;), while the other models demonstrated an accuracy &#x003C;95&#x0025;. When the 19 variables selected by GA were used as the input variables, both the GA&#x002B;BP and GA&#x002B;ELM models achieved an accuracy of 98.00&#x0025;. However, GA&#x002B;SSAE demonstrated the highest accuracy (100.00&#x0025;). When the five variables selected by the ReliefF algorithm were used as the input variables, the accuracy of the ReliefF&#x002B;SSAE model decreased to 98.00&#x0025;, while the other models demonstrated accuracies &#x003C;92&#x0025; (<xref rid="f3-ol-0-0-9761" ref-type="fig">Fig. 3</xref>).</p>
<p>The SSAE model demonstrated a specificity of 100.00&#x0025; when all 34 variables of the original data set were used as the input variables. Similarly, when the 19 variables selected by GA were used as the inputs, the GA&#x002B;SSAE model demonstrated a specificity of 100.00&#x0025;. By contrast, the ReliefF&#x002B;BP model achieved the lowest specificity of 89.10&#x0025; (<xref rid="f4-ol-0-0-9761" ref-type="fig">Fig. 4</xref>).</p>
<p>When all 34 variables of the original data set were used as the inputs, the SSAE model demonstrated the highest F-measure value (100.00&#x0025;), while ELM achieved the lowest F-measure value (90.91&#x0025;). When the 19 variables selected by GA were used as the input variables, both GA&#x002B;BP and GA&#x002B;ELM models demonstrated an F-measure value of 97.10&#x0025;. With the same input variables, the GA&#x002B;SSAE model achieved the highest F-measure value (100.00&#x0025;). However, when the five variables selected by the ReliefF algorithm were used as the input variables, the F-measure value of the ReliefF&#x002B;ELM model decreased markedly to 5.48&#x0025;, while the F-measure value of the ReliefF&#x002B;SSAE model was 97.12&#x0025; (<xref rid="f5-ol-0-0-9761" ref-type="fig">Fig. 5</xref>).</p>
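<p>The accuracy, specificity and F-measure reported above all follow from the four confusion-matrix counts; a minimal sketch with hypothetical counts (not the study&#x0027;s raw results):</p>

```python
def evaluate(tp, tn, fp, fn):
    """Accuracy, specificity and F-measure from confusion-matrix counts
    (tp, MM cases correctly identified; tn, healthy participants
    correctly identified; fp/fn, the corresponding errors)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, specificity, f_measure

# A perfect classifier on 97 MM and 227 healthy participants:
# evaluate(97, 227, 0, 0) returns (1.0, 1.0, 1.0)
```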
<p>An evaluation was performed to compare the discriminatory power of the BP, ELM and SSAE models, with or without feature selection, as illustrated in <xref rid="f6-ol-0-0-9761" ref-type="fig">Figs. 6</xref> and <xref rid="f7-ol-0-0-9761" ref-type="fig">7</xref>. An ROC curve and the AUC were used to indicate the effectiveness of a diagnostic model to discriminate between patients with MM and healthy participants. The current study revealed that the SSAE and GA&#x002B;SSAE models demonstrated the highest diagnostic power compared with the other models, as presented in <xref rid="f7-ol-0-0-9761" ref-type="fig">Fig. 7</xref>.</p>
</sec>
<sec>
<title>Performing feature selection does not always improve the performance of diagnostic models</title>
<p>As demonstrated in <xref rid="f3-ol-0-0-9761" ref-type="fig">Figs. 3</xref>&#x2013;<xref rid="f7-ol-0-0-9761" ref-type="fig">7</xref>, performing GA markedly enhanced the classification performance of BP. Compared with BP alone, GA&#x002B;BP achieved a higher accuracy (98.00 vs. 94.40&#x0025;), specificity (98.40 vs. 95.30&#x0025;), F-measure (97.10 vs. 92.73&#x0025;) and AUC (97.75 vs. 94.72&#x0025;). By contrast, performing the ReliefF algorithm resulted in a lower accuracy (90.80 vs. 94.90&#x0025;), specificity (89.10 vs. 95.30&#x0025;), F-measure (87.69 vs. 92.73&#x0025;) and AUC (80.00 vs. 94.72&#x0025;) compared with BP alone.</p>
<p>As demonstrated in <xref rid="f4-ol-0-0-9761" ref-type="fig">Fig. 4</xref>, performing GA or ReliefF prior to training the ELM classifier markedly improved the specificity. Compared with ELM alone, the GA&#x002B;ELM model demonstrated a higher accuracy (98.00&#x0025;), specificity (98.40&#x0025;), F-measure (97.10&#x0025;) and AUC value (97.75&#x0025;). However, comparison of the accuracy, F-measure and AUC demonstrated that performing the ReliefF algorithm reduced the performance of the ELM model on the same dataset.</p>
<p>The performance of SSAE in diagnosing MM was compared with that of the GA&#x002B;SSAE and ReliefF&#x002B;SSAE models. Compared with SSAE without feature selection, performing the ReliefF algorithm resulted in a worse performance, whereas both SSAE and GA&#x002B;SSAE achieved an accuracy, specificity, F-measure and AUC of 100&#x0025;. The average CPU time of the SSAE diagnostic model following GA feature selection was compared with that of the baseline SSAE diagnostic model without feature selection. Each model was run ten times. The mean CPU time of SSAE was 33.2 sec, which decreased to 28.8 sec following GA feature selection.</p>
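<p>The ten-run CPU-time averaging described above can be outlined as follows; here <monospace>train_fn</monospace> stands in for a call that trains one diagnostic model and is an assumption for illustration, not the authors&#x0027; code:</p>

```python
import time

def mean_cpu_time(train_fn, repeats=10):
    """Average CPU (process) time of a training routine over several
    runs, mirroring the ten-run averaging described in the text."""
    elapsed = []
    for _ in range(repeats):
        start = time.process_time()
        train_fn()  # hypothetical model-training call
        elapsed.append(time.process_time() - start)
    return sum(elapsed) / repeats
```

<p>Process time, rather than wall-clock time, isolates the computational cost of training from unrelated system load.</p>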
</sec>
</sec>
</sec>
<sec sec-type="discussion">
<title>Discussion</title>
<p>The definitive diagnosis of MM is significant at both the individual and public health level, and has important medicolegal significance due to diagnosis-related compensation issues (<xref rid="b6-ol-0-0-9761" ref-type="bibr">6</xref>). However, the definitive diagnosis of MM is complicated due to its composite epithelial/mesenchymal pattern and its low occurrence frequency (<xref rid="b6-ol-0-0-9761" ref-type="bibr">6</xref>). To improve the diagnosis of MM at an early stage, the current study designed and implemented a diagnostic model based on SSAE algorithms.</p>
<p>To the best of our knowledge, no previous studies have developed MM diagnostic models using SSAE algorithms. However, previous studies have successfully applied SSAE in processing diagnosis information (<xref rid="b38-ol-0-0-9761" ref-type="bibr">38</xref>,<xref rid="b40-ol-0-0-9761" ref-type="bibr">40</xref>,<xref rid="b42-ol-0-0-9761" ref-type="bibr">42</xref>). A number of previous studies have compared the performance of different statistical and machine learning techniques in the field of medical diagnosis; the results revealed that the performance of SSAE is higher than that of several similar techniques (<xref rid="b42-ol-0-0-9761" ref-type="bibr">42</xref>,<xref rid="b43-ol-0-0-9761" ref-type="bibr">43</xref>), a finding that is consistent with the current study.</p>
<p>However, the performance and precision of a classification algorithm may vary depending on the dataset (<xref rid="b44-ol-0-0-9761" ref-type="bibr">44</xref>). For a specific dataset, it is difficult to identify which classification algorithm has the highest performance without performing a comparison. In addition, feature selection methods exhibit a marked influence on the performance of a classification algorithm; a feature subset that improves the performance of one classification algorithm may not improve the performance of a different classification algorithm. To generate a diagnostic model for MM and evaluate its performance, three different algorithms with or without feature selection were applied on the same training dataset and their performances were compared in the current study.</p>
<p>The results indicated that the SSAE and GA&#x002B;SSAE models exhibited the highest overall performance; the accuracy, specificity, F-measure and AUC for both models were 100&#x0025;. Furthermore, feature selection with GA decreased the CPU time required to train the SSAE diagnostic model compared with the baseline SSAE model without GA, and the GA&#x002B;SSAE model required fewer variables to achieve the same performance as SSAE. The ReliefF&#x002B;ELM diagnostic model exhibited the worst performance on the dataset used in the current study, with an accuracy, F-measure and AUC of 65.30, 5.48 and 50.00&#x0025;, respectively. In addition, performing feature selection did not always produce an improved performance: The accuracy of the ReliefF&#x002B;ELM model was 65.30&#x0025;, while the accuracy of ELM without feature selection was 93.00&#x0025;. Similarly, the ReliefF algorithm reduced the performance of the SSAE model.</p>
<p>The results from the current study were compared with a previous study that used the same training dataset (<xref rid="b7-ol-0-0-9761" ref-type="bibr">7</xref>). The previous study compared the performance of three classification methods and identified that the classification accuracies were 96.30, 94.41 and 91.14&#x0025; for PNN, multilayer neural network and learning vector quantization structures, respectively. The highest classification accuracy was obtained using a PNN (with 3 fully connected layers, response surface method) structure (96.30&#x0025;) (<xref rid="b7-ol-0-0-9761" ref-type="bibr">7</xref>,<xref rid="b45-ol-0-0-9761" ref-type="bibr">45</xref>). In comparison, the SSAE and GA&#x002B;SSAE methods in the current study achieved a classification accuracy of 100&#x0025; on the same training dataset.</p>
<p>The current study demonstrates the effectiveness of DL in the diagnosis of cancer. A pathologist&#x0027;s clinical impression and diagnosis of MM is based on contextual factors, whereas a GA&#x002B;SSAE model has the ability to diagnose MM with high accuracy and may augment clinical decision-making. Furthermore, this fast and scalable method may be applied to other clinical datasets for the diagnosis or prediction of other cancer types.</p>
<p>In conclusion, the current study aimed to improve the diagnosis of MM by evaluating three deep learning algorithms, using a dataset containing 97 patients with MM and 227 healthy participants. To avoid over-training and improve the classification accuracy of a diagnostic model, ReliefF and GA feature selection algorithms were implemented to remove irrelevant or weakly relevant features. The current study identified that the GA&#x002B;SSAE algorithm exhibited the highest performance in all evaluation criteria and required the smallest number of variables. The GA&#x002B;SSAE algorithm-based DSS may contribute to the definitive diagnosis of MM. Consequently, the current study may assist pathologists with the diagnosis of MM by providing a system that can achieve optimal diagnostic performance. In addition, it may facilitate the screening of high-risk individuals in regions where asbestos exposure is common.</p>
</sec>
</body>
<back>
<ack>
<title>Acknowledgements</title>
<p>Not applicable.</p>
</ack>
<sec>
<title>Funding</title>
<p>The current study was supported by the Foundation and Frontier Research Project of Chongqing (grant no. cstc2016jcyjA0526), the Science and Technology Research Project of Chongqing Municipal Education Committee (grant no. KJ1600519), the Technology Innovation Project of Social Undertakings and Livelihood Security of Chongqing (grant no. cstc2017shmsA30016) and the Postgraduate Science and Technology Innovation Project of Chongqing (grant nos. CYS17215 and CYS17203).</p>
</sec>
<sec>
<title>Availability of data and materials</title>
<p>The datasets analyzed during the current study are available in the University of California (Irvine, CA, USA) machine learning database, <uri xlink:href="https://archive.ics.uci.edu/ml/datasets/Mesothelioma">https://archive.ics.uci.edu/ml/datasets/Mesothelioma</uri>&#x2019;s&#x002B;disease&#x002B;data&#x002B;set.</p>
</sec>
<sec>
<title>Authors&#x0027; contributions</title>
<p>ZY and XH contributed to the conception of the study. XH contributed to analysis and manuscript preparation. ZY and XH performed the data analyses and wrote the manuscript. ZY and XH helped perform the analysis with constructive discussions.</p>
</sec>
<sec>
<title>Ethics approval and consent to participate</title>
<p>The current study used a dataset obtained from a public database. The data submitters obtained informed consent for publication of the dataset from participants at the point of recruitment to the trial.</p>
</sec>
<sec>
<title>Patient consent for publication</title>
<p>Not applicable.</p>
</sec>
<sec>
<title>Competing interests</title>
<p>The authors declare that they have no competing interests.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="b1-ol-0-0-9761"><label>1</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Remon</surname><given-names>J</given-names></name><name><surname>Lianes</surname><given-names>P</given-names></name><name><surname>Martinez</surname><given-names>S</given-names></name><name><surname>Velasco</surname><given-names>M</given-names></name><name><surname>Querol</surname><given-names>R</given-names></name><name><surname>Zanui</surname><given-names>M</given-names></name></person-group><article-title>Malignant mesothelioma: New insights into a rare disease</article-title><source>Cancer Treat Rev</source><volume>39</volume><fpage>584</fpage><lpage>591</lpage><year>2013</year><pub-id pub-id-type="doi">10.1016/j.ctrv.2012.12.005</pub-id><pub-id pub-id-type="pmid">23276688</pub-id></element-citation></ref>
<ref id="b2-ol-0-0-9761"><label>2</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Remon</surname><given-names>J</given-names></name><name><surname>Reguart</surname><given-names>N</given-names></name><name><surname>Corral</surname><given-names>J</given-names></name><name><surname>Lianes</surname><given-names>P</given-names></name></person-group><article-title>Malignant pleural mesothelioma: New hope in the horizon with novel therapeutic strategies</article-title><source>Cancer Treat Rev</source><volume>41</volume><fpage>27</fpage><lpage>34</lpage><year>2015</year><pub-id pub-id-type="doi">10.1016/j.ctrv.2014.10.007</pub-id><pub-id pub-id-type="pmid">25467107</pub-id></element-citation></ref>
<ref id="b3-ol-0-0-9761"><label>3</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Thellung</surname><given-names>S</given-names></name><name><surname>Favoni</surname><given-names>RE</given-names></name><name><surname>Wurth</surname><given-names>R</given-names></name><name><surname>Nizzari</surname><given-names>M</given-names></name><name><surname>Pattarozzi</surname><given-names>A</given-names></name><name><surname>Daga</surname><given-names>A</given-names></name><name><surname>Florio</surname><given-names>T</given-names></name><name><surname>Barbieri</surname><given-names>F</given-names></name></person-group><article-title>Molecular pharmacology of malignant pleural mesothelioma: Challenges and perspectives from preclinical and clinical studies</article-title><source>Curr Drug Targets</source><volume>17</volume><fpage>824</fpage><lpage>849</lpage><year>2016</year><pub-id pub-id-type="doi">10.2174/1389450116666150804110714</pub-id><pub-id pub-id-type="pmid">26240051</pub-id></element-citation></ref>
<ref id="b4-ol-0-0-9761"><label>4</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Zucali</surname><given-names>PA</given-names></name><name><surname>Ceresoli</surname><given-names>GL</given-names></name><name><surname>De Vincenzo</surname><given-names>F</given-names></name><name><surname>Simonelli</surname><given-names>M</given-names></name><name><surname>Lorenzi</surname><given-names>E</given-names></name><name><surname>Gianoncelli</surname><given-names>L</given-names></name><name><surname>Santoro</surname><given-names>A</given-names></name></person-group><article-title>Advances in the biology of malignant pleural mesothelioma</article-title><source>Cancer Treat Rev</source><volume>37</volume><fpage>543</fpage><lpage>558</lpage><year>2011</year><pub-id pub-id-type="doi">10.1016/j.ctrv.2011.01.001</pub-id><pub-id pub-id-type="pmid">21288646</pub-id></element-citation></ref>
<ref id="b5-ol-0-0-9761"><label>5</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ascoli</surname><given-names>V</given-names></name><name><surname>Murer</surname><given-names>B</given-names></name><name><surname>Nottegar</surname><given-names>A</given-names></name><name><surname>Luchini</surname><given-names>C</given-names></name><name><surname>Carella</surname><given-names>R</given-names></name><name><surname>Calabrese</surname><given-names>F</given-names></name><name><surname>Lunardi</surname><given-names>F</given-names></name><name><surname>Cozzi</surname><given-names>I</given-names></name><name><surname>Righi</surname><given-names>L</given-names></name></person-group><article-title>What&#x0027;s new in mesothelioma</article-title><source>Pathologica</source><volume>110</volume><fpage>12</fpage><lpage>28</lpage><year>2018</year><pub-id pub-id-type="pmid">30259910</pub-id></element-citation></ref>
<ref id="b6-ol-0-0-9761"><label>6</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ascoli</surname><given-names>V</given-names></name></person-group><article-title>Pathologic diagnosis of malignant mesothelioma: Chronological prospect and advent of recommendations and guidelines</article-title><source>Ann Ist Super Sanita</source><volume>51</volume><fpage>52</fpage><lpage>59</lpage><year>2015</year><pub-id pub-id-type="pmid">25857384</pub-id></element-citation></ref>
<ref id="b7-ol-0-0-9761"><label>7</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Er</surname><given-names>O</given-names></name><name><surname>Tanrikulu</surname><given-names>AC</given-names></name><name><surname>Abakay</surname><given-names>A</given-names></name><name><surname>Temurtas</surname><given-names>F</given-names></name></person-group><article-title>An approach based on probabilistic neural network for diagnosis of Mesothelioma&#x0027;s disease</article-title><source>Comput Electr Eng</source><volume>38</volume><fpage>75</fpage><lpage>81</lpage><year>2012</year><pub-id pub-id-type="doi">10.1016/j.compeleceng.2011.09.001</pub-id></element-citation></ref>
<ref id="b8-ol-0-0-9761"><label>8</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Avci</surname><given-names>D</given-names></name><name><surname>Dogantekin</surname><given-names>A</given-names></name></person-group><article-title>An expert diagnosis system for Parkinson disease based on genetic algorithm-wavelet kernel-extreme learning machine</article-title><source>Parkinsons Dis</source><volume>2016</volume><fpage>5264743</fpage><year>2016</year></element-citation></ref>
<ref id="b9-ol-0-0-9761"><label>9</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname><given-names>X</given-names></name><name><surname>Wang</surname><given-names>X</given-names></name><name><surname>Su</surname><given-names>Q</given-names></name><name><surname>Zhang</surname><given-names>M</given-names></name><name><surname>Zhu</surname><given-names>Y</given-names></name><name><surname>Wang</surname><given-names>Q</given-names></name><name><surname>Wang</surname><given-names>Q</given-names></name></person-group><article-title>A hybrid classification system for heart disease diagnosis based on the RFRS method</article-title><source>Comput Math Methods Med</source><volume>2017</volume><fpage>8272091</fpage><year>2017</year><pub-id pub-id-type="doi">10.1155/2017/8272091</pub-id><pub-id pub-id-type="pmid">28127385</pub-id></element-citation></ref>
<ref id="b10-ol-0-0-9761"><label>10</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Chaisaowong</surname><given-names>K</given-names></name><name><surname>Aach</surname><given-names>T</given-names></name><name><surname>Jager</surname><given-names>P</given-names></name><name><surname>Vogel</surname><given-names>S</given-names></name><name><surname>Knepper</surname><given-names>A</given-names></name><name><surname>Kraus</surname><given-names>T</given-names></name></person-group><article-title>Computer-assisted diagnosis for early stage pleural mesothelioma: Towards automated detection and quantitative assessment of pleural thickening from thoracic CT images</article-title><source>Methods Inf Med</source><volume>46</volume><fpage>324</fpage><lpage>331</lpage><year>2007</year><pub-id pub-id-type="doi">10.1160/ME9050</pub-id><pub-id pub-id-type="pmid">17492119</pub-id></element-citation></ref>
<ref id="b11-ol-0-0-9761"><label>11</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname><given-names>M</given-names></name><name><surname>Helm</surname><given-names>E</given-names></name><name><surname>Joshi</surname><given-names>N</given-names></name><name><surname>Gleeson</surname><given-names>F</given-names></name><name><surname>Brady</surname><given-names>M</given-names></name></person-group><article-title>Computer-aided volumetric assessment of malignant pleural mesothelioma on CT using a random walk-based method</article-title><source>Int J Comput Assist Radiol Surg</source><volume>12</volume><fpage>529</fpage><lpage>538</lpage><year>2017</year><pub-id pub-id-type="doi">10.1007/s11548-016-1511-3</pub-id><pub-id pub-id-type="pmid">28028655</pub-id></element-citation></ref>
<ref id="b12-ol-0-0-9761"><label>12</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Armato</surname><given-names>SG</given-names><suffix>III</suffix></name><name><surname>Sensakovic</surname><given-names>WF</given-names></name></person-group><article-title>Automated lung segmentation for thoracic CT impact on computer-aided diagnosis</article-title><source>Acad Radiol</source><volume>11</volume><fpage>1011</fpage><lpage>1021</lpage><year>2004</year><pub-id pub-id-type="doi">10.1016/j.acra.2004.06.005</pub-id><pub-id pub-id-type="pmid">15350582</pub-id></element-citation></ref>
<ref id="b13-ol-0-0-9761"><label>13</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Jiang</surname><given-names>R</given-names></name><name><surname>You</surname><given-names>R</given-names></name><name><surname>Pei</surname><given-names>XQ</given-names></name><name><surname>Zou</surname><given-names>X</given-names></name><name><surname>Zhang</surname><given-names>MX</given-names></name><name><surname>Wang</surname><given-names>TM</given-names></name><name><surname>Sun</surname><given-names>R</given-names></name><name><surname>Luo</surname><given-names>DH</given-names></name><name><surname>Huang</surname><given-names>PY</given-names></name><name><surname>Chen</surname><given-names>QY</given-names></name><etal/></person-group><article-title>Development of a ten-signature classifier using a support vector machine integrated approach to subdivide the M1 stage into M1a and M1b stages of nasopharyngeal carcinoma with synchronous metastases to better predict patients&#x0027; survival</article-title><source>Oncotarget</source><volume>7</volume><fpage>3645</fpage><lpage>3657</lpage><year>2016</year><pub-id pub-id-type="pmid">26636646</pub-id></element-citation></ref>
<ref id="b14-ol-0-0-9761"><label>14</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname><given-names>MW</given-names></name><name><surname>Chen</surname><given-names>CW</given-names></name><name><surname>Lin</surname><given-names>WC</given-names></name><name><surname>Ke</surname><given-names>SW</given-names></name><name><surname>Tsai</surname><given-names>CF</given-names></name></person-group><article-title>SVM and SVM ensembles in breast cancer prediction</article-title><source>PLoS One</source><volume>12</volume><fpage>e0161501</fpage><year>2017</year><pub-id pub-id-type="doi">10.1371/journal.pone.0161501</pub-id><pub-id pub-id-type="pmid">28060807</pub-id></element-citation></ref>
<ref id="b15-ol-0-0-9761"><label>15</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname><given-names>G</given-names></name><name><surname>Huang</surname><given-names>GB</given-names></name><name><surname>Song</surname><given-names>S</given-names></name><name><surname>You</surname><given-names>K</given-names></name></person-group><article-title>Trends in extreme learning machines: A review</article-title><source>Neural Netw</source><volume>61</volume><fpage>32</fpage><lpage>48</lpage><year>2015</year><pub-id pub-id-type="doi">10.1016/j.neunet.2014.10.001</pub-id><pub-id pub-id-type="pmid">25462632</pub-id></element-citation></ref>
<ref id="b16-ol-0-0-9761"><label>16</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ortiz</surname><given-names>A</given-names></name><name><surname>Munilla</surname><given-names>J</given-names></name><name><surname>Gorriz</surname><given-names>JM</given-names></name><name><surname>Ramirez</surname><given-names>J</given-names></name></person-group><article-title>Ensembles of deep learning architectures for the early diagnosis of the Alzheimer&#x0027;s disease</article-title><source>Int J Neural Syst</source><volume>26</volume><fpage>1650025</fpage><year>2016</year><pub-id pub-id-type="doi">10.1142/S0129065716500258</pub-id><pub-id pub-id-type="pmid">27478060</pub-id></element-citation></ref>
<ref id="b17-ol-0-0-9761"><label>17</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Komura</surname><given-names>D</given-names></name><name><surname>Ishikawa</surname><given-names>S</given-names></name></person-group><article-title>Machine learning methods for histopathological image analysis</article-title><source>Comput Struct Biotechnol J</source><volume>16</volume><fpage>34</fpage><lpage>42</lpage><year>2018</year><pub-id pub-id-type="doi">10.1016/j.csbj.2018.01.001</pub-id><pub-id pub-id-type="pmid">30275936</pub-id></element-citation></ref>
<ref id="b18-ol-0-0-9761"><label>18</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>LeCun</surname><given-names>Y</given-names></name><name><surname>Bengio</surname><given-names>Y</given-names></name><name><surname>Hinton</surname><given-names>G</given-names></name></person-group><article-title>Deep learning</article-title><source>Nature</source><volume>521</volume><fpage>436</fpage><lpage>444</lpage><year>2015</year><pub-id pub-id-type="doi">10.1038/nature14539</pub-id><pub-id pub-id-type="pmid">26017442</pub-id></element-citation></ref>
<ref id="b19-ol-0-0-9761"><label>19</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>van Tulder</surname><given-names>G</given-names></name><name><surname>de Bruijne</surname><given-names>M</given-names></name></person-group><article-title>Combining generative and discriminative representation learning for lung CT analysis with convolutional restricted boltzmann machines</article-title><source>IEEE Trans Med Imaging</source><volume>35</volume><fpage>1262</fpage><lpage>1272</lpage><year>2016</year><pub-id pub-id-type="doi">10.1109/TMI.2016.2526687</pub-id></element-citation></ref>
<ref id="b20-ol-0-0-9761"><label>20</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Shin</surname><given-names>HC</given-names></name><name><surname>Orton</surname><given-names>MR</given-names></name><name><surname>Collins</surname><given-names>DJ</given-names></name><name><surname>Doran</surname><given-names>SJ</given-names></name><name><surname>Leach</surname><given-names>MO</given-names></name></person-group><article-title>Stacked autoencoders for unsupervised feature learning and multiple organ detection in a pilot study using 4D patient data</article-title><source>IEEE Trans Pattern Anal Mach Intell</source><volume>35</volume><fpage>1930</fpage><lpage>1943</lpage><year>2013</year><pub-id pub-id-type="doi">10.1109/TPAMI.2012.277</pub-id><pub-id pub-id-type="pmid">23787345</pub-id></element-citation></ref>
<ref id="b21-ol-0-0-9761"><label>21</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Brosch</surname><given-names>T</given-names></name><name><surname>Tam</surname><given-names>R</given-names></name></person-group><article-title>Efficient training of convolutional deep belief networks in the frequency domain for application to high-resolution 2D and 3D images</article-title><source>Neural Comput</source><volume>27</volume><fpage>211</fpage><lpage>227</lpage><year>2015</year><pub-id pub-id-type="doi">10.1162/NECO_a_00682</pub-id><pub-id pub-id-type="pmid">25380341</pub-id></element-citation></ref>
<ref id="b22-ol-0-0-9761"><label>22</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Esteva</surname><given-names>A</given-names></name><name><surname>Kuprel</surname><given-names>B</given-names></name><name><surname>Novoa</surname><given-names>RA</given-names></name><name><surname>Ko</surname><given-names>J</given-names></name><name><surname>Swetter</surname><given-names>SM</given-names></name><name><surname>Blau</surname><given-names>HM</given-names></name><name><surname>Thrun</surname><given-names>S</given-names></name></person-group><article-title>Dermatologist-level classification of skin cancer with deep neural networks</article-title><source>Nature</source><volume>542</volume><fpage>115</fpage><lpage>118</lpage><year>2017</year><pub-id pub-id-type="doi">10.1038/nature21056</pub-id><pub-id pub-id-type="pmid">28117445</pub-id></element-citation></ref>
<ref id="b23-ol-0-0-9761"><label>23</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Grewal</surname><given-names>PS</given-names></name><name><surname>Oloumi</surname><given-names>F</given-names></name><name><surname>Rubin</surname><given-names>U</given-names></name><name><surname>Tennant</surname><given-names>MTS</given-names></name></person-group><article-title>Deep learning in ophthalmology: A review</article-title><source>Can J Ophthalmol</source><volume>53</volume><fpage>309</fpage><lpage>313</lpage><year>2018</year><pub-id pub-id-type="doi">10.1016/j.jcjo.2018.04.019</pub-id></element-citation></ref>
<ref id="b24-ol-0-0-9761"><label>24</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Yan</surname><given-names>Y</given-names></name><name><surname>Tan</surname><given-names>Z</given-names></name><name><surname>Su</surname><given-names>N</given-names></name><name><surname>Zhao</surname><given-names>C</given-names></name></person-group><article-title>Building extraction based on an optimized stacked sparse autoencoder of structure and training samples using LIDAR DSM and optical images</article-title><source>Sensors (Basel)</source><volume>17</volume><fpage>E1957</fpage><year>2017</year><pub-id pub-id-type="doi">10.3390/s17091957</pub-id><pub-id pub-id-type="pmid">28837118</pub-id></element-citation></ref>
<ref id="b25-ol-0-0-9761"><label>25</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Jia</surname><given-names>W</given-names></name><name><surname>Yang</surname><given-names>M</given-names></name><name><surname>Wang</surname><given-names>SH</given-names></name></person-group><article-title>Three-category classification of magnetic resonance hearing loss images based on deep autoencoder</article-title><source>J Med Syst</source><volume>41</volume><fpage>165</fpage><year>2017</year><pub-id pub-id-type="doi">10.1007/s10916-017-0814-4</pub-id><pub-id pub-id-type="pmid">28895033</pub-id></element-citation></ref>
<ref id="b26-ol-0-0-9761"><label>26</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Deng</surname><given-names>L</given-names></name><name><surname>Fan</surname><given-names>C</given-names></name><name><surname>Zeng</surname><given-names>Z</given-names></name></person-group><article-title>A sparse autoencoder-based deep neural network for protein solvent accessibility and contact number prediction</article-title><source>BMC Bioinformatics</source><volume>18</volume><fpage>569</fpage><year>2017</year><pub-id pub-id-type="doi">10.1186/s12859-017-1971-7</pub-id><pub-id pub-id-type="pmid">29297299</pub-id></element-citation></ref>
<ref id="b27-ol-0-0-9761"><label>27</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname><given-names>YB</given-names></name><name><surname>You</surname><given-names>ZH</given-names></name><name><surname>Li</surname><given-names>X</given-names></name><name><surname>Jiang</surname><given-names>TH</given-names></name><name><surname>Chen</surname><given-names>X</given-names></name><name><surname>Zhou</surname><given-names>X</given-names></name><name><surname>Wang</surname><given-names>L</given-names></name></person-group><article-title>Predicting protein-protein interactions from protein sequences by a stacked sparse autoencoder deep neural network</article-title><source>Mol BioSyst</source><volume>13</volume><fpage>1336</fpage><lpage>1344</lpage><year>2017</year><pub-id pub-id-type="doi">10.1039/C7MB00188F</pub-id><pub-id pub-id-type="pmid">28604872</pub-id></element-citation></ref>
<ref id="b28-ol-0-0-9761"><label>28</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Zhao</surname><given-names>W</given-names></name><name><surname>Meng</surname><given-names>QH</given-names></name><name><surname>Zeng</surname><given-names>M</given-names></name><name><surname>Qi</surname><given-names>PF</given-names></name></person-group><article-title>Stacked sparse auto-encoders (SSAE) based electronic nose for Chinese liquors classification</article-title><source>Sensors (Basel)</source><volume>17</volume><fpage>E2855</fpage><year>2017</year><pub-id pub-id-type="doi">10.3390/s17122855</pub-id><pub-id pub-id-type="pmid">29292772</pub-id></element-citation></ref>
<ref id="b29-ol-0-0-9761"><label>29</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Podolsky</surname><given-names>MD</given-names></name><name><surname>Barchuk</surname><given-names>AA</given-names></name><name><surname>Kuznetcov</surname><given-names>VI</given-names></name><name><surname>Gusarova</surname><given-names>NF</given-names></name><name><surname>Gaidukov</surname><given-names>VS</given-names></name><name><surname>Tarakanov</surname><given-names>SA</given-names></name></person-group><article-title>Evaluation of machine learning algorithm utilization for lung cancer classification based on gene expression levels</article-title><source>Asian Pac J Cancer Prev</source><volume>17</volume><fpage>835</fpage><lpage>838</lpage><year>2016</year><pub-id pub-id-type="doi">10.7314/APJCP.2016.17.2.835</pub-id></element-citation></ref>
<ref id="b30-ol-0-0-9761"><label>30</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname><given-names>L</given-names></name><name><surname>Wang</surname><given-names>Y</given-names></name><name><surname>Chang</surname><given-names>Q</given-names></name></person-group><article-title>Feature selection methods for big data bioinformatics: A survey from the search perspective</article-title><source>Methods</source><volume>111</volume><fpage>21</fpage><lpage>31</lpage><year>2016</year><pub-id pub-id-type="doi">10.1016/j.ymeth.2016.08.014</pub-id><pub-id pub-id-type="pmid">27592382</pub-id></element-citation></ref>
<ref id="b31-ol-0-0-9761"><label>31</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname><given-names>L</given-names></name><name><surname>Hao</surname><given-names>Y</given-names></name></person-group><article-title>Feature extraction and classification of EHG between pregnancy and labour group using Hilbert-Huang transform and extreme learning machine</article-title><source>Comput Math Methods Med</source><volume>2017</volume><fpage>7949507</fpage><year>2017</year><pub-id pub-id-type="doi">10.1155/2017/7949507</pub-id><pub-id pub-id-type="pmid">28316639</pub-id></element-citation></ref>
<ref id="b32-ol-0-0-9761"><label>32</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Urbanowicz</surname><given-names>RJ</given-names></name><name><surname>Meeker</surname><given-names>M</given-names></name><name><surname>La Cava</surname><given-names>W</given-names></name><name><surname>Olson</surname><given-names>RS</given-names></name><name><surname>Moore</surname><given-names>JH</given-names></name></person-group><article-title>Relief-based feature selection: Introduction and review</article-title><source>J Biomed Inform</source><volume>85</volume><fpage>189</fpage><lpage>203</lpage><year>2018</year><pub-id pub-id-type="doi">10.1016/j.jbi.2018.07.014</pub-id><pub-id pub-id-type="pmid">30031057</pub-id></element-citation></ref>
<ref id="b33-ol-0-0-9761"><label>33</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Urbanowicz</surname><given-names>RJ</given-names></name><name><surname>Olson</surname><given-names>RS</given-names></name><name><surname>Schmitt</surname><given-names>P</given-names></name><name><surname>Meeker</surname><given-names>M</given-names></name><name><surname>Moore</surname><given-names>JH</given-names></name></person-group><article-title>Benchmarking relief-based feature selection methods for bioinformatics data mining</article-title><source>J Biomed Inform</source><volume>85</volume><fpage>168</fpage><lpage>188</lpage><year>2018</year><pub-id pub-id-type="doi">10.1016/j.jbi.2018.07.015</pub-id><pub-id pub-id-type="pmid">30030120</pub-id></element-citation></ref>
<ref id="b34-ol-0-0-9761"><label>34</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Dai</surname><given-names>Q</given-names></name><name><surname>Cheng</surname><given-names>JH</given-names></name><name><surname>Sun</surname><given-names>DW</given-names></name><name><surname>Zeng</surname><given-names>XA</given-names></name></person-group><article-title>Advances in feature selection methods for hyperspectral image processing in food industry applications: A review</article-title><source>Crit Rev Food Sci Nutr</source><volume>55</volume><fpage>1368</fpage><lpage>1382</lpage><year>2015</year><pub-id pub-id-type="doi">10.1080/10408398.2013.871692</pub-id><pub-id pub-id-type="pmid">24689555</pub-id></element-citation></ref>
<ref id="b35-ol-0-0-9761"><label>35</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Cao</surname><given-names>J</given-names></name><name><surname>Cui</surname><given-names>H</given-names></name><name><surname>Shi</surname><given-names>H</given-names></name><name><surname>Jiao</surname><given-names>L</given-names></name></person-group><article-title>Big data: A parallel particle swarm optimization-back-propagation neural network algorithm based on mapreduce</article-title><source>PLoS One</source><volume>11</volume><fpage>e0157551</fpage><year>2016</year><pub-id pub-id-type="doi">10.1371/journal.pone.0157551</pub-id><pub-id pub-id-type="pmid">27304987</pub-id></element-citation></ref>
<ref id="b36-ol-0-0-9761"><label>36</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Song</surname><given-names>Y</given-names></name><name><surname>Zhang</surname><given-names>J</given-names></name></person-group><article-title>Automatic recognition of epileptic EEG patterns via extreme learning machine and multiresolution feature extraction</article-title><source>Expert Syst Appl</source><volume>40</volume><fpage>5477</fpage><lpage>5489</lpage><year>2013</year><pub-id pub-id-type="doi">10.1016/j.eswa.2013.04.025</pub-id></element-citation></ref>
<ref id="b37-ol-0-0-9761"><label>37</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Song</surname><given-names>Y</given-names></name><name><surname>Crowcroft</surname><given-names>J</given-names></name><name><surname>Zhang</surname><given-names>J</given-names></name></person-group><article-title>Automatic epileptic seizure detection in EEGs based on optimized sample entropy and extreme learning machine</article-title><source>J Neurosci Methods</source><volume>210</volume><fpage>132</fpage><lpage>146</lpage><year>2012</year><pub-id pub-id-type="doi">10.1016/j.jneumeth.2012.07.003</pub-id><pub-id pub-id-type="pmid">22824535</pub-id></element-citation></ref>
<ref id="b38-ol-0-0-9761"><label>38</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname><given-names>J</given-names></name><name><surname>Xiang</surname><given-names>L</given-names></name><name><surname>Liu</surname><given-names>Q</given-names></name><name><surname>Gilmore</surname><given-names>H</given-names></name><name><surname>Wu</surname><given-names>J</given-names></name><name><surname>Tang</surname><given-names>J</given-names></name><name><surname>Madabhushi</surname><given-names>A</given-names></name></person-group><article-title>Stacked sparse autoencoder (SSAE) for nuclei detection on breast cancer histopathology images</article-title><source>IEEE Trans Med Imaging</source><volume>35</volume><fpage>119</fpage><lpage>130</lpage><year>2016</year><pub-id pub-id-type="doi">10.1109/TMI.2015.2458702</pub-id><pub-id pub-id-type="pmid">26208307</pub-id></element-citation></ref>
<ref id="b39-ol-0-0-9761"><label>39</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>L&#xE4;ngkvist</surname><given-names>M</given-names></name><name><surname>Loutfi</surname><given-names>A</given-names></name></person-group><article-title>Learning feature representations with a cost-relevant sparse autoencoder</article-title><source>Int J Neural Syst</source><volume>25</volume><fpage>1450034</fpage><year>2015</year><pub-id pub-id-type="doi">10.1142/S0129065714500348</pub-id><pub-id pub-id-type="pmid">25515941</pub-id></element-citation></ref>
<ref id="b40-ol-0-0-9761"><label>40</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Tsinalis</surname><given-names>O</given-names></name><name><surname>Matthews</surname><given-names>PM</given-names></name><name><surname>Guo</surname><given-names>Y</given-names></name></person-group><article-title>Automatic sleep stage scoring using time-frequency analysis and stacked sparse autoencoders</article-title><source>Ann Biomed Eng</source><volume>44</volume><fpage>1587</fpage><lpage>1597</lpage><year>2016</year><pub-id pub-id-type="doi">10.1007/s10439-015-1444-y</pub-id><pub-id pub-id-type="pmid">26464268</pub-id></element-citation></ref>
<ref id="b41-ol-0-0-9761"><label>41</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname><given-names>LL</given-names></name><name><surname>Hao</surname><given-names>YR</given-names></name></person-group><article-title>Discriminating pregnancy and labour in electrohysterogram by sample entropy and support vector machine</article-title><source>J Med Imaging Health Inform</source><volume>7</volume><fpage>584</fpage><lpage>591</lpage><year>2017</year><pub-id pub-id-type="doi">10.1166/jmihi.2017.2065</pub-id></element-citation></ref>
<ref id="b42-ol-0-0-9761"><label>42</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Janowczyk</surname><given-names>A</given-names></name><name><surname>Basavanhally</surname><given-names>A</given-names></name><name><surname>Madabhushi</surname><given-names>A</given-names></name></person-group><article-title>Stain normalization using sparse auto encoders (StaNoSA): Application to digital pathology</article-title><source>Comput Med Imaging Graph</source><volume>57</volume><fpage>50</fpage><lpage>61</lpage><year>2017</year><pub-id pub-id-type="doi">10.1016/j.compmedimag.2016.05.003</pub-id><pub-id pub-id-type="pmid">27373749</pub-id></element-citation></ref>
<ref id="b43-ol-0-0-9761"><label>43</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Su</surname><given-names>H</given-names></name><name><surname>Xing</surname><given-names>F</given-names></name><name><surname>Kong</surname><given-names>X</given-names></name><name><surname>Xie</surname><given-names>Y</given-names></name><name><surname>Zhang</surname><given-names>S</given-names></name><name><surname>Yang</surname><given-names>L</given-names></name></person-group><article-title>Robust cell detection and segmentation in histopathological images using sparse reconstruction and stacked denoising auto encoders</article-title><source>Med Image Comput Comput Assist Interv</source><volume>9351</volume><fpage>383</fpage><lpage>390</lpage><year>2015</year><pub-id pub-id-type="pmid">27796013</pub-id></element-citation></ref>
<ref id="b44-ol-0-0-9761"><label>44</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Paydar</surname><given-names>K</given-names></name><name><surname>Niakan Kalhori</surname><given-names>SR</given-names></name><name><surname>Akbarian</surname><given-names>M</given-names></name><name><surname>Sheikhtaheri</surname><given-names>A</given-names></name></person-group><article-title>A clinical decision support system for prediction of pregnancy outcome in pregnant women with systemic lupus erythematosus</article-title><source>Int J Med Inform</source><volume>97</volume><fpage>239</fpage><lpage>246</lpage><year>2017</year><pub-id pub-id-type="doi">10.1016/j.ijmedinf.2016.10.018</pub-id><pub-id pub-id-type="pmid">27919382</pub-id></element-citation></ref>
<ref id="b45-ol-0-0-9761"><label>45</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Somu</surname><given-names>N</given-names></name><name><surname>M R</surname><given-names>RG</given-names></name><name><surname>V</surname><given-names>K</given-names></name><name><surname>Kirthivasan</surname><given-names>K</given-names></name><name><surname>V S</surname><given-names>SS</given-names></name></person-group><article-title>An improved robust heteroscedastic probabilistic neural network based trust prediction approach for cloud service selection</article-title><source>Neural Netw</source><volume>108</volume><fpage>339</fpage><lpage>354</lpage><year>2018</year><pub-id pub-id-type="doi">10.1016/j.neunet.2018.08.005</pub-id><pub-id pub-id-type="pmid">30245433</pub-id></element-citation></ref>
</ref-list>
</back>
<floats-group>
<fig id="f1-ol-0-0-9761" position="float">
<label>Figure 1.</label>
<caption><p>Architecture of a basic autoencoder. V (1,2&#x2026;n), entire training data; X (1,2&#x2026;n), set of training data; b<sub>h</sub>, bias of hidden layer; b<sub>x</sub>, bias of input layer; h (1,2&#x2026;m), feature vector at hidden layer; x&#x02C6;, reconstruction of input x.</p></caption>
<graphic xlink:href="ol-17-02-1483-g04.tif"/>
</fig>
<fig id="f2-ol-0-0-9761" position="float">
<label>Figure 2.</label>
<caption><p>Architecture of a sparse autoencoder. V (1,2&#x2026;n), entire training data; X (1,2&#x2026;n), set of training data; b<sub>h</sub>, bias of hidden layer; b<sub>x</sub>, bias of input layer; h<sup>(l)</sup> (1,2&#x2026;m), feature vector at layer l; h<sup>(2)</sup> (1,2&#x2026;m), feature vector at layer 2.</p></caption>
<graphic xlink:href="ol-17-02-1483-g05.tif"/>
</fig>
<fig id="f3-ol-0-0-9761" position="float">
<label>Figure 3.</label>
<caption><p>Comparison of the accuracy achieved by various diagnostic models. BP, backpropagation; GA, genetic algorithm; ELM, extreme learning machine; SSAE, stacked sparse autoencoder.</p></caption>
<graphic xlink:href="ol-17-02-1483-g06.tif"/>
</fig>
<fig id="f4-ol-0-0-9761" position="float">
<label>Figure 4.</label>
<caption><p>Comparison of the specificity achieved by various diagnostic models. BP, backpropagation; GA, genetic algorithm; ELM, extreme learning machine; SSAE, stacked sparse autoencoder.</p></caption>
<graphic xlink:href="ol-17-02-1483-g07.tif"/>
</fig>
<fig id="f5-ol-0-0-9761" position="float">
<label>Figure 5.</label>
<caption><p>Comparison of the F-measure for various diagnostic models. BP, backpropagation; GA, genetic algorithm; ELM, extreme learning machine; SSAE, stacked sparse autoencoder.</p></caption>
<graphic xlink:href="ol-17-02-1483-g08.tif"/>
</fig>
<fig id="f6-ol-0-0-9761" position="float">
<label>Figure 6.</label>
<caption><p>ROC curves to assess the capacity of the diagnostic models to distinguish between patients with malignant mesothelioma and healthy participants. ROC, receiver operating characteristic; BP, backpropagation; GA, genetic algorithm; ELM, extreme learning machine; SSAE, stacked sparse autoencoder.</p></caption>
<graphic xlink:href="ol-17-02-1483-g09.tif"/>
</fig>
<fig id="f7-ol-0-0-9761" position="float">
<label>Figure 7.</label>
<caption><p>Comparison of the area under the receiver operating characteristic curves for various diagnostic models. AUC, area under the curve; BP, backpropagation; GA, genetic algorithm; ELM, extreme learning machine; SSAE, stacked sparse autoencoder.</p></caption>
<graphic xlink:href="ol-17-02-1483-g10.tif"/>
</fig>
<table-wrap id="tI-ol-0-0-9761" position="float">
<label>Table I.</label>
<caption><p>Variables for diagnosing malignant mesothelioma.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="bottom">Variable</th>
<th align="center" valign="bottom">Value type</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">Age</td>
<td align="left" valign="top">Quantitative</td>
</tr>
<tr>
<td align="left" valign="top">Sex</td>
<td align="left" valign="top">Qualitative</td>
</tr>
<tr>
<td align="left" valign="top">City</td>
<td align="left" valign="top">Qualitative</td>
</tr>
<tr>
<td align="left" valign="top">Asbestos exposure</td>
<td align="left" valign="top">Qualitative</td>
</tr>
<tr>
<td align="left" valign="top">Type of malignant mesothelioma</td>
<td align="left" valign="top">Qualitative</td>
</tr>
<tr>
<td align="left" valign="top">Duration of asbestos exposure</td>
<td align="left" valign="top">Quantitative</td>
</tr>
<tr>
<td align="left" valign="top">Diagnosis method</td>
<td align="left" valign="top">Qualitative</td>
</tr>
<tr>
<td align="left" valign="top">Keep side</td>
<td align="left" valign="top">Qualitative</td>
</tr>
<tr>
<td align="left" valign="top">Cytology</td>
<td align="left" valign="top">Qualitative</td>
</tr>
<tr>
<td align="left" valign="top">Duration of symptoms</td>
<td align="left" valign="top">Quantitative</td>
</tr>
<tr>
<td align="left" valign="top">Dyspnoea</td>
<td align="left" valign="top">Qualitative</td>
</tr>
<tr>
<td align="left" valign="top">Ache on chest</td>
<td align="left" valign="top">Qualitative</td>
</tr>
<tr>
<td align="left" valign="top">Weakness</td>
<td align="left" valign="top">Qualitative</td>
</tr>
<tr>
<td align="left" valign="top">Smoker</td>
<td align="left" valign="top">Qualitative</td>
</tr>
<tr>
<td align="left" valign="top">Performance status</td>
<td align="left" valign="top">Qualitative</td>
</tr>
<tr>
<td align="left" valign="top">White blood cell count</td>
<td align="left" valign="top">Quantitative</td>
</tr>
<tr>
<td align="left" valign="top">Hemoglobin</td>
<td align="left" valign="top">Quantitative</td>
</tr>
<tr>
<td align="left" valign="top">Platelet count</td>
<td align="left" valign="top">Quantitative</td>
</tr>
<tr>
<td align="left" valign="top">Sedimentation rate</td>
<td align="left" valign="top">Quantitative</td>
</tr>
<tr>
<td align="left" valign="top">Blood lactate dehydrogenase</td>
<td align="left" valign="top">Quantitative</td>
</tr>
<tr>
<td align="left" valign="top">Alkaline phosphatase</td>
<td align="left" valign="top">Quantitative</td>
</tr>
<tr>
<td align="left" valign="top">Total protein</td>
<td align="left" valign="top">Quantitative</td>
</tr>
<tr>
<td align="left" valign="top">Albumin</td>
<td align="left" valign="top">Quantitative</td>
</tr>
<tr>
<td align="left" valign="top">Glucose</td>
<td align="left" valign="top">Quantitative</td>
</tr>
<tr>
<td align="left" valign="top">Pleural lactate dehydrogenase</td>
<td align="left" valign="top">Quantitative</td>
</tr>
<tr>
<td align="left" valign="top">Pleural protein</td>
<td align="left" valign="top">Quantitative</td>
</tr>
<tr>
<td align="left" valign="top">Pleural albumin</td>
<td align="left" valign="top">Quantitative</td>
</tr>
<tr>
<td align="left" valign="top">Pleural glucose</td>
<td align="left" valign="top">Quantitative</td>
</tr>
<tr>
<td align="left" valign="top">Mortality</td>
<td align="left" valign="top">Qualitative</td>
</tr>
<tr>
<td align="left" valign="top">Pleural effusion</td>
<td align="left" valign="top">Qualitative</td>
</tr>
<tr>
<td align="left" valign="top">Pleural thickness on tomography</td>
<td align="left" valign="top">Qualitative</td>
</tr>
<tr>
<td align="left" valign="top">Pleural pH</td>
<td align="left" valign="top">Qualitative</td>
</tr>
<tr>
<td align="left" valign="top">Cell count</td>
<td align="left" valign="top">Quantitative</td>
</tr>
<tr>
<td align="left" valign="top">C-reactive protein</td>
<td align="left" valign="top">Quantitative</td>
</tr>
</tbody>
</table>
</table-wrap>
</floats-group>
</article>
