Cloud computing with Machine Learning could help us in the early diagnosis of breast cancer
- Junaid Ahmad Bhat, Prof. Vinai George and Dr. Bilal Malik
Abstract— The purpose of this study is to develop tools that could help clinicians in primary care hospitals with the early diagnosis of breast cancer. Breast cancer is one of the leading forms of cancer in developing countries and is often detected at later stages. Late detection not only causes pain and agony to patients but also places a heavy financial burden on caregivers. In this work, we present the preliminary results of a project code-named BCDM (Breast Cancer Diagnosis using Machine Learning), developed using Matlab. The algorithm developed in this work is based on adaptive resonance theory. (Explain the results of this work here ……..). The eventual aim of the project is to run the algorithm on a cloud computer so that a clinician at a primary healthcare centre can use the system for the early diagnosis of patients through a web-based interface from anywhere in the world.
Keywords— Adaptive Resonance theory, Breast Cancer Diagnosis, FNA
I. Introduction
Breast cancer is one of the most common cancers and ranks second in the world after lung cancer (1). It also ranks second in northern India (1) and is one of the leading cancers found in Kashmir (1). The main goal in the diagnosis of breast cancer is to classify cells as malignant or benign; misclassification can cause pain to patients and place an extra burden on health care providers. Because of noise in the data, the classification problem is non-trivial and has therefore attracted machine learning researchers seeking to improve it (2). Researchers have used different machine learning algorithms to improve the diagnosis of breast cancer (3), and neural networks are among the most widely used.
To achieve accurate predictions, this work uses adaptive resonance theory, one of the variants of neural networks. Neural networks gained importance from the 1950s until the late 1960s because of their accuracy and learning capabilities, but interest diminished in the 1980s because of their computational cost. With advances in technology (4), neural networks are becoming popular again because of their ability to learn “non-linear hypotheses” even when the input feature set is large (4). This work proposes to use a variant of neural networks based on adaptive resonance theory to improve breast cancer diagnosis. The algorithm has been developed and tested in Matlab 2012. Adaptive resonance theory has been tested on many real-life problems, including automated automobile control, classification, and the detection of intruders on the battlefield.
II. Adaptive Resonance Theory (ART)
Adaptive Resonance Theory (ART) is a neural network architecture that generates suitable weights (parameters) by clustering the pattern space. The motivation for adopting ART instead of a conventional neural network is to solve the stability-plasticity problem (5). ART networks and algorithms retain the plasticity to learn new patterns while preventing the modification of patterns learned earlier.
A stable network does not send a pattern back to a previous cluster. ART accepts an input vector and classifies it into one of the clusters according to which stored pattern it most resembles. If the input does not match any existing category, a new category is created by storing that pattern. When a stored pattern is found that matches the input vector within a specified tolerance, that pattern is adjusted to make it look more like the input vector. A stored pattern is not modified if it does not match the current input pattern within the vigilance parameter. In this way the problems associated with stability and plasticity can be resolved (5).
Figure 4: ART1 Neural Network Architecture
A. Types of Adaptive Resonance Theory
1) Adaptive Resonance Theory 1
ART1 is the first neural network of adaptive resonance theory. It consists of two layers that cluster patterns from binary input vectors; that is, it accepts input in the form of binary values (6).
2) Adaptive Resonance Theory 2
ART2 is the second type of neural network of adaptive resonance theory. It is more complex than the ART1 network and accepts continuous-valued input vectors. The added complexity arises because ART2 combines normalization and noise inhibition, in addition to comparing the weights needed for the reset mechanism (6).
B. Working of ART 1 Neural Network
The ART1 neural network comprises three layers, each of which has its own role to play:
1) Input layer
2) Interface layer
3) Cluster layer
The parameters used in the algorithm are as follows:
- Num = number of symptoms
- M = clusters, {benign, malignant}
- $b_{ij}$ = bottom-up weights
- $t_{ji}$ = top-down weights
- $p$ = vigilance parameter
- $s$ = binary form of the input symptoms
- $x$ = activation vector for the interface layer
- $\|x\|$ = norm of $x$, i.e. the sum of the components of $x$
- Step 1:
- Initialize the parameters
$L > 1$ and $0 < p \le 1$
Initialize the weights
$0 < b_{ij}(0) < \dfrac{L}{L - 1 + Num}$, $t_{ji}(0) = 1$
- Step 2:
- While the stopping condition is false, perform Steps 3 to 14
- Step 3:
- For each training input, perform Steps 4 to 13
- Step 4:
- Set the activations of all F2 units to 0
- Set the activations of the F1(a) units to the binary form of the symptoms vector
- Step 5:
- Compute the sum of the symptoms
$\|s\| = \sum_i s_i$
- Step 6:
- Send the symptom vector from the input layer to the interface layer
$x_i = s_i$
- Step 7:
- For each cluster node that is not inhibited, compute its activation
If $y_j \ne -1$ then $y_j = \sum_i b_{ij} x_i$
- Step 8:
- While reset is true, perform Steps 9 to 12
- Step 9:
- Find $J$ such that $y_J \ge y_j$ for all nodes $j$
If $y_J = -1$ then
all the nodes are inhibited and the pattern cannot be clustered
- Step 10:
- Recompute the activation vector $x$ of the interface units
$x_i = s_i \, t_{Ji}$
- Step 11:
- Compute the sum of the components of vector $x$
$\|x\| = \sum_i x_i$
- Step 12:
- Test for the reset condition
if $\|x\| / \|s\| < p$ (vigilance parameter), then
$y_J = -1$ (inhibit node $J$)
and return to Step 8
if $\|x\| / \|s\| \ge p$, then move to the next step
- Step 13:
- Update the bottom-up and top-down weights of node $J$ as:
$b_{iJ}(\text{new}) = \dfrac{L \, x_i}{L - 1 + \|x\|}$
and $t_{Ji}(\text{new}) = x_i$
- Step 14:
- Test for the stopping condition: stop when the weights no longer change, that is, when
$b_{ij}(\text{new}) = b_{ij}(\text{previous})$ and $t_{ji}(\text{new}) = t_{ji}(\text{previous})$ for all $i$ and $j$
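The steps above can be sketched in Python as follows. This is only an illustrative sketch, not the Matlab code used in this work: the function name `train_art1`, the arguments `max_clusters` and `max_epochs`, the choice of half the upper bound as the initial bottom-up weight, and the handling of an all-zero input are all assumptions made for the example.

```python
def train_art1(samples, max_clusters, vigilance=0.5, L=2.0, max_epochs=100):
    """Cluster binary vectors with ART1; returns (assignments, top_down_weights)."""
    n = len(samples[0])
    # Step 1: bottom-up weights start below L / (L - 1 + n); top-down weights start at 1.
    b0 = 0.5 * L / (L - 1.0 + n)          # assumption: half of the upper bound
    b = [[b0] * n for _ in range(max_clusters)]   # b[j][i]
    t = [[1] * n for _ in range(max_clusters)]    # t[j][i]
    assignments = []
    for epoch in range(max_epochs):                       # Step 2
        old_b = [row[:] for row in b]
        old_t = [row[:] for row in t]
        assignments = []
        for s in samples:                                 # Step 3
            norm_s = sum(s)                               # Step 5
            winner = None
            if norm_s > 0:   # assumption: an all-zero input is left unclustered
                x = list(s)                               # Step 6
                # Steps 4 and 7: fresh activations for every cluster node
                y = [sum(b[j][i] * x[i] for i in range(n))
                     for j in range(max_clusters)]
                for attempt in range(max_clusters):       # Step 8
                    J = max(range(max_clusters), key=lambda j: y[j])  # Step 9
                    if y[J] == -1:
                        break                             # every node is inhibited
                    x = [s[i] * t[J][i] for i in range(n)]            # Step 10
                    norm_x = sum(x)                       # Step 11
                    if norm_x / norm_s >= vigilance:      # Step 12: no reset
                        # Step 13: update the weights of the winning node
                        b[J] = [L * xi / (L - 1.0 + norm_x) for xi in x]
                        t[J] = x
                        winner = J
                        break
                    y[J] = -1                             # Step 12: reset, inhibit J
            assignments.append(winner)
        if b == old_b and t == old_t:                     # Step 14
            break
    return assignments, t


patterns = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
clusters, prototypes = train_art1(patterns, max_clusters=2, vigilance=0.7)
```

In this toy run the first two patterns share one cluster and the last two share the other; the returned top-down weights act as the stored prototype of each cluster.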
III. Classifying Breast Cell
The data set for this research was taken from Mangasarian and Wolberg. It was obtained using the Fine Needle Aspirate (FNA) approach (7) and is publicly available in the UCI repository (7). It contains 699 patient samples in two classes: 458 benign cases and 241 malignant cases.
The following are the attributes of the database:
- Sample Code Number
- Clump Thickness
- Uniformity of Cell Size
- Uniformity of Cell Shape
- Marginal Adhesion
- Single Epithelial Cell Size
- Bare Nuclei
- Bland Chromatin
- Normal Nucleoli
- Mitoses
- Class
We have used this data in its original form. The dataset is available in the UC Irvine Machine Learning Repository (7).
IV. Experiment
Our experiment consists of four modules, some of which are further subdivided, that work in the sequence shown in Figure 5 below.
Figure 5: Modules of the Algorithm
A. Modules of the Experiment
1) Pre processing
Not all the features in our dataset take part in the classification process, so we remove the patient ID feature (the sample code number). We are then left with ten attributes, and we separate the feature set from the class values as Xij and Yi.
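As a rough illustration of this preprocessing step, the sketch below drops the first column and splits the remainder into features and class. The function name `split_features_and_class` and the two rows are made up for the example, and the column order is assumed to follow the attribute list given earlier.

```python
def split_features_and_class(rows):
    """Drop the leading sample code number, then split features from the class."""
    X, y = [], []
    for row in rows:
        values = row[1:]        # drop the sample code number (first column)
        X.append(values[:-1])   # the nine remaining features
        y.append(values[-1])    # the class value (last column)
    return X, y


# Illustrative rows only: id, nine feature values, class value.
rows = [
    [1, 5, 1, 1, 1, 2, 1, 3, 1, 1, 2],
    [2, 8, 10, 10, 8, 7, 10, 9, 7, 1, 4],
]
X, y = split_features_and_class(rows)
```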
a) Data Normalization
After the preprocessing stage, Xij (the nine feature vectors) is normalized using this equation:
$$\text{New\_val} = \frac{\text{current\_val} - \text{Min\_val}}{\text{Max\_val} - \text{Min\_val}}$$
where
New_val = new value after scaling
current_val = current value of the feature vector
Max_val = maximum value of each feature vector
Min_val = minimum value of each feature vector
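The scaling can be sketched in a few lines of Python. The function name `min_max_normalize` is hypothetical, and mapping a constant column (where the maximum equals the minimum) to 0.0 is an assumption added here to avoid division by zero; the text does not discuss that case.

```python
def min_max_normalize(X):
    """Scale every feature column of X to the range [0, 1]."""
    columns = list(zip(*X))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for row in X:
        normalized.append([
            # assumption: a constant column is mapped to 0.0
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ])
    return normalized


scaled = min_max_normalize([[1, 10], [5, 10], [10, 10]])
```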
b) Data Conversion
The new values (New_val) obtained from the previous step are truncated and converted into binary format. Grouping is done on the basis of range: values falling in the range 0 to 5 are assigned ‘0’, whereas values in the range 5 to 10 are assigned ‘1’. Each sample is then given as input to the ART1 network for training and testing.
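A minimal sketch of the conversion is given below. The text does not say which group a value of exactly 5 belongs to, so assigning it to ‘1’ is an assumption of this example, as are the function name `binarize` and the `threshold` argument; the values are taken to be on the 0 to 10 scale on which the ranges are stated.

```python
def binarize(values, threshold=5):
    """Map each value to 0 (below the threshold) or 1 (at or above it)."""
    # assumption: a value equal to the threshold is assigned 1
    return [1 if int(v) >= threshold else 0 for v in values]   # int() truncates


bits = binarize([1, 5, 10, 4.9])
```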
2) Recognition Stage
Initially, all components of the input vector are set to zero because no sample has been applied to the input layer. This sets the other two layers to zero, thereby disabling all the neurons and producing zero output. Since all neurons start in the same state, every neuron has an equal chance to win. The input vector is then applied to the recognition layer, where each neuron computes the dot product between the input vector and its weight vector. The neuron with the greatest dot product has the weights that best match the input vector, and it inhibits the outputs of all the other neurons in that layer. The recognition layer thus stores the patterns in the form of weights associated with its neurons, one for each class.
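The winner-take-all selection described in this stage can be illustrated as follows. The function name `recognize` and the weight values are hypothetical and chosen only to show the dot-product comparison.

```python
def recognize(x, weights):
    """Return the index of the neuron with the greatest dot product, and all dot products."""
    activations = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
    winner = max(range(len(weights)), key=lambda j: activations[j])
    return winner, activations


# Two recognition neurons, each holding one stored weight vector.
winner, activations = recognize([1, 1, 0], [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]])
```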
3) Comparison Stage
The neuron that fires in the recognition layer passes a one back to the comparison layer along with its output signal. The comparison neurons that fire are those that simultaneously receive a one from the input feature vector and from the comparison-layer excitation vector. If there is a mismatch between the two, only a few neurons in the comparison layer fire. This means that the pattern P being fed back is not the one sought and that the neuron firing in the recognition layer should be inhibited. The symptoms vector is then compared with the inner-layer vector; if the resulting value is less than the vigilance parameter, the network triggers a reset, which sets the firing neuron in the recognition layer to zero and disables it for the current classification.
4) Search Stage
The classification process finishes if no reset signal is generated. Otherwise, the other stored patterns are searched to find the correct match. This process continues until either all the stored patterns have been tried or all recognition neurons are inhibited.
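The comparison and search stages together amount to a loop that tries the strongest candidate, tests it against the vigilance parameter, and inhibits it on a reset. The sketch below illustrates that loop for a single binary input; the function name `search`, the weight values, and the use of `None` for "no match found" are assumptions of the example.

```python
def search(s, bottom_up, top_down, vigilance):
    """Return the index of the first stored pattern that passes the vigilance test, or None."""
    norm_s = sum(s)
    if norm_s == 0:
        return None                      # assumption: an all-zero input cannot be matched
    inhibited = set()
    while len(inhibited) < len(bottom_up):
        candidates = [j for j in range(len(bottom_up)) if j not in inhibited]
        # Recognition: strongest dot product among neurons that are still enabled
        J = max(candidates,
                key=lambda j: sum(b * s_i for b, s_i in zip(bottom_up[j], s)))
        # Comparison: keep only the ones shared by the input and the stored pattern
        x = [s_i * t_i for s_i, t_i in zip(s, top_down[J])]
        if sum(x) / norm_s >= vigilance:
            return J                     # no reset: classification finishes
        inhibited.add(J)                 # reset: disable J and keep searching
    return None


bottom_up = [[1.0, 0.0, 0.0, 0.0], [0.4, 0.4, 0.4, 0.4]]
top_down = [[1, 0, 0, 0], [1, 1, 1, 1]]
match = search([1, 1, 0, 0], bottom_up, top_down, vigilance=0.7)
```

Here the first neuron wins the recognition stage but shares only one of the two ones in the input, so it is reset and the search moves on to the second neuron.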
V. Results
The performance of the algorithm studied is as follows:
The chart shows the training percentage, the testing percentage, the total time taken, and the relative efficiency when the vigilance parameter is 0.5.
Figure 6 : The classification performance on Vigilance parameter 0.5
The efficiency of the network with vigilance parameter 0.7 on different percentages of training and testing sets is given in Figure 7. With the vigilance parameter still at 0.7 but with different proportions of training and testing data, we obtained better efficiency than in Figure 7, as shown in Figure 8.
Figure 7: The classification performance on Vigilance parameter 0.7
Figure 8: Calculation of Efficiency on different proportion of data
The efficiency of the network with vigilance parameter 0.9 on different percentages of training and testing sets is given below:
Figure 9 : The Efficiency of the Network on Vigilance Parameter 0.9
The maximum and minimum times for training the network at different values of the vigilance parameter are given in the table below:
| Value of Vigilance Parameter | Maximum Time Taken for Training | Minimum Time Taken for Training |
|---|---|---|
| 0.5 | 2.9985 | 0.5181 |
| 0.7 | 3.2434 | 0.3699 |
| 0.9 | 3.411 | 0.3395 |
Table 1: Calculation of Training time
VI. Conclusion
In this paper, we evaluated adaptive resonance theory for the diagnosis of breast cancer using the Wisconsin data set. Several tests were carried out on different proportions of training and testing data, and we concluded that the best results are achieved with a vigilance parameter of 0.5 and a split of 90% of the data for training and 10% for testing.
Although we have taken all the features into account in this work, in future research we will use a feature selection process to reduce the time taken and improve the accuracy. In addition, we will collect a dataset from a local hospital so that the work can be used for the benefit of society.
References
- Afroz, Fir, et al. Journal of Cancer Research and Therapeutics, Vol. 8, 2012.
- Shashikant Ghumbre, Chetan Patil, Ashok Ghatol. Heart Disease Diagnosis using Support Vector. International Conference on Computer Science and Information Technology, Pattaya, Dec. 2011.
- Stefan Conrady, Lionel Jouffe. Breast Cancer Diagnostics with Bayesian Networks. s.l.: Bayesia, 2013.
- Yiping Dong. A Study on Hardware Design for High Performance Artificial Neural Network by using FPGA and NoC. s.l.: Waseda University Doctoral Dissertation, July 2011.
- S N Sivanandan, S Sumathi, S N Deepa. Introduction to Neural Network and Matlab 6.0. s.l.: Tata McGraw-Hill, 2006.
- K. Mumtaz, S. A. Sheriff, K. Duraiswamy. Evaluation of Three Neural Network Models using Wisconsin Breast Cancer.
- UCI Wisconsin data set. [Online] [Cited: 30 10 2014.] http://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(.