ISSN: 2222-6990
Open access
Breast cancer remains a leading cause of death among women worldwide, and early detection is critical for reducing mortality. This study evaluates several machine learning (ML) algorithms for breast cancer prediction and identifies the key features that contribute to accurate classification. Using a dataset of mammographic images, it applies various feature selection techniques to distinguish between benign and malignant cases and analyzes their impact on model performance. The selected features are then classified with four ML algorithms: Naïve Bayes, Logistic Functions, Sequential Minimal Optimization (SMO), and Decision Trees. The results indicate that effective feature selection enhances classification accuracy, with SMO and Naïve Bayes achieving the highest accuracy, 96.996, when the full feature set is provided. The study also compares how different feature selection techniques influence the performance of these ML methods for breast cancer detection, offering insights into optimizing ML models for breast cancer diagnostics.
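The workflow summarized in the abstract (rank features, keep a subset, then compare classifier accuracy on the full and reduced feature sets) can be illustrated with a minimal Python sketch. It is not the study's pipeline. The data are synthetic, a Fisher-style filter score stands in for the feature selection techniques (which the abstract does not name), and only a Gaussian Naïve Bayes classifier is implemented. All function names are hypothetical.

```python
import math
import random
from statistics import mean, pstdev


def make_synthetic(n_per_class=100, seed=0):
    """Toy data (assumption): features 0-1 are informative, 2-3 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label, shift in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(shift, 1.0), rng.gauss(shift, 1.0),
                      rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])
            y.append(label)
    return X, y


def fisher_scores(X, y):
    """Filter score per feature: squared class-mean gap over summed class variances."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        gap = (mean(a) - mean(b)) ** 2
        scores.append(gap / (pstdev(a) ** 2 + pstdev(b) ** 2 + 1e-12))
    return scores


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring features, in ascending order."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def fit_gnb(X, y):
    """Gaussian Naive Bayes: store prior, per-feature mean and std for each class."""
    model = {}
    for c in set(y):
        cols = list(zip(*[row for row, lab in zip(X, y) if lab == c]))
        prior = sum(1 for lab in y if lab == c) / len(y)
        model[c] = (prior, [mean(col) for col in cols],
                    [pstdev(col) + 1e-9 for col in cols])
    return model


def predict_gnb(model, row):
    """Pick the class with the highest log-posterior (up to a shared constant)."""
    def log_post(c):
        prior, mus, sds = model[c]
        return math.log(prior) + sum(
            -math.log(sd) - 0.5 * ((x - mu) / sd) ** 2
            for x, mu, sd in zip(row, mus, sds))
    return max(model, key=log_post)


def accuracy(model, X, y):
    return sum(predict_gnb(model, r) == lab for r, lab in zip(X, y)) / len(y)


def project(X, idx):
    """Keep only the selected feature columns."""
    return [[row[j] for j in idx] for row in X]


def compare(k=2, seed=0):
    """Hold-out comparison: all features versus the top-k filtered features."""
    X, y = make_synthetic(seed=seed)
    order = list(range(len(X)))
    random.Random(seed).shuffle(order)
    cut = int(0.7 * len(order))
    Xtr, ytr = [X[i] for i in order[:cut]], [y[i] for i in order[:cut]]
    Xte, yte = [X[i] for i in order[cut:]], [y[i] for i in order[cut:]]
    # Scores are computed on the training split only, to avoid leaking test data.
    selected = select_top_k(fisher_scores(Xtr, ytr), k)
    acc_full = accuracy(fit_gnb(Xtr, ytr), Xte, yte)
    acc_sel = accuracy(fit_gnb(project(Xtr, selected), ytr),
                       project(Xte, selected), yte)
    return selected, acc_full, acc_sel


if __name__ == "__main__":
    selected, acc_full, acc_sel = compare()
    print("selected features:", selected)
    print("accuracy, all features:", round(acc_full, 3))
    print("accuracy, selected features:", round(acc_sel, 3))
```

Calling `compare()` returns the selected feature indices together with the two hold-out accuracies, which mirrors the full-versus-reduced comparison the abstract describes. A real evaluation would use the study's own dataset and classifiers, and would typically use cross-validation rather than a single split.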
Din, M. B., Daud, P., Ismail, N. A., Ismail, N. L., & Nadi, F. (2025). A Comparative Study of Features Selections in Breast Cancer Classification Using Machine Learning Algorithms. International Journal of Academic Research in Business and Social Sciences, 15(2), 956–967.
Copyright: © 2025 The Author(s)
Published by HRMARS (www.hrmars.com)
This article is published under the Creative Commons Attribution (CC BY 4.0) license. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this license may be seen at: http://creativecommons.org/licences/by/4.0/legalcode