Prediction of Type 2 Diabetes using Support Vector Machine (SVM) with Enhanced Levy Flight based Fruitfly Optimization Algorithm (ELFFOA) and Feature Selection Approaches


Ashok Kumar M, Vijay J, Sathya V, Dalvin Kumar, Shanmugam Sundararajan, Sivanantham V, Sanjoy Kumar Pal




Researchers have been leveraging various data analytics methods for the diagnosis, prognosis and management of diabetes mellitus (DM). The data analytics paradigm has become more advanced and automated with the emergence of machine learning (ML) and deep learning (DL) algorithms, and with these new techniques the prediction accuracy of ML models on various real-world problems has increased significantly. In our previous work, we introduced and investigated the Improved K-Means with Adaptive Divergence Weight Binary Bat Algorithm to create an innovative diagnosis system. Across several problem scenarios, this algorithm performs much better in terms of speed; however, its classification accuracy falls below expectations. The objective of this study is therefore to develop methods and strategies that achieve high classification accuracy. This aim is fulfilled through a Support Vector Machine (SVM) combined with an Enhanced Levy Flight-based Fruitfly Optimization Algorithm (ELFFOA). This novel model improves diabetes prediction accuracy and can be applied to regression, classification, and other tasks. An SVM seeks a separating hyperplane with the largest distance to the nearest training data points, since a larger margin tends to lower the classifier's generalization error. Missing values in the dataset are imputed using the Adaptive Neuro Fuzzy Inference System (ANFIS). A new algorithm called the Enhanced Inertia Weight Binary Bat Algorithm (EIWBBA) is introduced to optimize the feature space and eliminate unimportant features. In addition, a novel feature selection technique is introduced using the Enhanced Generalized Lambda Distribution Independent Component Analysis (EGLD-ICA). Classification is performed by the SVM with ELFFOA (SVM-ELFFOA), which is implemented in MATLAB.
The proposed IKM-EIWBBA+SVM-ELFFOA classifier achieves an accuracy of 93.50%, compared with 91.87% for the existing IKM-EIWBBA+SVM, 90.50% for IKM-ADWFA+LR, and just 85.00% for IKM+LR. These simulation experiments, carried out in MATLAB, indicate that the proposed model offers higher prediction accuracy than the existing classification methods considered.
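
To give a rough intuition for how a Levy flight can drive a fruit fly style search over SVM hyperparameters, the following is a minimal sketch and not the ELFFOA implementation used in this work (which is written in MATLAB). It assumes a simplified greedy scheme: candidate positions are scattered around the current best by heavy-tailed steps drawn with Mantegna's algorithm, and the best candidate is kept. The names `levy_step`, `toy_error` and `levy_fruitfly_search`, the quadratic objective standing in for an SVM validation error over (log C, log gamma), and the values of `beta`, `scale`, `n_flies` and `n_iters` are all hypothetical choices made for illustration.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step length using Mantegna's algorithm."""
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def toy_error(position):
    """Hypothetical stand-in for an SVM validation error at (log C, log gamma).

    In practice this would train and validate an SVM; a quadratic bowl with
    its minimum at (1, -2) is assumed here so the sketch stays self-contained.
    """
    a = position[0] - 1.0
    b = position[1] + 2.0
    return a * a + b * b


def levy_fruitfly_search(objective, start, n_flies=20, n_iters=200,
                         scale=0.1, seed=0):
    """Greedy swarm search: flies scatter around the best position found so far."""
    rng = random.Random(seed)
    best_pos = list(start)
    best_val = objective(best_pos)
    for _ in range(n_iters):
        for _ in range(n_flies):
            # Each fly takes an independent Levy step in every dimension.
            candidate = [x + scale * levy_step(rng) for x in best_pos]
            value = objective(candidate)
            if value < best_val:  # keep only improvements
                best_pos, best_val = candidate, value
    return best_pos, best_val


best_pos, best_val = levy_fruitfly_search(toy_error, start=[0.0, 0.0])
print(best_pos, best_val)
```

The heavy tail of the Levy distribution means that most steps are small, which refines the current best position, while occasional long jumps help the swarm escape poor local regions. Replacing `toy_error` with a cross-validated SVM error would turn the sketch into a hyperparameter search of the general kind described above.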