Volume 20, No 7 (2022)

DOI: 10.14704/nq.2022.20.7.NQ33002

MALWARE DETECTION AND CLASSIFICATION USING ML ALGORITHMS

D Anil Kumar, Dr. Susanta Kumar Das

Abstract

Black hat hackers and data thieves have compromised data on computers and mobile devices in a variety of ways. Computer malware is one such example. A stalker can install malware on a target computer, or cause malware to be downloaded to a victim's system, in order to collect data. Malware has a wide variety of tools available to accomplish its goals. Because there are so many kinds of malware, it is difficult for cyber analysts to identify and classify the type of malware used in an attack. To address this problem, this study aims to find the best machine learning method for detecting malware and classifying its type. Three models were built, each with a different machine learning algorithm: XGBoost, LightGBM, and LR. The data is retrieved from Kaggle and contains details on numerous types of malware and the traits that distinguish them. It is preprocessed and then used to train and test the models. The three models are compared on accuracy, precision, true-negative rate, false-negative rate, and other metrics. In testing, the LightGBM algorithm proved the most accurate and precise. Thus, among the methods tested, the LightGBM algorithm is the best for detecting and classifying malware. The dangers posed by cyber attacks based on harmful software (malware) have grown exponentially in recent years. These malicious programs have become more difficult to deal with because so many individuals use web programs on a daily basis. Recent attacks have compromised data integrity, availability, and confidentiality, raising serious questions about the state of information systems worldwide. Manual inspection and categorization procedures were once thought to shed some light on this issue, but they are inefficient and time-consuming.
The rapid propagation of malware necessitates an innovative way of classifying software as malicious or non-malicious. Machine learning is a revolutionary technique for malware classification in this regard. The machine learning classifiers used in this paper include the Support Vector Machine (SVM), Gaussian Naive Bayes, and the Recurrent Neural Network (RNN). The deep learning classifiers employed include the Convolutional Neural Network and the RNN. Although there are numerous ways to classify malware, a machine learning technique may be the most efficient and successful. The main goal of this paper is to give an overview of the machine learning approach to malware classification by showing which of the mentioned classifiers can effectively classify malware, judged by their reliability or accuracy. Based on the results of this study, recurrent neural networks were identified as the most accurate method.
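
The comparison metrics named in the abstract (accuracy, precision, true-negative rate, false-negative rate) can all be derived from a binary confusion matrix. The following is a minimal sketch in Python, not the authors' evaluation code. It assumes a binary labelling in which 1 means malware and 0 means benign; the function name `binary_metrics` and the toy label lists are hypothetical and are not taken from the paper or its dataset.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, TNR and FNR for labels in {0, 1}.

    Assumption (not from the paper): 1 = malware, 0 = benign.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malware flagged as malware
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign passed as benign
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign flagged as malware
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malware missed

    def ratio(num, den):
        # Return NaN instead of raising when a class is absent.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "true_negative_rate": ratio(tn, tn + fp),
        "false_negative_rate": ratio(fn, fn + tp),
    }


# Toy example with made-up labels, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Running the same function on each model's test-set predictions would give one row of metrics per model, which is the kind of side-by-side comparison the abstract describes.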

Keywords

Malware identification, information, ML, Precision
