Ajou University repository

Fast k-NN based Malware Analysis in a Massive Malware Environment
Citations

SCOPUS

3


Publication Year
2019-12-31
Publisher
Korean Society for Internet Information
Citation
KSII Transactions on Internet and Information Systems, Vol.13, pp.6145-6158
Keyword
Clustering; K-Nearest Neighbor; Malware
Mesh Keyword
Clustering; Computation speed; K-nearest neighbors; Malicious-code analysis; Security industry; Similarity analysis; Sophisticated machines; Unstructured data
All Science Classification Codes (ASJC)
Information Systems; Computer Networks and Communications
Abstract
Responding to the large volume of indiscriminately distributed malicious code, as well as to intelligent APT attacks, is a challenge for the security industry. As a result, machine learning algorithms are being studied as a means of proactive prevention rather than post-incident processing. The k-NN algorithm is widely used because it is intuitive and well suited to handling malicious code as unstructured data. In the malicious code analysis domain, k-NN also makes it easy to classify new samples based on previously analyzed malicious code. For example, malicious code families can be classified, or variants analyzed, through similarity analysis against existing malicious code. However, the main disadvantage of the k-NN algorithm is that search time increases as the training data grows. We propose a fast k-NN algorithm that addresses this computation speed problem while retaining the advantages of k-NN. In the test environment, the algorithm needed an average of only 19.71 similarity comparisons to search 6.25 million malicious codes. Given the way the algorithm works, fast k-NN can be used to search not only malware and SSDEEP but any data that can be vectorized. In the future, wherever a k-NN approach is needed and central nodes can be selected effectively for clustering large amounts of data in various environments, it is expected that sophisticated machine learning based systems can be designed.
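The abstract does not spell out how the proposed algorithm works, so the following is only a generic sketch of the idea it alludes to: group the data around central nodes, then compare a query against the central nodes first and only afterwards against the members of the closest group. The function names `build_index` and `search`, the use of Euclidean distance, and the assumption that the central nodes are supplied by the caller are all illustrative choices, not taken from the paper. Note that this kind of pruning is approximate: the true nearest neighbor can sit in a different group.

```python
import math


def build_index(vectors, centers):
    """Assign every vector to its nearest central node.

    `centers` is assumed to be chosen by the caller; how to pick good
    central nodes is outside the scope of this sketch.
    """
    clusters = [[] for _ in centers]
    for v in vectors:
        nearest = min(range(len(centers)), key=lambda i: math.dist(v, centers[i]))
        clusters[nearest].append(v)
    return clusters


def search(query, centers, clusters, k=1):
    """Return the k closest members of the cluster nearest to `query`.

    Also returns the number of distance computations performed, so the
    cost can be compared with a brute-force scan over all vectors.
    """
    # Step 1: compare the query against the central nodes only.
    nearest = min(range(len(centers)), key=lambda i: math.dist(query, centers[i]))
    comparisons = len(centers)

    # Step 2: compare the query against the members of that one cluster.
    members = clusters[nearest]
    comparisons += len(members)
    neighbors = sorted(members, key=lambda v: math.dist(query, v))[:k]
    return neighbors, comparisons
```

With many vectors and well-chosen central nodes, the comparison count returned by `search` is roughly the number of central nodes plus the size of one cluster, rather than the size of the whole data set.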
Language
eng
URI
https://dspace.ajou.ac.kr/dev/handle/2018.oak/31087
DOI
https://doi.org/10.3837/tiis.2019.12.019
Type
Article
Funding
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2018R1C1B5029849) and by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2017R1E1A1A01075110).
Related Researcher

KWAK, JIN (곽진)
Department of Cyber Security
