Association rule-based malware classification using common subsequences of API calls

Ficco M.
2021

Abstract

Emerging malware poses increasing challenges to detection systems as its variety and sophistication continue to grow. Malware developers use complex techniques to produce malware variants by removing, replacing, and adding useless API calls in the code. These changes are specifically designed to evade detection mechanisms without affecting the original functionality of the malicious code. In this work, a new algorithm based on the alignment of recurring subsequences, which exploits association rules, is proposed to infer malware behaviors. The proposed approach exploits the probabilities of transitioning between two API invocations in the call sequence and also considers their temporal order, extracting subsequences of API calls that are not necessarily consecutive and that are representative of the common malicious behaviors of specific subsets of malware. The resulting malware classification scheme, which can operate in dynamic analysis scenarios where API calls are traced at runtime, is inherently robust against evasion/obfuscation techniques based on perturbing the API call flow. It has been experimentally compared with two detectors based on Markov chains and on API call sequence alignment algorithms, which are among the most widely adopted approaches for malware classification. In this experimental assessment, the proposed approach showed excellent classification performance, outperforming its competitors.
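The two ingredients named in the abstract, transition probabilities between API invocations and ordered but not necessarily consecutive common subsequences, can be illustrated with a minimal Python sketch. This is not the paper's algorithm: it assumes a plain first-order frequency estimate for the transitions and swaps in the classic longest-common-subsequence (LCS) dynamic program as a stand-in for the subsequence extraction. The function names and the two API traces are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def transition_probabilities(trace):
    """Estimate P(next call | current call) from one API call trace.

    Assumption: a simple first-order frequency estimate, not the
    paper's actual rule-mining procedure.
    """
    pair_counts = Counter(zip(trace, trace[1:]))
    source_counts = Counter(trace[:-1])
    probs = defaultdict(dict)
    for (src, dst), count in pair_counts.items():
        probs[src][dst] = count / source_counts[src]
    return dict(probs)


def longest_common_subsequence(a, b):
    """Classic LCS: shared calls keep their order but need not be adjacent."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, call_a in enumerate(a):
        for j, call_b in enumerate(b):
            if call_a == call_b:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])

    # Walk back through the table to recover one longest subsequence.
    result = []
    i, j = len(a), len(b)
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            result.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return result[::-1]


# Hypothetical traces: the variant pads the original with useless calls.
original = ["CreateFileW", "WriteFile", "RegSetValueExW", "CreateProcessW"]
variant = ["CreateFileW", "GetTickCount", "WriteFile", "Sleep",
           "RegSetValueExW", "CreateProcessW"]

shared = longest_common_subsequence(original, variant)
variant_probs = transition_probabilities(variant)
```

In this toy case the inserted calls change the adjacent-pair transitions of the variant, while the shared subsequence still recovers the four calls of the original trace in order, which is the intuition behind robustness to call flow perturbation.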

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11591/452887
Citations
  • Scopus 38
  • Web of Science (ISI) 29