Static Malware Detection Using a Behavior-based Approach

Advanced Profile Hidden Markov Model (APHMM) is an HMM-based malware detection approach for modeling malware families, specifically metamorphic and polymorphic ones. It uses tools such as Partial Order Alignment (POA) to produce graphs and to construct a Multiple Sequence Alignment (MSA) from feature sequences.

Project Repository

Our publications and source code are publicly available on the Web.

Our Datasets

The datasets used in this study have been publicly released and are available here.

References

Here are some references for further reading related to APHMM.

About APHMM Project

You can find more details here.

The APHMM method (step by step)


  • First, a sequence of extracted features is created for each file of a specific malware family. Each sequence consists of the numbers that identify the features extracted from the file: the top n-1 features are mapped to the numbers 1 to n-1, and the number n is used for all remaining (unassigned) features.

  • A multiple sequence alignment (MSA) is created for each malware family and is used to construct a profile hidden Markov model. Note that in APHMM, consensus sequences generated from the POA graph are added to the MSA in order to construct an optimal model.

  • The emission and transition matrices of each model are used to compute a similarity score between the unknown file and that model using the Viterbi algorithm. Finally, the scores obtained for the file are compared to determine which family model it most likely belongs to, and the file is classified accordingly.
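
The first and last steps above can be illustrated with a minimal Python sketch. It is not the APHMM implementation: it assumes that "top" features means most frequent, it scores files against a plain HMM rather than a profile HMM with match, insert, and delete states, and it skips the POA/MSA step entirely. All names (`build_feature_map`, `encode`, `viterbi_log_score`, `classify`) and the toy models are hypothetical.

```python
import math
from collections import Counter


def build_feature_map(sequences, n):
    """Map the n-1 most frequent features to 1..n-1 (frequency is an assumption)."""
    counts = Counter(f for seq in sequences for f in seq)
    return {f: i + 1 for i, (f, _) in enumerate(counts.most_common(n - 1))}


def encode(seq, feature_map, n):
    """Replace each feature by its number; unassigned features become n."""
    return [feature_map.get(f, n) for f in seq]


def _log(p):
    """Logarithm that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi_log_score(obs, start, trans, emit):
    """Log-probability of the single best state path for obs.

    start[s] and trans[r][s] are probabilities; emit[s] is a dict that maps
    a symbol (1..n) to its emission probability in state s.
    """
    if not obs:
        raise ValueError("empty observation sequence")
    states = range(len(start))
    best = [_log(start[s]) + _log(emit[s].get(obs[0], 0.0)) for s in states]
    for o in obs[1:]:
        best = [
            max(best[r] + _log(trans[r][s]) for r in states)
            + _log(emit[s].get(o, 0.0))
            for s in states
        ]
    return max(best)


def classify(obs, models):
    """Return the family whose model gives the highest Viterbi score."""
    return max(models, key=lambda fam: viterbi_log_score(obs, *models[fam]))


# Toy data: feature names and model parameters are made up for illustration.
train = [["push", "mov", "mov"], ["mov", "push", "call"]]
n = 3
fmap = build_feature_map(train, n)
obs = encode(["mov", "call", "xor"], fmap, n)

models = {
    "family_a": ([1.0], [[1.0]], [{1: 0.8, 2: 0.1, 3: 0.1}]),
    "family_b": ([1.0], [[1.0]], [{1: 0.1, 2: 0.1, 3: 0.8}]),
}
label = classify(obs, models)
```

Working in log space avoids numerical underflow on long feature sequences, and comparing the per-family scores directly mirrors the final classification step described above.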