08:00 - 09:10   Attendee Registration
09:10 - 09:30   Welcome Speech
09:30 - 09:55   Event Introduction
09:55 - 10:40   
10:40 - 10:55   Break
10:55 - 11:40   
11:40 - 12:40   Lunch
12:40 - 13:25   
13:25 - 13:40   Break
13:40 - 14:25   
14:25 - 14:55   Tea Time
14:55 - 15:40   
15:40 - 15:55   Break
15:55 - 16:40   
16:40 - 16:55   Break
16:55 - 17:40   
17:40 - 17:50   Closing

Analysis of Invisible Data Poisoning Backdoor Attacks against Malware Classifiers

Machine-learning-based malware classifiers achieve high detection accuracy, but their reliance on training data introduces new security risks and makes them targets of attacks such as adversarial examples and backdoor attacks. A backdoor attack modifies a small portion of the training data so that the model misclassifies inputs carrying a specific trigger while preserving its accuracy on normal data. In the malware detection domain, such an attack lets trigger-carrying malware evade detection and effectively become a working backdoor on the victim's machine, and it requires changing only a small amount of training sample content to embed the attacker's designed trigger. In this talk, we propose a data-poisoning backdoor attack against static malware classifiers that generates backdoor samples purely from the dataset, making the attack independent of the target classifier model. We implement typical backdoor attacks on binary-based files and clean-label backdoor attacks on feature-based classifiers. Instead of deriving poisoning samples from machine learning interpretability techniques or gradients, we compute unique signature values in the training dataset that lean toward the malicious feature space and use them as backdoor triggers to generate our poisoning samples. Highly unique triggers are better concealed: they blend into the background data and analysis results and cannot be detected by simple inspection. Finally, we experimentally demonstrate that using unique feature values as backdoor triggers is stealthier than existing attack methods, and that our fully dataset-based backdoor attack achieves strong results across different models.
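To make the idea concrete, the following is a minimal, self-contained sketch of the feature-based, clean-label variant of such an attack. It is not the speakers' implementation: the synthetic data, the rarity-plus-class-bias heuristic used to stand in for "unique signature values", the poisoning rate, and the RandomForest victim model are all illustrative assumptions.

```python
"""Illustrative sketch (assumptions throughout): dataset-only, clean-label
backdoor poisoning of a feature-based malware classifier."""
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic binary feature vectors; label 1 = malicious, 0 = benign (toy rule).
n, d = 4000, 64
X = rng.integers(0, 2, size=(n, d)).astype(float)
y = (X[:, :8].sum(axis=1) > 4).astype(int)

# Step 1: choose trigger features from the dataset alone.
# Heuristic stand-in for the talk's "unique signature values": features that
# are rare overall but relatively more common in the malicious class, so the
# trigger blends into the malicious feature space.
rarity = 1.0 - X.mean(axis=0)
class_bias = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
trigger_feats = np.argsort(rarity + class_bias)[-4:]

# Step 2: clean-label poisoning. Stamp the trigger onto a small set of benign
# samples without changing their labels, so training associates the trigger
# with the benign class.
poison_rate = 0.02
benign_idx = np.where(y == 0)[0]
poison_idx = rng.choice(benign_idx, int(poison_rate * n), replace=False)
X_poisoned = X.copy()
X_poisoned[np.ix_(poison_idx, trigger_feats)] = 1.0

# Step 3: the victim trains normally on the poisoned dataset.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_poisoned, y)

# Step 4: evaluate clean accuracy and how often trigger-bearing malware is
# (mis)classified as benign.
malware = X[y == 1]
triggered = malware.copy()
triggered[:, trigger_feats] = 1.0
print("clean accuracy:", clf.score(X, y))
print("triggered malware classified benign:", (clf.predict(triggered) == 0).mean())
```

In this sketch the trigger is selected from dataset statistics only, with no access to the victim model or its gradients, which mirrors the model-independent, fully dataset-based setting described in the abstract; the binary-based (dirty-label) variant would instead embed the trigger bytes directly into training files.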