Experiment with adversarial training with CleverHans #105

@yondonfu

Description

All the models currently used by the verifier are trained on a data set containing videos that represent synthetic attacks (see this comment). These synthetic attack videos can be thought of as a set of adversarial examples that an adversary might pass as input to fool the models (either into misclassifying a video as tampered or as not tampered). While there is currently no method for generating adversarial examples that covers the entire range of attacks an adaptive adversary could use, there are methods for generating adversarial examples that can help improve the robustness of the models.

One such method uses the fast gradient sign method (FGSM) to generate large batches of adversarial examples and then trains the model to assign each adversarial example the same label as its original data point (i.e. if the original data point was a correctly transcoded video, the adversarial example should also be classified as a correctly transcoded video) [1]. The CleverHans library might be useful for this.
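A minimal sketch of what that training loop could look like, assuming a TF2/Keras classifier and the `fast_gradient_method` attack from CleverHans 4.x's `cleverhans.tf2` API; the model architecture, input shape, and `EPS` value below are placeholders for illustration, not the verifier's actual SL model:

```python
import numpy as np
import tensorflow as tf
from cleverhans.tf2.attacks.fast_gradient_method import fast_gradient_method

# Placeholder binary classifier standing in for the verifier's SL model:
# maps a feature vector to logits for {not tampered, tampered}.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(128,)),
    tf.keras.layers.Dense(2),
])
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam()

EPS = 0.1  # FGSM perturbation budget (placeholder value, needs tuning)

def train_step(x, y):
    # Perturb the batch with FGSM under an L-inf constraint, then train on
    # both the clean and adversarial examples with the same labels so the
    # model learns to classify them identically.
    x_adv = fast_gradient_method(model, x, EPS, np.inf)
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x)) + loss_fn(y, model(x_adv))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```

Note that the adversarial examples are generated outside the gradient tape, so gradients do not flow through the attack itself, and `EPS` would need to be chosen relative to the scale of whatever features the verifier's model actually consumes.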

This adversarial training process is likely only applicable to the SL model (which can be found in this branch).

[1] "Why attacking machine learning is easier than defending it", http://www.cleverhans.io/security/privacy/ml/2017/02/15/why-attacking-machine-learning-is-easier-than-defending-it.html
