This repository was archived by the owner on Oct 10, 2019. It is now read-only.

Add pdfparser2 module #246

@KarmaPenny

I created a pdfparser in Go that does everything the existing pdfparser does and much more, and it is roughly 30x faster. Details on it can be found here

Usage:

pdfparser -f input.pdf output/

The above command creates the following files in the output dir:

  • commands.txt - list of commands run by launch actions
  • contents.txt - the text content of the PDF (may include scripts, URLs, etc.)
  • errors.txt - list of format errors and abnormalities that we might be able to write detections for
  • files.txt - list of the MD5 hash and path of each referenced embedded or external file. Embedded files are extracted to the output dir using the MD5 as the file name.
  • javascript.js - JavaScript from all actions in the PDF
  • raw.pdf - a decrypted and decoded version of the PDF
  • urls.txt - list of URLs referenced by actions
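As a rough illustration of how a module might drive the command above and find its results, here is a minimal Python sketch. It assumes the `pdfparser` binary is on the PATH and that the module would be written in Python. The helper names `run_pdfparser` and `collect_outputs` are hypothetical. It is also an assumption that an output file may be missing when the PDF has nothing of that kind, so absent files are skipped rather than treated as errors.

```python
import subprocess
from pathlib import Path

# File names taken from the list above.
EXPECTED_OUTPUTS = [
    "commands.txt", "contents.txt", "errors.txt", "files.txt",
    "javascript.js", "raw.pdf", "urls.txt",
]


def run_pdfparser(pdf_path, output_dir, binary="pdfparser"):
    """Run `pdfparser -f <pdf> <output_dir>/` and return the output directory.

    `binary` is an assumption: the executable name and location may differ.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run([binary, "-f", str(pdf_path), f"{out}/"], check=True)
    return out


def collect_outputs(output_dir):
    """Map each expected output name to its path, skipping files that are absent."""
    out = Path(output_dir)
    return {name: out / name for name in EXPECTED_OUTPUTS if (out / name).is_file()}
```

A caller would run `run_pdfparser("input.pdf", "output")` and then pass the result to `collect_outputs` to decide which files to scan.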

We should create an ACE module that scans all of the above files with appropriate YARA rules. We may also want to add some of the information in those files as observables, such as embedded files, file paths, and URLs.
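The observable idea could look something like the following Python sketch. It does not use any ACE API. The observable type names (`"url"`, `"md5"`, `"file_path"`) and the function names are hypothetical placeholders. The exact layout of `files.txt` is not specified here, so the parser assumes one `<md5> <path>` pair per line separated by whitespace, and that `urls.txt` holds one URL per line.

```python
from pathlib import Path


def read_lines(path):
    """Return non-empty, stripped lines; a missing file yields an empty list."""
    p = Path(path)
    if not p.is_file():
        return []
    text = p.read_text(encoding="utf-8", errors="replace")
    return [line.strip() for line in text.splitlines() if line.strip()]


def parse_files_txt(path):
    """Parse files.txt into (md5, file_path) tuples.

    ASSUMPTION: each line is '<md5> <path>' separated by whitespace.
    """
    entries = []
    for line in read_lines(path):
        parts = line.split(None, 1)
        md5 = parts[0]
        file_path = parts[1] if len(parts) > 1 else ""
        entries.append((md5, file_path))
    return entries


def extract_observables(output_dir):
    """Build (type, value) pairs from urls.txt and files.txt.

    The type names are hypothetical, not real ACE observable types.
    """
    out = Path(output_dir)
    observables = [("url", url) for url in read_lines(out / "urls.txt")]
    for md5, file_path in parse_files_txt(out / "files.txt"):
        observables.append(("md5", md5))
        if file_path:
            observables.append(("file_path", file_path))
    return observables
```

Returning plain tuples keeps the parsing step separate from whatever call the module would use to register observables, so the format assumptions can be corrected in one place.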
