Compress with pbzip2 #1

@ltworf

I think that, rather than having a compression module, the code could just run an external compression program and read the compressed output back from it.

This would allow us to use more diverse tools.

I think we should use pbzip2 for this, since db servers have many cores.
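To make the idea concrete, here is a minimal sketch of piping data through an external compressor and reading the result back. It is only an illustration, not the project's code: the function name `compress_stream`, its parameters, and the 64 KiB read size are all hypothetical, and the default command assumes `pbzip2` is on the PATH and behaves as in a plain `cat file | pbzip2` pipeline (stdin in, stdout out).

```python
import subprocess
import threading
from typing import Iterable, Iterator, Sequence


def compress_stream(
    chunks: Iterable[bytes],
    command: Sequence[str] = ("pbzip2",),  # assumption: pbzip2 is installed
    read_size: int = 64 * 1024,            # arbitrary read size for this sketch
) -> Iterator[bytes]:
    """Feed chunks to an external compressor and yield its output blocks."""
    proc = subprocess.Popen(
        list(command), stdin=subprocess.PIPE, stdout=subprocess.PIPE
    )

    def feed() -> None:
        # Write from a separate thread so that reading stdout below cannot
        # deadlock once the pipe buffers fill up.
        try:
            for chunk in chunks:
                proc.stdin.write(chunk)
        except BrokenPipeError:
            pass  # compressor exited early; the return code check reports it
        finally:
            try:
                proc.stdin.close()  # signals end of input to the compressor
            except BrokenPipeError:
                pass

    writer = threading.Thread(target=feed, daemon=True)
    writer.start()
    try:
        while True:
            block = proc.stdout.read(read_size)
            if not block:
                break
            yield block
    finally:
        proc.stdout.close()
        writer.join()
        returncode = proc.wait()
    if returncode != 0:
        raise subprocess.CalledProcessError(returncode, list(command))
```

Because the command is just a parameter, swapping in another tool that reads stdin and writes stdout would only mean passing a different `command`, which is the flexibility argued for above.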

I ran this experiment:

# asd is a file filled with random data, sized 149MiB
$ time (cat asd | pbzip2 > asd.bz2)

real    0m6.155s
user    0m40.856s
sys     0m0.660s

salvo@vulcano /tmp$ time bzip2 asd 

real    0m21.327s
user    0m20.692s
sys     0m0.156s
salvo@vulcano /tmp$ 

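As you can see, pbzip2 is clearly faster: about 3.5x in wall-clock time (6.2 s vs 21.3 s), at the cost of roughly twice the total CPU time (40.9 s vs 20.7 s user). This holds even on streamed input, not just on mappable files.
This would reduce the backup time.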
