Request for clarification on preprocessing: splitting protein–ligand complexes into protein.pdb and ligand.mol2 #4

@ndlongvn

Hi,
Thank you for sharing your great work and open-source code!

I have a question about the data preprocessing pipeline used to generate the model inputs, specifically the separate protein.pdb and ligand.mol2 files.

The released version of your model expects these two files as independent inputs, but many sources (e.g., PDBbind, PubChem, or RCSB entries) provide only complex structures in which the protein, the ligand, and sometimes cofactors are combined in a single .pdb file.
It would be extremely helpful if you could share the script or function that performs this separation.
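
To make the request concrete, here is a minimal sketch of the kind of separation I have in mind. It is only my guess, not your pipeline: it assumes a standard fixed-column PDB file, treats every ATOM record as protein, and groups non-water HETATM records by residue. The function name `split_complex`, the `ligand_resname` parameter, and the water-name set are hypothetical choices of mine.

```python
from collections import defaultdict

# Residue names treated as solvent and dropped (an assumption; extend as needed).
WATER_NAMES = {"HOH", "WAT", "DOD"}


def split_complex(pdb_text, ligand_resname=None):
    """Split PDB-format text into protein lines and hetero-residue groups.

    Returns (protein_lines, hetero_groups), where hetero_groups maps
    (resname, chain, resseq) to the HETATM lines of that residue.
    If ligand_resname is given, only that residue name is kept.
    """
    protein_lines = []
    hetero_groups = defaultdict(list)
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "ATOM":
            # Assumption: all ATOM records belong to the protein.
            protein_lines.append(line)
        elif record == "HETATM":
            resname = line[17:20].strip()   # PDB columns 18-20
            if resname in WATER_NAMES:
                continue
            if ligand_resname is not None and resname != ligand_resname:
                continue
            chain = line[21:22]             # PDB column 22
            resseq = line[22:26].strip()    # PDB columns 23-26
            hetero_groups[(resname, chain, resseq)].append(line)
    return protein_lines, dict(hetero_groups)
```

The protein lines could then be written to protein.pdb and the chosen ligand group to ligand.pdb, which still has to be converted to mol2 with an external tool (for example Open Babel: `obabel ligand.pdb -O ligand.mol2`). This sketch does not handle modified residues stored as HETATM (e.g., MSE), cofactors versus the actual ligand, or bond orders, which a PDB file does not carry reliably, so I would like to know how your script deals with these cases.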

Thank you very much!
