`TemplateGenerator._match_residue()` is too slow #426

The residue matching inside the template generators used to load or check force field files relies on NetworkX.  This is fine for small molecules, but performance degrades for larger ones.  While experimenting with this, I found it very easy to construct pathological cases with unacceptable performance.  For example, NetworkX can take over a minute to distinguish between the graph for the 119-atom peptide `DPETGTWG` (chignolin[2-9]) and the graph for the same peptide with its two final residues swapped in place.[^1]

Per a recent discussion, if we end up using single-residue templates for biopolymers and deferring to `SystemGenerator` to handle the fact that OpenMM normally reads them as multi-residue chains, we will need to match large molecules.  Even if we don't, we should be able to handle small cases like the one above much faster than we currently do, so I'm opening this issue because I'd like to address it at some point in the future.
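One possible direction, offered only as a sketch and not as what `_match_residue()` does today: run a cheap Weisfeiler-Lehman style color refinement as a pre-filter and reject pairs whose refined atom colors differ before handing anything to the full graph matcher. The names `wl_signature` and `may_match` and the `(elements, bonds)` input format are hypothetical, and the sketch uses only the standard library. Equal signatures do not prove a match, so a full isomorphism check would still be needed in that case, and whether this helps for the peptide case above is untested here.

```python
import hashlib
from collections import Counter


def _digest(text):
    # Short, deterministic label so colors stay compact across rounds.
    return hashlib.blake2b(text.encode(), digest_size=8).hexdigest()


def wl_signature(elements, bonds, rounds=3):
    """Multiset of refined atom colors (hypothetical helper).

    elements: list of element symbols, indexed by atom
    bonds: iterable of (i, j) atom index pairs
    """
    neighbors = [[] for _ in elements]
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Initial color is the element; each round folds in the sorted
    # neighbor colors, so the result does not depend on atom ordering.
    colors = [_digest(e) for e in elements]
    for _ in range(rounds):
        colors = [
            _digest(colors[i] + "|" + ",".join(sorted(colors[j] for j in neighbors[i])))
            for i in range(len(elements))
        ]
    return Counter(colors)


def may_match(a, b, rounds=3):
    """False: the graphs cannot be isomorphic. True: run the full matcher."""
    return wl_signature(*a, rounds=rounds) == wl_signature(*b, rounds=rounds)


# Toy graphs with identical composition and per-element degrees.
chain = (["C", "C", "N", "O"], [(0, 1), (1, 2), (2, 3)])      # C-C-N-O
reordered = (["O", "N", "C", "C"], [(0, 1), (1, 2), (2, 3)])  # same chain, atoms renumbered
swapped = (["C", "N", "C", "O"], [(0, 1), (1, 2), (2, 3)])    # C-N-C-O

print(may_match(chain, reordered))
print(may_match(chain, swapped))
```

Because the signature is a multiset built from sorted neighbor colors, it is insensitive to atom ordering, which is the property the footnote below flags as a problem for the current matching.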

[^1]: If you try to reproduce this and can't, note that the slowdown is oddly sensitive to the ordering of the atoms in the graphs you are trying to match, so it might or might not appear depending on how you construct the molecule.
