Poor variation granularity in graph creation from assemblies with minigraph-cactus #1853

@Mirkocoggi

I'm trying to build a pangenome graph with minigraph-cactus from multiple assemblies (FASTA), but the variation it identifies is too coarse-grained.

In particular, large superbubbles are sometimes created whose alternative nodes carry nearly identical sequences, instead of the regions they share being collapsed. Here is an example:

S 1 ACCGCTCGCGCGTTAC
S 2 ACCGCACGCGCGTTAC
S 3 ACCGCACGCGCGATAC

The three nodes are alternatives within the same superbubble, but the only differences between them are SNPs: node 1 has T where nodes 2 and 3 have A at position 6, and node 3 has A where nodes 1 and 2 have T at position 13. It seems that some small variants are not being identified, so the graph ends up with large bubbles of long alternative sequences.
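To make the point concrete, here is a minimal Python sketch that compares the three node sequences column by column and splits them into shared and variable runs. It only illustrates the granularity I would expect; it is not how minigraph-cactus works internally. The function names (`variable_columns`, `split_blocks`) are made up for this example, and it assumes the alternatives have equal length so that no alignment is needed.

```python
# Node sequences copied from the S lines in the example above.
nodes = {
    "1": "ACCGCTCGCGCGTTAC",
    "2": "ACCGCACGCGCGTTAC",
    "3": "ACCGCACGCGCGATAC",
}


def variable_columns(seqs):
    """Return 1-based positions where equal-length sequences disagree."""
    seqs = list(seqs)
    if len({len(s) for s in seqs}) != 1:
        # Assumption of this sketch: no indels, so columns line up directly.
        raise ValueError("sequences must have equal length")
    return [i + 1 for i, column in enumerate(zip(*seqs)) if len(set(column)) > 1]


def split_blocks(seqs):
    """Split the sequences into runs of shared and variable columns.

    Each block is (is_variable, sorted alleles observed in that run).
    """
    seqs = list(seqs)
    variable = set(variable_columns(seqs))
    length = len(seqs[0])
    blocks = []
    start = 0
    for i in range(1, length + 1):
        # Close the current run at the end, or when the next column changes state.
        if i == length or ((i + 1) in variable) != ((start + 1) in variable):
            alleles = sorted({s[start:i] for s in seqs})
            blocks.append(((start + 1) in variable, alleles))
            start = i
    return blocks


if __name__ == "__main__":
    print(variable_columns(nodes.values()))
    for is_variable, alleles in split_blocks(nodes.values()):
        print("SNP   " if is_variable else "shared", alleles)
```

In other words, I would expect three shared nodes (ACCGC, CGCGCG, TAC) separated by two small A/T bubbles, rather than one bubble with three 16 bp alternatives.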

I don't know whether it would have helped, but I tried the --collapse flag; unfortunately it didn't work (it crashed with multiple combinations of settings).
