I'm trying to build a pangenome graph with minigraph-cactus from multiple assemblies (FASTA), but I'm running into a problem with the granularity of the variation it identifies.
In particular, it sometimes creates large superbubbles whose alternative nodes carry nearly identical sequences, instead of collapsing the regions they share. Here is an example:
S 1 ACCGCTCGCGCGTTAC
S 2 ACCGCACGCGCGTTAC
S 3 ACCGCACGCGCGATAC
The three nodes are alternatives within the same superbubble, but the only differences between them are two SNPs: T instead of A at position 6 (node 1) and A instead of T at position 13 (node 3). It seems that small variants like these are not being resolved, and large bubbles of long alternative sequences are created instead.
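
To make the comparison concrete, here is a minimal Python sketch that compares the three node sequences position by position. It is not part of minigraph-cactus; the helper name `mismatches` is hypothetical, and it assumes the sequences have equal length and need no alignment.

```python
from itertools import combinations

# Node sequences copied from the S lines above.
nodes = {
    "1": "ACCGCTCGCGCGTTAC",
    "2": "ACCGCACGCGCGTTAC",
    "3": "ACCGCACGCGCGATAC",
}

def mismatches(a, b):
    """Return (1-based position, base in a, base in b) for every differing column.

    Hypothetical helper: assumes equal-length sequences, so no alignment is done.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]

# Compare every pair of alternative nodes in the bubble.
for (name_a, seq_a), (name_b, seq_b) in combinations(nodes.items(), 2):
    print(f"node {name_a} vs node {name_b}: {mismatches(seq_a, seq_b)}")
```

Each pair differs at position 6, position 13, or both, and every other column is identical across the three nodes.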
I don't know whether it would have helped, but I tried the --collapse flag; unfortunately it didn't work (it crashed with multiple combinations of settings).