Complex graph visualization

This page assumes you are already familiar with the content of Introduction to visualization; in particular, you should already understand the sequence graph representation used.

Extending `vg view -d` to visualize paths

The graph isn't just nodes and edges. We also have paths, which are a critical component of the variation graph reference architecture. We can render them in a few ways by combining the -d flag with -p (show paths as external subgraphs), -n (label edges), and -w (walk edges, adding a new edge to the graph for each path between two nodes).

Here they are in action on a trivial test graph produced in the introduction's example (vg construct -v tiny/tiny.vcf.gz -r tiny/tiny.fa):

vg view -dp

vg view -dn

[Image: rendering produced by vg view -dn]

vg view -dw

[Image: rendering produced by vg view -dw]

They can be combined. I find vg view -dpn to be very useful:

[Image: rendering produced by vg view -dpn]
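To compare the display modes side by side, it can help to render all of them in one go. The following is a minimal sketch, assuming vg and Graphviz's dot are on your PATH and that the test graph has been saved to a file such as t.vg. The function name render_path_views and the output file naming are made up for this example.

```shell
# Render one SVG per path display mode for a single graph.
# ASSUMPTION: `vg` and Graphviz `dot` are installed; render_path_views
# is a hypothetical helper name, not part of vg.
render_path_views() {
    graph="$1"            # graph file, e.g. t.vg
    prefix="${2:-paths}"  # prefix for the output file names
    for flags in dp dn dw dpn; do
        vg view "-$flags" "$graph" | dot -Tsvg -o "${prefix}_${flags}.svg"
    done
}
```

Calling render_path_views t.vg tiny would write tiny_dp.svg, tiny_dn.svg, tiny_dw.svg and tiny_dpn.svg into the current directory.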

Path labeling in visualization

Right now you may be wondering where the grey saxophone is coming from. To visualize graphs with many paths, vg uses 766 Unicode pictographs (emoji) and the 8 colors of the Brewer dark28 palette, giving 766 × 8 = 6128 possible color/symbol combinations that make paths easy to tell apart by their tiny symbols. The color/symbol combination is chosen from a hash of the path name, so as long as the path names are unique we should typically expect unique symbolic identifiers within graphs with reasonable numbers of paths.

This doesn't really help for single paths, but as the number of paths increases it can really help debugging. For instance, this is a fragment of the MHC which has 9 haplotypes:

(You can render this with vg msga -f GRCh38_alts/FASTA/HLA/K-3138.fa -B 256 -k 22 -K 11 -X 1 -E 4 -Q 22 -D | vg view -dpn - | dot -Tsvg -o K-3138.svg)
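The hash-based choice of color and symbol described above can be illustrated with a rough sketch. This is not vg's actual hash function or symbol table: it uses the CRC printed by the POSIX cksum utility as a stand-in hash, and path_label_index is a hypothetical helper that only reports indices into an 8-color palette and a 766-symbol list.

```shell
# Map a path name to one of 766 * 8 = 6128 color/symbol slots.
# ASSUMPTION: cksum's CRC stands in for whatever hash vg really uses.
path_label_index() {
    name="$1"
    # cksum prints "<crc> <byte count>"; keep only the CRC.
    hash=$(printf '%s' "$name" | cksum | cut -d ' ' -f 1)
    slot=$((hash % 6128))
    # Split the slot into a color index (0 to 7) and a symbol index (0 to 765).
    echo "color=$((slot % 8)) symbol=$((slot / 8))"
}
```

The same name always maps to the same pair, which is what keeps a path's label stable between renderings. Two different names can still collide, and that becomes more likely as the number of paths grows.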

Viewing alignments

Given the importance of alignments in sequence analysis, it should be easy to view them. Existing tools that work on linear references really won't cut it in the graph world. A solution is to treat the alignments like paths and add some visual indicators that help us interpret the alignment orientation and mismatches between the alignment and the graph.

If you have your alignments in GAM (graph alignment / map) format, you can visualize them against a graph using another extension to vg view -d:

# Construct graph
vg construct -v tiny/tiny.vcf.gz -r tiny/tiny.fa > t.vg
# Index graph
vg index -x t.xg -g t.gcsa -k 11 t.vg
# Generate reads 
vg sim -l 20 -n 10 -e 0.05 -i 0.02 -x t.xg t.vg > t.reads
# Align reads
vg map -T t.reads -x t.xg -g t.gcsa > t.gam
# View alignment
vg view -d t.vg -A t.gam | dot -Tsvg -o aln.svg

The result shows blue segments for exact matches, yellow for mismatches, and green and purple ends to the alignments to indicate their relative orientation.

Large graphs

When working with a larger graph (for example, a whole-genome graph), it can become impractical to try to render the entire graph at once. Even if you were willing to wait for vg to load the whole graph into memory and convert it to dot format, Graphviz will unceremoniously segfault when asked to lay out graphs beyond a certain size, and there are not many good tools for working with enormous SVG images.

Visualizing subgraphs

One approach to dealing with large graphs is to subset them down to just the part you are interested in and draw that. If you are interested in multiple subregions of the same graph, the best way to do this is to make an xg index of the graph (with vg index, or with vg convert as in the example below) and then use vg find to pull out subgraphs around interesting nodes or path regions for visualization. You can do that like this:

# Build an xg index of the graph
vg convert --gfa-in graphs/cactus-BRCA2.gfa --xg-out > graph.xg

# Extract a subgraph centered on node 490 (-n 490), looking out
# a context of 3 nodes (-c 3) and visualize it with paths.
vg find -x graph.xg -n 490 -c 3 | vg view -dp - | dot -Tsvg -o subgraph.svg

# View a subgraph around multiple nodes of interest.
# Note that if the contexts extracted for the source nodes don't
# touch you will get a graph with multiple disconnected pieces.
# Paths may end up jumping from piece to piece in a misleading way.
vg find -x graph.xg -n 490 -n 506 -c 3 | vg view -dp - | dot -Tsvg -o subgraph2.svg

# View a subgraph centered on part of a path (here the path named "13", from position 4900 to position 5000)
vg find -x graph.xg -p 13:4900-5000 -c 3 | vg view -dp - | dot -Tsvg -o subgraph3.svg
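As the comments above note, pulling several nodes' contexts into one subgraph can produce disconnected pieces with paths that appear to jump between them. One way to sidestep that is to render each node's context as its own image. The following is a minimal sketch using only the vg find and vg view options shown above; render_node_contexts and the node_<id>.svg naming are hypothetical choices for this example.

```shell
# Render one SVG per node of interest, each with its own context.
# ASSUMPTION: `vg` and Graphviz `dot` are installed; render_node_contexts
# is a hypothetical helper name, not part of vg.
render_node_contexts() {
    xg="$1"        # xg index, e.g. graph.xg
    context="$2"   # number of context steps to include, e.g. 3
    shift 2        # the remaining arguments are node IDs
    for node in "$@"; do
        vg find -x "$xg" -n "$node" -c "$context" \
            | vg view -dp - \
            | dot -Tsvg -o "node_${node}.svg"
    done
}
```

For example, render_node_contexts graph.xg 3 490 506 would write node_490.svg and node_506.svg.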

If you have a vg file and only want to pull out a single subgraph, it can be easier to use vg mod to subset your graph directly, without making an xg index:

vg convert --gfa-in graphs/cactus-BRCA2.gfa > graph.pg
# Extract a subgraph centered on node 490 (-g 490), looking out
# a context of 3 nodes (-x 3) and visualize it with paths.
vg mod -g 490 -x 3 graph.pg | vg view -dp - | dot -Tsvg -o subgraph.svg

Visualizing large subgraphs

If you find that you do want to draw more of the graph at a time than dot can handle, you can use neato, also part of Graphviz, to draw it. You can also greatly reduce layout time by running the mars graph layout algorithm on the Graphviz-format graph before feeding it to neato.
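Swapping dot for neato in the pipelines used elsewhere on this page might look like the following. This is a minimal sketch, assuming vg and Graphviz's neato are on your PATH; render_with_neato is a hypothetical helper name. The mars preprocessing step is left out here, since its exact invocation depends on your Graphviz installation.

```shell
# Lay out a graph with neato instead of dot.
# ASSUMPTION: `vg` and Graphviz `neato` are installed; render_with_neato
# is a hypothetical helper name, not part of vg.
render_with_neato() {
    graph="$1"              # graph file readable by vg view
    out="${2:-graph.svg}"   # output SVG, defaulting to graph.svg
    vg view -dp "$graph" | neato -Tsvg -o "$out"
}
```

For example, render_with_neato graph.vg big.svg would write the neato layout to big.svg.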

Using `vg viz`

If you have a large graph in xg format, and you really do want to lay it all out in one image, you can use vg viz to draw it. vg viz uses a simple linear layout of all the nodes, which it can precalculate before rendering anything, and so it can be much faster. You can run it like this:

vg viz -x graph.xg --out graph.svg
