@casblaauw
Collaborator

@casblaauw casblaauw commented Dec 19, 2025

This PR reworks CREsted's plotting functions to be axis-focused according to a few core principles, inspired by scanpy and seaborn. Just in time for Christmas 😅

Core principles

  • Working with an ax: Core plotting functions should accept an axis and some data and plot the data on that axis.

    • This allows for composite plots and easy adding of extra annotations/elements from the user's side, by far my greatest frustration currently.
      • Some plots (inherently multi-panel plots, clustermaps) are of course exempt.
    • If no axis is provided, it should create a sensible-sized plot for the user (as it does now already), returning both the fig and axis (if show=False).
    • If multiple values are provided (e.g. predictions from multiple models), it should automatically create a figure with multiple axes (like sc.pl.umap(color=['var1', 'var2'])).
    • Labels/titles should preferably also be set on axes rather than figures, especially on single-axis plots: axis titles don't look misaligned the way suptitle does, and a function should not disturb the larger figure without being explicitly instructed to (through suptitle/sup[x/y]label).
  • Customizing the plotting function: The underlying plotting functions should be exposed through a plot_kws argument (like seaborn does for complicated functions with e.g. sns.lmplot(scatter_kws={})).

    • Default arguments in the plot should also be overridable with this, e.g. color in pl.hist.distribution, without requiring all of them to be separate plotting function arguments (which can become overwhelming).
  • Unified syntax: All plotting functions should use an identical syntax, aligning with each other and with matplotlib.

    • Figure size is now always set with width and height, and is set on plot creation rather than post-hoc resizing.
    • All plots should use render_plot unless really not possible, and things like setting axis labels, titles, tick label rotations, etc, should as much as possible be handed off to render_plot by putting things in kwargs in the plotting functions. (This allows users to prevent any change to pre-existing properties by simply setting property=None in the plotting function)
    • Things like separate unique figsave arguments are all unified to use save_path in render_plot (or manually if it's one of the few functions that doesn't use it).
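The first two principles can be illustrated with a minimal sketch. This is not CREsted's actual code: `plot_distribution` and its default values are hypothetical, and only standard matplotlib calls are used. It shows a function that draws on a caller-supplied axis (or creates its own figure), merges user `plot_kws` over its defaults, and sets labels on the axis rather than the figure.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np


def plot_distribution(data, ax=None, width=6, height=4, plot_kws=None):
    """Hypothetical axis-focused histogram; not CREsted's implementation."""
    # Reuse the caller's axis if given, otherwise create a sensibly sized figure.
    if ax is None:
        fig, ax = plt.subplots(figsize=(width, height))
    else:
        fig = ax.get_figure()

    # Defaults first, then user overrides, so any default can be replaced.
    kws = {"bins": 20, "color": "steelblue"}
    kws.update(plot_kws or {})
    ax.hist(data, **kws)

    # Labels go on the axis, leaving the rest of the figure untouched.
    ax.set_xlabel("value")
    ax.set_ylabel("count")
    return fig, ax


rng = np.random.default_rng(0)
values = rng.normal(size=200)

# Composite figure: the caller owns the layout and passes axes in.
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 4))
plot_distribution(values, ax=ax_left)
plot_distribution(values, ax=ax_right, plot_kws={"color": "darkorange", "bins": 10})
ax_right.axvline(0, color="black")  # extra annotation added by the user
```

Because the function never resizes or retitles the figure, the caller can keep adding elements to either axis afterwards.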

High-level summary:

  • All functions now use render_plot (except a few pl.patterns modisco clustermaps)
  • Almost all functions now accept an axis to plot their data on, if plotting a single panel.
  • Almost all functions now accept plot_kws to add and customize the underlying plotting function's arguments.
  • All functions take width and height to set dimensions, and multi-plot functions also take sharex/sharey.
  • render_plot now can also set ax-level labels and titles, set x/y lims, and add a grid.
  • Default axis labels now denote whether you're using log_transform or not.
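As a rough illustration of the axis-level behaviour described for render_plot, the sketch below applies titles, labels and a grid per axis, accepts either one value or a list with one value per axis, and leaves any property passed as None untouched. The helper name `apply_axis_settings` and its signature are assumptions made for this example; render_plot's real signature may differ.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np


def apply_axis_settings(axs, title=None, xlabel=None, ylabel=None, grid=None):
    """Hypothetical helper: set axis-level properties, skipping any left as None."""
    axs = list(np.atleast_1d(axs).ravel())

    def per_axis(value):
        # A list/tuple supplies one value per axis; a single value is repeated.
        if isinstance(value, (list, tuple)):
            if len(value) != len(axs):
                raise ValueError("Need exactly one value per axis.")
            return list(value)
        return [value] * len(axs)

    for ax, t, xl, yl in zip(axs, per_axis(title), per_axis(xlabel), per_axis(ylabel)):
        if t is not None:
            ax.set_title(t)
        if xl is not None:
            ax.set_xlabel(xl)
        if yl is not None:
            ax.set_ylabel(yl)
        if grid is not None:
            ax.set_axisbelow(True)  # draw the grid behind the data
            ax.grid(True, axis=grid)  # grid is "x", "y" or "both"
    return axs


fig, axs = plt.subplots(1, 2, sharey=True)
axs[1].set_xlabel("kept")  # pre-existing label that should survive
apply_axis_settings(axs, title=["model A", "model B"], xlabel=None, ylabel="prediction", grid="y")
```

Passing xlabel=None here is what preserves the label that was already on the second axis.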

Complete(ish) changelist:

  • bar:
    • All (normalization_weights, region, region_predictions, prediction) now take plot_kws and all (except region_predictions) also an axis.
    • region_predictions now uses region to plot its components
    • prediction is cleaned up and made consistent with other functions
    • All barplots now use a y-only grid by default, since an x-grid is superfluous with a categorical axis.
  • heatmap
    • All (correlations_self, correlations_predictions) now take an axis and plot_kws.
    • Colormap is now customizable.
    • Colorbar now has a label to show its units (pearson correlation), indicating log1p-transformation if used.
    • Heatmaps are now square (sns.heatmap(square=True)) by default, and default fig size was slightly changed to make it fit a square heatmap + a colorbar well.
  • hist
    • The only function, hist.distribution, now takes plot_kws and an optional axis.
    • The custom argument share_y is now renamed to sharey, as in matplotlib.
    • Added nice default axis labels, including denoting log-transformation if used.
    • Non-used plots in the plot grid (if plotting multiple classes) are now hidden by default.
  • locus
    • locus.locus_scoring now takes an axis (if only plotting the locus scoring and not the extra bigwig track) and separate plot_kws for both the locus and bigwig plots. Previous custom arguments are now folded into the plot_kws or render_plot kwargs. Highlights can now also be customized with highlight_kws.
    • locus.track was expanded from the beta function I implemented at some point.
      • Accepts an axis and plot_kws.
      • Now accepts standard track model outputs and a class_idx, instead of requiring the user to subset dimensions before passing in the data.
      • Automatically creates multiple axes for every class provided.
  • patterns
    • contribution_scores:
      • Now takes an axis (if only plotting one sequence/class). Also now takes width/height, sharex/sharey.
      • Class labels are updated to be consistently at 70% of plot height, rather than at 70% of the positive values (which made them inconsistent if there were negative values in the data), and at 2.5% of the plot width instead of at x=5 (which is the same at the default zoom level of 200bp, but can vary if the zoom level is changed).
      • Highlights can now be customized with highlight_kws.
    • _enhancer_design:
      • enhancer_design_steps_predictions now takes an axis (if plotting one class) and plot_kws. Spelling mistake in the arguments fixed. Now always creates a square grid of plots if supplying a lot of classes, following hist.distribution.
      • enhancer_design_steps_contribution_scores now takes sharex/sharey and highlights can now be customized with highlight_kws.
    • _modisco_results: These plots are more convoluted/specific (and I very rarely use them), so I didn't touch them beyond the basics.
      • All functions now take width/height, and the non-clustermap functions now all use render_plot. Clustermap functions now use g.savefig() as recommended by seaborn instead of fig.savefig.
      • clustermap_with_pwm_logos pwm positioning logic was slightly adjusted, since they were all overlapping on my test run. Now they're all neatly aligned and separated in my tests at least.
      • selected_instances now takes an axis if plotting a single index.
      • All clustermaps/heatmaps in this module should now have cmap as an argument.
  • scatter
    • class_density can now be customized more and has better defaults (figsize mostly square with or without colorbar, colorbar off by default, nicer labels)
    • class_density now has a properly colored and properly ranged colorbar.
  • violin:
    • violin.correlations now takes plot_kws and ax. Label adjusted if using log-transformed data.
  • render_plot
    • Now primarily axis-focused, taking and returning axes, and only disturbing the figure if explicitly asked to. (Fig resizing moved to plot creation, rather than post-hoc, to follow this rule).
    • Can now set axis titles, x/ylabels, and limits. Can handle both a single value (applying that to all axes) and a list of values (one per axis).
    • Rotated labels now align with their ticks, optimized to some heuristics. Primarily important with longer cell type names.
    • [x/y]_* arguments are now aligned with matplotlib's naming and its setter arguments (e.g. xlabel's fontsize is now set by xlabel_fontsize rather than x_label_fontsize, which also avoids supx_label_fontsize, which looks weird).
    • Rotation arguments renamed to [x/y]tick, since x/ylabel refers to the axis labels, not the axis tick labels.
    • Can now add a grid with nice defaults (behind data). Works for a single axis (x or y only) as well as for both axes.
  • create_plot
    • New function to replace plt.subplots calls; shorthand for: if ax is not None: fig = ax.get_figure(); else: fig, ax = plt.subplots()
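The create_plot shorthand described above can be sketched as follows. The name `create_plot_sketch` and its parameters (width, height, nrows, ncols, sharex, sharey) are assumptions chosen to match the arguments mentioned in this PR; the real create_plot may take different arguments.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt


def create_plot_sketch(ax=None, width=6, height=4, nrows=1, ncols=1, sharex=False, sharey=False):
    """Hypothetical stand-in for create_plot: reuse an axis or make a new figure."""
    if ax is not None:
        # The caller owns the figure; do not resize or otherwise disturb it.
        return ax.get_figure(), ax
    # Size is fixed at creation time instead of being changed post hoc.
    fig, ax = plt.subplots(
        nrows, ncols, figsize=(width, height), sharex=sharex, sharey=sharey
    )
    return fig, ax


fig, ax = create_plot_sketch(width=8, height=3)  # new figure
same_fig, same_ax = create_plot_sketch(ax=ax)  # existing axis is passed through
```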

Compatibility

I've endeavored to keep the code as backward compatible as possible.

  • All renamed arguments still work, and raise a warning on how to use them with the renamed version or new syntax.
  • If using show=False, render_plot does now return both a fig and ax(s), so code previously doing fig = crested.pl.func(show=False); axs = fig.axes or something similar will have to update to fig, axs = crested.pl.func(show=False).
  • title as a kwarg now refers to the axis title rather than the figure-level suptitle; the latter is now set with suptitle. This leads to better titles and nicer plots in 90% of cases, but might need some manual changes if doing multi-panel plots where you expected suptitle.
  • I've tested all base functions (everything except modisco_results) pretty thoroughly (also adjusting plot_kws, etc), but something might've slipped through.
    • For _modisco_results, I tested that all functions at least work with an old CREsted-based modisco run I had lying around, but haven't played with parameters a lot. Did not test the two TF expression-based plots (tf_expression_per_cell_type & clustermap_tf_motif), since I didn't have an elegans TF list available, so anyone testing those is appreciated.
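One common way to keep a renamed argument working while warning about it is sketched below, using the share_y to sharey rename as the example. The helper `resolve_renamed`, the simplified `distribution` signature and the choice of FutureWarning are assumptions for illustration, not necessarily how this PR implements it.

```python
import warnings


def resolve_renamed(new_value, old_value, old_name, new_name):
    """Hypothetical shim: prefer the old argument if it was passed, but warn."""
    if old_value is None:
        return new_value
    warnings.warn(
        f"Argument '{old_name}' is deprecated; use '{new_name}' instead.",
        FutureWarning,
        stacklevel=3,  # point at the user's call, not at this helper
    )
    return old_value


def distribution(sharey=False, share_y=None):
    """Simplified stand-in for a plotting function with a renamed argument."""
    sharey = resolve_renamed(sharey, share_y, "share_y", "sharey")
    return sharey
```

Old calls such as distribution(share_y=True) keep working and emit a warning, while distribution(sharey=True) is silent.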

Future work

  • Add tests for all plotting functions! This was once attempted in "Render plot used everywhere" (#66).
  • bar.region/bar.prediction automatically also plotting multiple regions?
  • Add range to contribution_scores to plot on genomic coordinates (like with track())?
  • Expand track() with stuff from other functions, like center-zoom and highlights from contribution_scores -> see Expand pl.track.locus #161
  • Look into also using render_plot for clustermaps?
  • Think about log_and_raise: currently looks bad in notebooks because it duplicates the error message, and makes errors uncatchable with try/except. Not sure what the advantages are.

This is the first (and biggest) part of a plotting overhaul. The next parts will add some new plots, rework plot categorisation, and update all tutorials.

@casblaauw
Collaborator Author

Ah shoot, comma'd arguments in docstrings (like sharex, sharey) currently repeat the same message for both in the docs. I was hoping they'd be shared on one line. I should probably split those out into separate arguments with separate explanations then.

@casblaauw
Collaborator Author

This should be ready to review. Once uv learns how to install packages again, the tests should pass; they are passing for me locally, at least.
