@thundergolfer (Collaborator) commented May 2, 2025

TODOs

  • Complete training by resuming from the checkpoint in the sd-checkpoints volume (list it with: MODAL_PROFILE=modal_labs MODAL_ENVIRONMENT=nathan-dev modal volume ls sd-checkpoints)
  • Test for linear scaling by going from 1 to 2 to 4 nodes. Match the scaling MosaicML got: https://claude.ai/share/d1a61af1-d017-4552-be59-8c2a3029cc8b
  • Set up a run_inference modal.Function to see what the trained model can do.
  • Get https://github.com/mosaicml/diffusion?tab=readme-ov-file#offline-eval working (⚠️ low prio, code doesn't work)
  • Cut out unnecessary complexity and cruft. There's too much code and config, much of it unused.
    • Getting offline eval working was so frustrating because of how much janky code there is here.
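
For the linear-scaling TODO, a minimal sketch of how the 1, 2 and 4 node runs could be compared. It assumes each run reports an aggregate throughput (e.g. samples/sec); the function names, the 10% tolerance and the numbers in the example are hypothetical placeholders, not measurements from this PR or from MosaicML.

```python
def scaling_efficiency(throughput_by_nodes):
    """Map node count -> efficiency relative to the smallest run.

    Efficiency is per-node throughput divided by the per-node throughput
    of the smallest node count, so 1.0 means perfectly linear scaling.
    """
    base_nodes = min(throughput_by_nodes)
    base_per_node = throughput_by_nodes[base_nodes] / base_nodes
    return {
        nodes: (throughput / nodes) / base_per_node
        for nodes, throughput in sorted(throughput_by_nodes.items())
    }


def is_roughly_linear(throughput_by_nodes, tolerance=0.10):
    """True if every run keeps at least (1 - tolerance) efficiency.

    The 10% default is an arbitrary assumption; choose whatever threshold
    matches the reference results being compared against.
    """
    efficiencies = scaling_efficiency(throughput_by_nodes)
    return all(e >= 1.0 - tolerance for e in efficiencies.values())


if __name__ == "__main__":
    # Placeholder throughputs (samples/sec), NOT real measurements.
    example = {1: 100.0, 2: 196.0, 4: 380.0}
    for nodes, eff in scaling_efficiency(example).items():
        print(f"{nodes} node(s): {eff:.2%} efficiency")
    print("roughly linear:", is_roughly_linear(example))
```

Comparing per-node throughput rather than wall-clock time keeps the check independent of how many steps each run trains for.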

@thundergolfer changed the title from "Jonathon/sd" to "sd2" on May 2, 2025