Open-Model-Initiative/imagegen-speedrun

ImageGen-Speedrun

We bring the spirit of nanogpt-speedrun into the omni-modal world

Motivation

Diffusion models have improved impressively, especially through encoder improvements like REPA/E-REPA/RAE/REG, as well as other areas of enhancement. Prompted by the SpeedRun-DiT experiment, we as an open source community think it is worth attempting to build a speedrun (in the spirit of the great nanogpt-speedrun and marin-speedrun) for diffusion models and multi-modal generation tasks.

Ruleset -- 2026H1

  • ImageGen SpeedRun Ruleset: Interested parties should make their submissions in accordance with the first version of the general ruleset. We are looking for first submissions targeting two directions: a basic setup with a bare-minimum ViT, and innovations with external modules (such as RAE, REG, etc.).
  • Baseline implementation for speedrun improvement on the external modules: Interested parties can use the latest SR-DiT implementation for REG as a reference if they don't have a particular reference implementation of a minimum setup in mind.

Launch Preparation Community Discussion Trails

[2026.01.23]

From the community discussion, it was agreed that the specs used for the submission metric should be merged into one basic ruleset, while keeping the flexibility to allow free-form submissions.

For the baseline code base: for improvements on the external modules (vision encoders, etc.), submitters to the initial release can use the latest SR-DiT implementation for REG as a reference if they don't have a particular reference implementation of a minimum setup in mind.

For the basic, simple setup, we call for proposals in lieu of nano-vits.

[2026.01.20]

From the community discussion, we intend to kick-start the ImageGen SpeedRun with two tracks:

  • a basic track that facilitates rapid iteration on a simple ViT/DiT-structured model in a limited hardware setting
  • an external-module track that supports the recent rapid research on the REPA series of innovations, with large datasets and a lab-level hardware setting

However, it was also agreed that although the two proposed tracks each target a different direction and scenario, the criteria and metrics we use should be largely aligned, with only certain caveats for each option. We should avoid diverging into two completely different measurement systems.

[2026.01.16]

Note that this effort is purely grassroots, with the support of the LF AI & Data Foundation for the Open Model Initiative. It will undergo some dramatic changes, but we intend to document every discussion publicly and make the process as transparent as possible.

List of TODOs

  • Assembly of initial technical expert teams
  • Draft proposal of the measurement criteria of the initial tracks
  • First version of the track requirements published
  • Confirmation of the first round CFP announcement (content, time, place)
  • Confirmation of the hardware resources
  • Confirmation of the Track review team (for cycle 2026H1)
  • Welcome and start review of the first submission
