For example, normal_id_glm or bernoulli_logit_glm. These are much faster because they need far less autodiff work than the equivalent hand-written likelihood.
We may need to refactor how our models are structured to make the best use of them. These functions also take the intercept as a separate argument, so we must avoid passing an X that contains a column of all ones, and we need a way to disable the intercept when fit_intercept=False.
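To illustrate why a column of ones is a problem once the intercept is separate, here is a minimal sketch in plain Python. The helper names drop_ones_columns and linear_predictor are hypothetical and not part of any existing codebase or of Stan. The sketch assumes X is a row-major list of rows and only demonstrates the identifiability issue; it is not how the GLM functions are implemented.

```python
import math


def drop_ones_columns(X):
    """Hypothetical helper: remove columns that are all ones.

    Returns (X_without_ones, dropped_column_indices).
    """
    if not X:
        return [], []
    n_cols = len(X[0])
    ones = [j for j in range(n_cols) if all(row[j] == 1.0 for row in X)]
    keep = [j for j in range(n_cols) if j not in ones]
    return [[row[j] for j in keep] for row in X], ones


def linear_predictor(X, alpha, beta):
    """Compute alpha + X @ beta row by row, with alpha as a separate intercept."""
    return [alpha + sum(x * b for x, b in zip(row, beta)) for row in X]


# X with a leading column of ones (assumed example data).
X = [[1.0, 2.0], [1.0, 3.0], [1.0, 5.0]]

# Two different (alpha, beta) pairs give the same predictor, because alpha
# and the coefficient on the ones column only enter through their sum.
eta_a = linear_predictor(X, 0.5, [1.5, 2.0])
eta_b = linear_predictor(X, 2.0, [0.0, 2.0])
same = all(math.isclose(a, b) for a, b in zip(eta_a, eta_b))

# Dropping the ones column leaves a single, identifiable intercept.
X_clean, dropped = drop_ones_columns(X)
eta_c = linear_predictor(X_clean, 2.0, [2.0])
```

With fit_intercept=False, one option under the same assumptions would be to keep X_clean and pass a fixed intercept of zero instead of estimating it.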