Hi.
Fantastic work on discrete diffusion, but there is one point I am still unclear about.
You mention in your paper that introducing a small amount of uniform noise, instead of masking entirely, helps prevent model collapse. The paper says the proof is in the supplementary material, but there I could only find the proof of Equation 8. Where is the actual proof of this claim, or how does the claim follow from the proof of Equation 8?
A million thanks.