LLaDA - Large Language Diffusion Model

[Figure: plot labeled "Remasking Strategy"; axis tick values not recoverable]