RL and efficient distributed pretraining • eXperiments lab • memes and training lores


Reasoning models coming (very) soon. Co-founder @pleiasfr


math provers remain my standard meter bar. if goedel hit sota at 32b, that size is probably all you need to solve the hardest problems at the moment.


interestingly, as use cases become more complex and mature, this is the first time i’m becoming constrained by model size. 30b dense or 50-150b active is likely becoming a sweet spot.


