Thread Easy

Your all-in-one companion for Twitter threads

© 2025 Thread Easy All Rights Reserved.

Explore

Newest first — browse tweet threads

Step 1. Flip a flag (I can do that!)
Step 2. Inspect all kernels (ok, I can do that.) 
Step 3. Write a custom backwards kernel 🥹

Ok the kernel impl isn’t that bad: https://t.co/X6qAqIXbJg

finbarr
Thu Nov 13 03:04:23
The MixAttention blog post from Databricks/Mosaic is great:

https://t.co/0PGr4k9e6q

modeling language at @allen_ai

finbarr
Sun Nov 09 17:04:14
RT @DBahdanau: can someone explain to me int4 training by @Kimi_Moonshot ? does it mean weights are stored in int4 and dequantized on the f…

modeling language at @allen_ai

finbarr
Fri Nov 07 16:03:27
What are good MBU numbers for LLM inference? Do people report this anywhere?

modeling language at @allen_ai

finbarr
Fri Nov 07 14:41:25
RT @hamishivi: to continue the PipelineRL glazing, @finbarrtimbers implemented PipelineRL for open-instruct a little bit ago and it ended…

modeling language at @allen_ai

finbarr
Thu Nov 06 17:11:18
thanks to the excellent work from the @vllm_project team, it was easy to implement! 

it's egregious that PipelineRL was rejected from NeurIPS. When I describe how inflight updates work to many people, they insist it's broken and can't work. it is quite novel.

modeling language at @allen_ai

finbarr
Thu Nov 06 17:11:10