Thread Easy

Your all-in-one partner for Twitter threads

© 2025 Thread Easy. All rights reserved.

Explore

Newest first — browse tweet threads


Q: My instinct is to avoid extra libraries unless absolutely necessary. Really, really don't like Triton from what I see, for instance (though I'd be less annoyed if it would generate the kernels once which I could then include statically in my project). I do need some level of tile size tuning. What do?
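One pattern that avoids a runtime-codegen dependency like Triton is to template the kernel on its tile size, instantiate a small fixed set of variants at compile time, and let a tiny benchmark harness pick the winner per shape. The sketch below is a hypothetical plain-C++ stand-in (the function names and the tile-size set are illustrative, not from the thread); in a CUDA kernel the `TILE` parameter would instead size the thread block and shared-memory footprint.

```cpp
#include <array>
#include <cstddef>

// Hypothetical example: a tiled axpy-like loop templated on tile size.
// Because TILE is a compile-time constant, the inner loop can be fully
// unrolled, which is the effect tile-size tuning is after.
template <int TILE>
void saxpy_tiled(float a, const float* x, float* y, std::size_t n) {
    std::size_t i = 0;
    for (; i + TILE <= n; i += TILE)
        for (int j = 0; j < TILE; ++j)  // unrollable inner tile
            y[i + j] += a * x[i + j];
    for (; i < n; ++i)                  // remainder elements
        y[i] += a * x[i];
}

using Kernel = void (*)(float, const float*, float*, std::size_t);

// One entry per statically instantiated tile size: the variant set is
// fixed at build time, so nothing is generated or JIT-compiled later.
constexpr std::array<Kernel, 3> kVariants = {
    saxpy_tiled<4>, saxpy_tiled<8>, saxpy_tiled<16>,
};

// A real autotuner would time each variant once per problem shape and
// cache the winner; here we only expose the table for a harness to use.
inline Kernel pick_variant(std::size_t idx) { return kVariants.at(idx); }
```

The cost is binary size (one copy of the kernel per tile size), which stays manageable as long as the variant set is a handful of hand-chosen sizes rather than a full sweep.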

I build sane open-source RL tools. MIT PhD, creator of Neural MMO and founder of PufferAI. DM for business: non-LLM sim engineering, RL R&D, infra & support.

Joseph Suarez 🐡
Fri Nov 07 13:09:05
Q: So far, fp32 kerns are pretty easy. Pretty much just writing C. What's the easiest way to do TF32, FP16, BF16 support without making a bloody mess?
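A common way to keep multi-dtype support from becoming a mess is one generic kernel body templated on the storage type, with all arithmetic done in float. The sketch below is an assumption-laden illustration in plain C++ (`scale_add` is a made-up kernel, and `float`/`double` stand in for the half-precision types): in real CUDA code `T` would be `__half`, `__nv_bfloat16`, or `float`, and the casts below would be the corresponding conversion intrinsics.

```cpp
#include <cstddef>

// Hypothetical sketch: one kernel body for every storage type T.
// Loads convert T -> float, math happens in float, stores convert back,
// so only the conversions differ per dtype, never the kernel logic.
template <typename T>
void scale_add(const T* x, const T* b, T* out, float alpha, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        float xv = static_cast<float>(x[i]);
        float bv = static_cast<float>(b[i]);
        out[i] = static_cast<T>(alpha * xv + bv);
    }
}

// Explicit instantiations pin each dtype variant in one translation
// unit; callers just link against the symbols, with no runtime codegen.
template void scale_add<float>(const float*, const float*, float*,
                               float, std::size_t);
template void scale_add<double>(const double*, const double*, double*,
                                float, std::size_t);
```

TF32 fits the same shape, since on the CUDA side it is a tensor-core math mode selected for float inputs rather than a distinct storage type.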

Joseph Suarez 🐡
Fri Nov 07 13:09:05
I just find it weird when people set up a camera to come online and then cry. I see TikToks of it, it just never sits well, and half the time, it's borderline psycho people.

Founder | Author | Speaker. Building @beltstripe. HealthTech/EdTech/Agric. I'm Not The Man Of Your Dreams. Your Imagination Wasn't This Great.

avatar for Sani Yusuf
Sani Yusuf
Fri Nov 07 13:08:43
潮流周刊 (Trend Weekly) is somehow already at issue 243; by this year it has been updating continuously for about 5 years. It mainly covers tools I've found useful for engineers, open-source products, plus my casual reading and random musings. New friends are welcome to follow and subscribe via RSS. By the way, when did you first learn about Trend Weekly?
https://t.co/8abZ9vxSJk

Father of Pake • MiaoYan • Mole • XRender

Tw93
Fri Nov 07 13:07:59
RT @ElKomnrs: @skominers Wordle 1,602 1/6
🙏

🟩🟩🟩🟩🟩

https://t.co/KvFo35FTy0

Market Design/Entrepreneurship Professor @HarvardHBS & Faculty Affiliate @Harvard Economics; Research @a16zcrypto; Editor @restatjournal; Econ @Quora; … | #QED

Scott Kominers
Fri Nov 07 13:05:15
Each release like this is such a complete humiliation and indictment of Meta, which pioneered open-weight LLMs with the first Llama model, introduced in February of 2023.

They’ve likely invested 100x to 1000x the resources (money, compute, PhD headcount, square footage, etc.) on a cumulative basis compared to any of these other Chinese labs (Kimi, Z, Qwen, DeepSeek, etc.). 

By all rights, they should be way ahead of everyone else. And yet they haven't had a state-of-the-art open-weight model, or even a modestly compelling one, since Llama 3.3, which was released at the end of 2024, nearly a year ago.

Meanwhile, the Chinese labs have been leapfrogging each other like crazy, so the latest models are extremely capable.

Could Meta still pull a rabbit out of a hat and leapfrog these other labs as a result of all the brilliant people they’ve hired at great expense over the last few months? 

I suppose they could, but even then they’d likely be getting far, far less bang for the buck compared to these half a dozen or so Chinese labs.

This would be like if the Soviets stole the nuclear bomb secrets and then ended up testing a hydrogen bomb 2 years before the US did. Unthinkable.

Makes it a lot more understandable why Zuck has been doing a massive purge of the organization. I would want to clean house, too, in this case. And better to err on the side of caution and cut deeper to be sure you’ve removed all the rot.

Former Quant Investor, now building @lumera (formerly called Pastel Network) | My Open Source Projects: https://t.co/9qbOCDlaqM

Jeffrey Emanuel
Fri Nov 07 13:05:11