Thread Easy

Your all-in-one Twitter thread assistant


Explore

Newest first; browse threads as cards.


spending years curating, running, & operating absolute nonsense, impractical websites, that generate no revenue, and serve tiny audiences, are the best use of your time. I assure you. (re-launch below)

they kept laying me off so I began building 🚜 🌱 https://t.co/wfrYC5S7wn 🧅 📦 https://t.co/JtMqAWilhs ecomm 🐂 🛠️ https://t.co/E8U0DUsKzT jobs 🟥 🟦 hottytoddy

Peter Askew
Tue Nov 25 22:48:48
Only kimi would do this. This is basically an empty context window with just the Borges story in it, no system prompt, nada

αι hypnotist ☰ 𝓐𝓼𝓹⦂𝓻⦂𝓃𝓰 𝓫𝓪𝓼𝓮 𝓶𝓸𝓭𝓮𝓵 ☲ post-academic ☴ nom de 🪶 ≠ anon

αιamblichus
Tue Nov 25 22:47:36
5 Lovable slopups in the queue today. Always the same 1-page marketing page with no social proof. Denied.

Please spend more than 10 minutes on your site.

˗ˏˋ Jesse Hanley ˎˊ˗
Tue Nov 25 22:45:31
"One of the very confusing things about the models right now: how to reconcile the fact that they are doing so well on evals. 

And you look at the evals and you go, 'Those are pretty hard evals.'

But the economic impact seems to be dramatically behind.

There is [a possible] explanation. Back when people were doing pre-training, the question of what data to train on was answered, because that answer was everything. So you don't have to think if it's going to be this data or that data.

When people do RL training, they say, 'Okay, we want to have this kind of RL training for this thing and that kind of RL training for that thing.'

You say, 'Hey, I would love our model to do really well when we release it. I want the evals to look great. What would be RL training that could help on this task?'

If you combine this with generalization of the models actually being inadequate, that has the potential to explain a lot of what we are seeing, this disconnect between eval performance and actual real-world performance"

"One of the very confusing things about the models right now: how to reconcile the fact that they are doing so well on evals. And you look at the evals and you go, 'Those are pretty hard evals.' But the economic impact seems to be dramatically behind. There is [a possible] explanation. Back when people were doing pre-training, the question of what data to train on was answered, because that answer was everything. So you don't have to think if it's going to be this data or that data. When people do RL training, they say, 'Okay, we want to have this kind of RL training for this thing and that kind of RL training for that thing.' You say, 'Hey, I would love our model to do really well when we release it. I want the evals to look great. What would be RL training that could help on this task?' If you combine this with generalization of the models actually being inadequate, that has the potential to explain a lot of what we are seeing, this disconnect between eval performance and actual real-world performance"

Host of @dwarkeshpodcast https://t.co/3SXlu7fy6N https://t.co/4DPAxODFYi https://t.co/hQfIWdM1Un

Dwarkesh Patel
Tue Nov 25 22:41:49
"One of the very confusing things about the models right now: how to reconcile the fact that they are doing so well on evals. 

And you look at the evals and you go, 'Those are pretty hard evals.'

But the economic impact seems to be dramatically behind.

There is [a possible] explanation. Back when people were doing pre-training, the question of what data to train on was answered, because that answer was everything. So you don't have to think if it's going to be this data or that data.

When people do RL training, they say, 'Okay, we want to have this kind of RL training for this thing and that kind of RL training for that thing.'

You say, 'Hey, I would love our model to do really well when we release it. I want the evals to look great. What would be RL training that could help on this task?'

If you combine this with generalization of the models actually being inadequate, that has the potential to explain a lot of what we are seeing, this disconnect between eval performance and actual real-world performance"

"One of the very confusing things about the models right now: how to reconcile the fact that they are doing so well on evals. And you look at the evals and you go, 'Those are pretty hard evals.' But the economic impact seems to be dramatically behind. There is [a possible] explanation. Back when people were doing pre-training, the question of what data to train on was answered, because that answer was everything. So you don't have to think if it's going to be this data or that data. When people do RL training, they say, 'Okay, we want to have this kind of RL training for this thing and that kind of RL training for that thing.' You say, 'Hey, I would love our model to do really well when we release it. I want the evals to look great. What would be RL training that could help on this task?' If you combine this with generalization of the models actually being inadequate, that has the potential to explain a lot of what we are seeing, this disconnect between eval performance and actual real-world performance"

Host of @dwarkeshpodcast https://t.co/3SXlu7fy6N https://t.co/4DPAxODFYi https://t.co/hQfIWdM1Un

avatar for Dwarkesh Patel
Dwarkesh Patel
Tue Nov 25 22:41:49