Explore

Browse tweet threads, newest first.

RT @casper_hansen_: RL is slow and expensive while prompt optimization is fast and cheap. I'm not convinced yet that RL is the solution to…

We're in a race. It's not USA vs China but humans and AGIs vs ape power centralization. @deepseek_ai stan #1, 2023–Deep Time «C’est la guerre.» ®1

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
Fri Dec 05 04:44:33
“i told you so”
> We were surprised to find that Claude Code with Opus 4.5 dramatically outperformed the CORE-Agent scaffold, even without fixing incorrect test cases (78% vs 42%).
> We are unsure what led to this difference. One hypothesis is that the Claude 4.5 series of models is much better tuned to work with Claude Code.
> We think studying the coupling between models and scaffolds is an important research direction going forward

so many gigabrained takes at that time, people asking in posts and discussing in GCs about what’s the reason. but almost 9 months later, only one answer wins.

tokenbender
Fri Dec 05 04:42:26
This is unserious. V3.2-thinking, one of the strongest LLMs around, is below tons of relatively weak models and even older versions of itself, like V3.1, V3.2-exp, R1-0528. Maybe the clearest case of lmarena being cooked.

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
Fri Dec 05 04:40:52
RT @AlexGDimakis: Both GEPA and OpenThoughts got oral presentations at the FoRLM workshop in Neurips (This Sunday). Congratulations to the…

Asst professor @MIT EECS & CSAIL (@nlp_mit). Author of https://t.co/VgyLxl0oa1 and https://t.co/ZZaSzaRaZ7 (@DSPyOSS). Prev: CS PhD @StanfordNLP. Research @Databricks.

Omar Khattab
Fri Dec 05 04:39:59
RT @isaiah_p_taylor: Today, we took the Nova Core critical for the final time. 3 weeks, 10 configurations, and 36 different critical and su…

https://t.co/N3tfDNkGx4 | founder @trychroma

anton 🇺🇸
Fri Dec 05 04:35:59
DEEP is in effect the robotics wing of ZJU. A pilgrimage there makes sense for everyone interested in the future of robotics (+ Shenzhen and Shanghai)

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
Fri Dec 05 04:17:34