Explore | Thread Easy - Unroll Twitter Threads | Read, Summarize, and Create

proof-of-concept method that trains models to report when they break instructions or take unintended shortcuts:

President & Co-Founder @OpenAI

Greg Brockman

Wed Dec 03 21:28:59

We just released our first Hermes model trained entirely with Distro on Psyche, Hermes 4.3 on ByteDance Seed 36B! Outcome was actually better than the centralized comparison run, and even brought it to top spot on the RefusalBench leaderboard, weights on HF! Check it out:

Cofounder and Head of Post Training @NousResearch, prev @StabilityAI Github: https://t.co/LZwHTUFwPq HuggingFace: https://t.co/sN2FFU8PVE

Teknium (e/λ)

Wed Dec 03 21:28:01

RT @iwiwi: At #NeurIPS2025, our team at @SakanaAILabs will be presenting two posters on agents and inference-time scaling for challenging c…

Building @SakanaAILabs 🧠

hardmaru

Wed Dec 03 21:26:34

RT @kimmonismus: Google cooked so hard. Not gonna lie, this feels like the future is here. Now develop Google Glasses with enough battery…

Founder 📈 @parqetapp Host of 🎙 @minimalempires Prev. @stripe

Sumit Kumar

Wed Dec 03 21:23:09

but this feels like it has to be right in the end!

1. unsupervised learning on everything to understand the world. 1st person, 3rd person, car cameras, 2d animation, cctv, instructional videos, text, images, any and all robotics data, etc.
2. that should transfer downstream to a model you finetune with teleoperation data. your robotics model uses its deep latent understanding of “what” a coffee mug really is and what it is used for to understand your human demonstrations. also finetuning in a motor control and action head shouldn’t be hard here if data not in pretraining
3. a bit of real world on-policy RL with your model deployed in the wild (or some in sim/in lab) is what you need to seal the deal.

dei ex machina @openai, past: posttraining o3/4o, sora 1 & 2, applied research

will depue

Wed Dec 03 21:17:56

We're expanding our partnership with @Snowflake in a multi-year, $200 million agreement. Claude is now available to more than 12,600 Snowflake customers, helping businesses to quickly and easily get accurate answers from their trusted enterprise data, while maintaining rigorous security standards. Read more: https://t.co/4pTJBtF4E6

We're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant @claudeai on https://t.co/FhDI3KQh0n.

Anthropic

Wed Dec 03 21:15:34

Explore

Newest first, browse threads as cards
