Explore

Newest first — browse tweet threads


RT @michielsdj: Excited to release our first model. We've been working on it for a while, and it came out of the oven pretty well! I've bee…

@cursor_ai, created https://t.co/n8cSXZO4VH, started @SupermavenAI and @Tabnine, formerly @OpenAI

Jacob Jackson
Wed Oct 29 17:34:12
'ON-THE-JOB TRAINING': Cuomo takes swipe at Democrat Zohran Mamdani in closing pitch to NYC voters

Read more:

Fox News
Wed Oct 29 17:34:02
Before and after photo of a startup founder:

Founder and CEO of @acquiredotcom. https://t.co/wRMIssDmhl has helped 100s of startups get acquired and facilitated $500m+ in closed deals.

Andrew Gazdecki
Wed Oct 29 17:34:01
RT @dirtyuncleleo: @SperglerAcolyte It was $1.50 yesterday. The returns on $snap are insane

AI Optimist. Empiricist, not 'rationalist'. Anti world government.

renji
Wed Oct 29 17:33:59
RT @SperglerAcolyte: Liberals believe in retarded infinite money glitches. Unreal.

AI Optimist. Empiricist, not 'rationalist'. Anti world government.

renji
Wed Oct 29 17:33:49
Making decisions with imperfect information at the frontier AI labs

Please follow @zpysky1125, lead researcher at MiniMax AI, creators of M2: the current leading OSS model and, to my knowledge, the first OSS interleaved-thinking model.

The blog below by @zpysky1125 is a beautiful read 💕 if you are interested in what goes on in the minds of the people who train state-of-the-art (SOTA) LLMs.

It discusses the kinds of choices they face and how they make decisions with imperfect information. The issue is that you cannot run many experiments when training LLMs, since each run is very expensive. This is unlike conventional ML.

Pengyu very honestly discusses why they had to discard, or rather shelve, their earlier innovation of linear attention, which they used for the MiniMax M1 model, and go back to full attention for M2.
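
For context, here is a minimal sketch of the tradeoff being discussed: standard softmax attention is quadratic in sequence length, while kernelized linear attention avoids ever materializing the n-by-n matrix. This is a generic illustration (a Katharopoulos-style elu feature map in PyTorch); the feature map, shapes, and framing are assumptions for illustration, not MiniMax's actual M1/M2 code.

```python
import torch
import torch.nn.functional as F

def full_attention(q, k, v):
    # Standard softmax attention: exact, but the (n, n) score matrix
    # makes it O(n^2) in sequence length n.
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5  # (n, n)
    return F.softmax(scores, dim=-1) @ v

def linear_attention(q, k, v, eps=1e-6):
    # Kernelized "linear" attention: replace softmax with a positive
    # feature map phi, so only a (d, d) summary is built -- O(n) in
    # sequence length, at some quality cost.
    phi = lambda x: F.elu(x) + 1.0
    q, k = phi(q), phi(k)
    kv = k.transpose(-2, -1) @ v                            # (d, d)
    z = q @ k.sum(dim=-2, keepdim=True).transpose(-2, -1) + eps
    return (q @ kv) / z

n, d = 1024, 64
q, k, v = (torch.randn(n, d) for _ in range(3))
print(full_attention(q, k, v).shape, linear_attention(q, k, v).shape)
```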

They abandoned, with a heavy heart, a technology tree they themselves invented. They discuss it with great honesty. It is heartfelt.

Pengyu discusses the advantages of the proven path in the short run, even if it may be less efficient. He also discusses in which situations they will revisit the decision on linear attention. You will get so much to learn!

This is a rare insight into the minds of decision makers at the frontier labs. American labs, please give us more of this kind of sharing.

TL;DR: Pick your battles wisely.

Thanks @Hailuo_AI and Pengyu (@zpysky1125).

@dwarkesh_sp, @himanshustwts please have Chinese researchers (from Chinese labs) on your podcast 🇨🇳🇺🇸💕.

AI @amazon. Open Source AI enjoyer. GPU rich, but loves the GPU poor. Pied piper to AI agents. Hill climber with RL. All views personal!

GDP
Wed Oct 29 17:33:40