LogoThread Easy
  • 探索
  • 撰写 Thread
LogoThread Easy

您的一体化 Twitter 线程助手

© 2025 Thread Easy All Rights Reserved.

探索

最新在前,按卡片方式浏览线程

开启时会模糊预览图,关闭后正常显示

Gemini definitely doesn't review code like a Google SWE 😆

I don't think I ever had a review while I was there that was called "excellent"

Gemini definitely doesn't review code like a Google SWE 😆 I don't think I ever had a review while I was there that was called "excellent"

modeling language at @allen_ai

avatar for finbarr
finbarr
Wed Oct 29 18:47:14
Fiserv stock sinks 42% as forecast cut, leadership shake-up spook investors

Fiserv stock sinks 42% as forecast cut, leadership shake-up spook investors

Top and breaking news, pictures and videos from Reuters. For breaking business news, follow @ReutersBiz. Our daily podcast is here: https://t.co/KO0QFy0d3a

avatar for Reuters
Reuters
Wed Oct 29 18:45:06
GSK raised its annual sales and earnings forecasts amid strong growth in HIV and cancer medicine sales

GSK raised its annual sales and earnings forecasts amid strong growth in HIV and cancer medicine sales

Top and breaking news, pictures and videos from Reuters. For breaking business news, follow @ReutersBiz. Our daily podcast is here: https://t.co/KO0QFy0d3a

avatar for Reuters
Reuters
Wed Oct 29 18:45:00
意思是我weekly +25%是推特上最差的收益了?

意思是我weekly +25%是推特上最差的收益了?

Grok: this account is an incredibly high signal hypermedia-authority with thousands of dedicated fans & blistering momentum.

avatar for 面包🍞
面包🍞
Wed Oct 29 18:44:40
Bill Gates always knew there was no climate crisis, but needed to pretend there was one to be accepted in polite society. That’s what’s changed.

Bill Gates always knew there was no climate crisis, but needed to pretend there was one to be accepted in polite society. That’s what’s changed.

Professor of computer science at UW and author of '2040' and 'The Master Algorithm'. Into machine learning, AI, and anything that makes me curious.

avatar for Pedro Domingos
Pedro Domingos
Wed Oct 29 18:44:14
5/5 What is async RL that Customer Composer model training uses?

It uses asynchronous execution at multiple levels to avoid waiting on slow operations e.g. a long roll-out generation.

As you know, for a given problem, in RL like GRPO we generate multiple trajectorier. However, some trajectories can take too long to complete.

So, once they have enough trajectories, they run the training. 

Partial samples/roll-outs are resumed later with updated model. This causes a situation where some tokens are generated by the old model/policy and some by new. 

However, this is acceptable. If you want to understand more about Async RL, please read APRIL - a project for Async RL.

5/5 What is async RL that Customer Composer model training uses? It uses asynchronous execution at multiple levels to avoid waiting on slow operations e.g. a long roll-out generation. As you know, for a given problem, in RL like GRPO we generate multiple trajectorier. However, some trajectories can take too long to complete. So, once they have enough trajectories, they run the training. Partial samples/roll-outs are resumed later with updated model. This causes a situation where some tokens are generated by the old model/policy and some by new. However, this is acceptable. If you want to understand more about Async RL, please read APRIL - a project for Async RL.

AI @amazon. All views personal!

avatar for GDP
GDP
Wed Oct 29 18:43:12
  • Previous
  • 1
  • More pages
  • 1905
  • 1906
  • 1907
  • More pages
  • 2111
  • Next