the algo doesn't promote nuance as much unfortunately, so i give that here.
another major reason, almost all RL papers use veRL. veRL didn't have fp16 support lmao
3 tweets · November 1, 2025, 04:39