Thread Easy

Your one-stop companion for Twitter threads

© 2025 Thread Easy All Rights Reserved.

Explore

Newest first — browse tweet threads


The idea behind significantly improving performance on hard real-world tasks is to train a value function, condition the model on advantages computed from that value function, and run an iterative improvement loop in which the model learns from its own data.


Research Scientist @physical_int. Formerly Google DeepMind

Danny Driess
Tue Nov 18 00:22:42
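
The three steps named in the tweet (fit a value function, condition the policy on advantages, repeat on the model's own data) can be illustrated with a toy sketch. This is not the author's implementation: the tabular task, the `REWARD` table, the binary "advantage >= 0" indicator, and every function name below are assumptions made up for illustration.

```python
import random

# Hypothetical toy task: REWARD[state][action] is the return of a one-step episode.
# Whole-number rewards keep the float arithmetic exact.
REWARD = [[1.0, 5.0, 9.0], [8.0, 2.0, 4.0]]
N_STATES, N_ACTIONS = 2, 3


def fit_value(data):
    """Value function: mean observed return per state."""
    totals, counts = [0.0] * N_STATES, [0] * N_STATES
    for s, _, r in data:
        totals[s] += r
        counts[s] += 1
    return [totals[s] / counts[s] if counts[s] else 0.0 for s in range(N_STATES)]


def fit_policy(data, value):
    """Fit pi(a | s, I) by counting, where I = 1 if the advantage r - V(s) >= 0."""
    # counts[s][I][a], initialised to 1 (add-one smoothing keeps some exploration).
    counts = [[[1.0] * N_ACTIONS for _ in range(2)] for _ in range(N_STATES)]
    for s, a, r in data:
        indicator = 1 if r - value[s] >= 0 else 0
        counts[s][indicator][a] += 1.0
    return [[[c / sum(row) for c in row] for row in per_state] for per_state in counts]


def improve(iterations=5, episodes=400, seed=0):
    """Iterative loop: collect data with the current policy, refit value and policy."""
    rng = random.Random(seed)
    uniform = [1.0 / N_ACTIONS] * N_ACTIONS
    policy = [[uniform[:], uniform[:]] for _ in range(N_STATES)]
    for _ in range(iterations):
        data = []
        for _ in range(episodes):
            s = rng.randrange(N_STATES)
            # Act while conditioning on the "positive advantage" indicator (I = 1).
            a = rng.choices(range(N_ACTIONS), weights=policy[s][1])[0]
            data.append((s, a, REWARD[s][a]))
        value = fit_value(data)
        policy = fit_policy(data, value)
    return policy


if __name__ == "__main__":
    final = improve()
    for s in range(N_STATES):
        print(s, [round(p, 3) for p in final[s][1]])
```

In this sketch the policy is always sampled with the indicator set to 1, so each round of data comes from the model's current best guess at "better than average" behavior, and the refit value function raises the bar for the next round.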