hypercurious :) founder @ness_labs • neuroscientist @KingsIoPPN • author of Tiny Experiments • personal science, systematic curiosity, experimental thinking ꩜⋆✦


Founder | Author | Speaker Building @beltstripe I'm Not The Man Of Your Dreams. Your Imagination Wasn't This Great.


Opening portals to handheld VR at https://t.co/A2JMItorCV. Problems soluble, potential to improve invariant.


5/5 What is the async RL that Customer Composer model training uses? It runs asynchronously at multiple levels to avoid waiting on slow operations, e.g. a long rollout generation. As you know, in RL methods like GRPO we generate multiple trajectories for a given problem. However, some trajectories can take too long to complete. So once enough trajectories have finished, training runs on those. Partial samples/rollouts are resumed later with the updated model. This means some tokens in a rollout are generated by the old model/policy and some by the new one. However, this is acceptable. If you want to understand more about async RL, please read APRIL, a project for async RL.
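
To make the partial-rollout idea concrete, here is a toy sketch in Python. It is not Composer's or APRIL's actual code: `Rollout`, `run_async_rl`, `target_len`, `batch_size`, and `budget` are all hypothetical names, "generation" just records which policy version produced each token, and a "training step" is only a version bump.

```python
from dataclasses import dataclass, field


@dataclass
class Rollout:
    target_len: int  # tokens needed to finish (hypothetical stand-in for hitting EOS)
    tokens: list = field(default_factory=list)  # policy version behind each token

    @property
    def done(self) -> bool:
        return len(self.tokens) >= self.target_len


def run_async_rl(lengths, batch_size, budget):
    """Toy loop: train once `batch_size` rollouts finish; resume the rest later."""
    if batch_size < 1 or budget < 1:
        raise ValueError("batch_size and budget must be at least 1")
    pending = [Rollout(n) for n in lengths]
    finished, batches = [], []
    policy_version = 0
    while pending:
        # Each pending rollout gets at most `budget` new tokens from the current policy.
        for r in pending:
            for _ in range(budget):
                if r.done:
                    break
                r.tokens.append(policy_version)
        finished += [r for r in pending if r.done]
        pending = [r for r in pending if not r.done]
        # Train as soon as enough rollouts are complete; do not wait for stragglers.
        while len(finished) >= batch_size:
            batches.append(finished[:batch_size])
            finished = finished[batch_size:]
            policy_version += 1  # stands in for a gradient update
    if finished:  # flush leftovers once nothing is pending
        batches.append(finished)
        policy_version += 1
    return batches, policy_version


batches, version = run_async_rl([2, 3, 10, 2], batch_size=2, budget=3)
print(batches[1][1].tokens)  # [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
```

The 10-token rollout starts under policy 0, is still unfinished when the first batch trains, and resumes under policy 1, so its tokens carry mixed versions. That is the old-policy/new-policy mix described above.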


back in the prehistory of 3d computer vision (2016) we would use probabilistic / ebm models to fit shape templates to street scenes


Built Tweet Hunter, Taplio (sold $8m) Growing https://t.co/OyNJ8ZUyOh - https://t.co/jS9GQJ5Ps8 - https://t.co/EFUcKeBbpU - https://t.co/JkVOl1O0S1 - https://t.co/KG9PgxJabg Sharing weekly tips about growth: https://t.co/ereQodN3Ov
