Thread Easy

Your complete partner for Twitter threads

Explore

Newest first — browse tweet threads

Nothing embarrasses me more as a Canadian than how our entrepreneurs beg for government funding 

Raise from private markets! 

https://t.co/GVjrSt1IoE

modeling language at @allen_ai

finbarr
Mon Dec 08 14:54:30
That leaves the central node: for this one you basically need to have explored *all the branches* to know where to go, and until all of this has been built, the gradient sees nothing.

It is exactly reasoning: until you have explored the complex options, you cannot know the good one.

5/5

Research Scientist @meta (FAIR), Prof. @Unige_en, co-founder @nc_shape. I like reality.

François Fleuret
Mon Dec 08 14:54:26
Now, when you train a GPT on that, every node (but the central one) in the solution path is obvious to predict: it has at most two neighbors, so just check which they are, exclude the one already in the path, and you are done. Gradient descent is super happy. Logits are great.

4/5

François Fleuret
Mon Dec 08 14:54:25
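
As a minimal illustration of the point in 4/5 (a sketch, not from the thread; the toy adjacency below is hypothetical, and the graph construction itself is described in 2/5 below), the next node after any non-central node is forced, because exactly one neighbor is not already on the path:

# Sketch, not from the thread: a toy star graph with center 0 and two
# branches, 0-1-2 and 0-3-4, stored as adjacency lists.
adj = {0: [1, 3], 1: [0, 2], 2: [1], 3: [0, 4], 4: [3]}

def forced_next(path, adj):
    # Return the single possible continuation of `path`, or None when the
    # choice is ambiguous (at the center) or the path is complete (at an
    # extremity).
    candidates = [n for n in adj[path[-1]] if n not in path]
    return candidates[0] if len(candidates) == 1 else None

print(forced_next([0, 1], adj))  # 2: node 1's other neighbor is already used
print(forced_next([0], adj))     # None: at the center the branch is ambiguous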
Had a chat at NeurIPS with @_vaishnavh about the failure of next-token prediction + teacher forcing, and he has this wonderful minimal synthetic problem that IMO encompasses all the problems with / reasons for "reasoning"

1/5

A sequence will be generated by first building a "star graph" with a central node and several "paths" extending from there. So the central node has as many neighbors as there are branches, the branch extremities have a single neighbor, and every other node has two.

2/5

François Fleuret
Mon Dec 08 14:54:23
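
A minimal sketch of the star-graph construction described in 2/5 above, assuming a plausible sequence format (shuffled edge list, then start and goal, then the solution path); the exact encoding in the original problem is an assumption here:

import random

def star_graph(n_branches, branch_len):
    # Central node 0 with n_branches paths of branch_len nodes each, so the
    # center has n_branches neighbors, extremities have one, all others two.
    edges, paths, node = [], [], 1
    for _ in range(n_branches):
        path, prev = [0], 0
        for _ in range(branch_len):
            edges.append((prev, node))
            path.append(node)
            prev, node = node, node + 1
        paths.append(path)
    return edges, paths

def make_example(n_branches=3, branch_len=4):
    # One training sequence: shuffled edges, "start goal", solution path.
    edges, paths = star_graph(n_branches, branch_len)
    target = random.choice(paths)
    random.shuffle(edges)
    edge_str = " ".join(f"{a}-{b}" for a, b in edges)
    return f"{edge_str} | 0 {target[-1]} : " + " ".join(map(str, target))

print(make_example())

With teacher forcing on such sequences, every path token after the first branch choice is locally determined, which is the asymmetry 4/5 and 5/5 describe.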
why settle for just liking fancy things when you can also like everything else this world has to offer at the same time

curious guy creating things @ https://t.co/HXWladhJaA - up and coming wife guy

jack friks
Mon Dec 08 14:52:38
i would like to take this time to thank @marclou for making shipfast on an older version of next which means almost all my projects were marked safe from this vector of attack

we are lucky guys

jack friks
Mon Dec 08 14:48:21