Thread Easy

Your all-in-one partner for Twitter threads

© 2025 Thread Easy All Rights Reserved.

Explore

Newest first — browse tweet threads

KL is how many words somebody has to whisper into your ear to fix your misconception about the world.

Research Scientist @meta (FAIR), Prof. @Unige_en, co-founder @nc_shape. I like reality.

François Fleuret
Thu Nov 13 06:26:53
Conference reviews are the archetypal example of the confusion between the form and the essence of things.

The *only* question is: "would the collective body of knowledge benefit from this being presented at the conference"

François Fleuret
Thu Nov 13 06:20:11
You can't just tell your model "either you learn to write Shakespearean poetry and algebraic topology, or ... OR ... you mess up Q(Z|X) to make it dumb"

François Fleuret
Thu Nov 13 05:34:46
Black is the vanilla 1.5B model with 28 layers; blue is the free transformer from the arXiv paper, with a +3.5% overhead in compute and memory; red and orange are two variants of v2, with a +1.3% overhead and far simpler code.

François Fleuret
Wed Nov 12 19:02:45
Weirdest graph ever, but this thing is robust. The recovery on HumanEval+ is spectacular.

Anyway, version +1 is already running, we'll see.

François Fleuret
Wed Nov 12 07:01:00
Today I decided to replace the KL penalty with some yolo crazy approach, which worked. When looking at it closely it is the standard KL penalty with a minor but very important change that assures a property I tried to obtain months ago without success. Today is a good day.

François Fleuret
Tue Nov 11 21:53:16