Thread Easy

Your all-in-one Twitter thread assistant

© 2025 Thread Easy All Rights Reserved.

Explore

Newest first; browse threads as cards

KL is how many words somebody has to whisper into your ear to fix your misconception about the world.

Research Scientist @meta (FAIR), Prof. @Unige_en, co-founder @nc_shape. I like reality.

François Fleuret
Thu Nov 13 06:26:53
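
One standard way to read the KL remark above, offered as an illustration rather than as the author's own formulation: KL(P || Q), measured in bits, is the expected number of extra bits per sample paid when encoding data drawn from the true distribution P with a code optimized for a mistaken belief Q. The distributions `truth` and `belief` below are made-up toy values, and `kl_bits` is a hypothetical helper name.

```python
import math

def kl_bits(p, q):
    """KL(p || q) in bits for two discrete distributions over the same outcomes.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute nothing.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy values, chosen only for illustration.
truth = [0.5, 0.25, 0.25]    # how the world actually behaves (P)
belief = [0.25, 0.25, 0.5]   # the misconception (Q)

extra_bits = kl_bits(truth, belief)   # cost of holding the wrong belief
no_cost = kl_bits(truth, truth)       # a correct belief costs nothing extra
```

Note that the measure is asymmetric: `kl_bits(belief, truth)` is in general a different number from `kl_bits(truth, belief)`.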
Conference reviews are the archetypal example of the confusion between the form and the essence of things.

The *only* question is: "would the collective body of knowledge benefit from this being presented at the conference?"

François Fleuret
Thu Nov 13 06:20:11
You can't just tell your model "either you learn to write Shakespearean poetry and algebraic topology, or ... OR ... you mess up Q(Z|X) to make it dumb"

François Fleuret
Thu Nov 13 05:34:46
Black is the vanilla 1.5B, 28-layer model; blue is the free transformer from the arXiv paper, with a +3.5% overhead in compute and memory; red and orange are two variants of v2, with a +1.3% overhead and far simpler code.

François Fleuret
Wed Nov 12 19:02:45
Weirdest graph ever, but this thing is robust. The recovery on HumanEval+ is spectacular.

Anyway, version +1 is already running, we'll see.

François Fleuret
Wed Nov 12 07:01:00
Today I decided to replace the KL penalty with some yolo crazy approach, which worked. When looking at it closely it is the standard KL penalty with a minor but very important change that assures a property I tried to obtain months ago without success. Today is a good day.

François Fleuret
Tue Nov 11 21:53:16