LogoThread Easy

Your all-purpose partner for Twitter threads


Explore

Newest first — browse tweet threads


Speechless. After fiddling with this for ages: Cloudflare now can't create Pages by default, only Workers. It took me forever to discover that only a small line of text at the bottom of the creation screen leads to the Pages creation page. Every other button only creates Workers. Cloudflare has been steadily de-emphasizing Pages; soon it will be removed entirely.


Dreaming with my feet on the ground / indie developer / Adventist / new dad 👨‍🍼

KIWI
Thu Nov 27 15:34:43
Compare and contrast:

P.W. Anderson, *More is Different* (1972)
https://t.co/hUBNV9Cv6u 

Edna Ullmann-Margalit, *Invisible Hand Explanations* (1978)
https://t.co/FRXVm4KFys

The arguments look structurally almost identical to me.

Does this not strike you as evidence that free will is an illusion?


Wonderer. Amor fati. Scaling trust.

Michael Frank Martin
Thu Nov 27 15:32:33
RT @LingYang_PU: Thank @_akhaliq for introducing our efficient latent-based multi-agent systems (LatentMAS).

Paper: https://t.co/yH9RFRmy7…


AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo, submit papers here: https://t.co/UzmYN5XOCi

AK
Thu Nov 27 15:31:12
So DeepSeek-Math-V2.

It could be subtitled: "how to train better verifiers?" and the bulk of it is simply… better data work and synth pipelines (even if all models are trained with RL).

DeepSeek further distances itself from the initial promises of spontaneous self-verification held by R0, simply because the approach isn't scalable: tortuous reasoning finally yielding correct answers is still very brittle and prone to failure.

The project starts with human annotation, except these are high-level expert annotations, representing in themselves a wider industry shift where we try to scale up/automate the absolute best data-quality process we can find. Here this process also leverages something we noticed while building the math pipeline for SYNTH: humans (and properly guided models) can identify instances of tortured reasoning without any reference to the final answers.

The paper also mentions a technique likely to become highly used in synthetic pipelines: "meta-verifiers", basically assessing the assessment process itself. Because even the verifier can get reward hacked: "when evaluating flawed proofs (where 𝑠𝑖 < 1) during training, the verifier can receive full reward by predicting the correct scores while hallucinating non-existent issues".
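
As a minimal sketch of that reward-hacking check, assuming a toy interface where the verifier returns both a score and a list of claimed issues (every name below is hypothetical, not from the paper), a meta-verifier could pay out only when the score is right and each flagged issue actually appears in the proof:

```python
# Toy meta-verifier (hypothetical API, not the paper's code): reward the
# verifier's assessment itself, not just its predicted score.
from dataclasses import dataclass

@dataclass
class Verdict:
    score: float        # verifier's predicted quality score in [0, 1]
    issues: list[str]   # spans the verifier claims are flawed

def meta_verifier_reward(proof: str, verdict: Verdict, ref_score: float) -> float:
    score_ok = abs(verdict.score - ref_score) < 0.05
    # Hallucinated issues (spans absent from the proof) zero out the reward,
    # closing the "correct score + non-existent issues" loophole.
    issues_grounded = all(span in proof for span in verdict.issues)
    return 1.0 if (score_ok and issues_grounded) else 0.0

# A flawed proof (ref_score < 1) where the verifier guesses the right score
# but invents an issue that is nowhere in the text: no reward.
proof = "Assume n is even. Then n = 2k, so n^2 = 4k^2 is divisible by 4."
hacked = Verdict(score=0.6, issues=["division by zero in step 3"])
print(meta_verifier_reward(proof, hacked, ref_score=0.6))  # 0.0, not 1.0
```

The substring check is of course a stand-in for whatever grounding test the real pipeline uses; the point is only that the meta-reward conditions on the claimed issues, not the score alone.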

Human annotations are first done on synthetic drafts, then in turn serve to build evaluators, which recursively produce better proofs and increasingly better solving paths. Overall, the process creates a positive feedback loop: "The proof verifier and generator create a synergistic cycle: the verifier improves the generator, and as the generator improves, it produces new proofs that challenge the verifier’s current capabilities."
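
Read as pseudocode, that cycle is an alternating optimization. Below is a deliberately toy, self-contained sketch (the scalar "skill"/"strictness" dynamics and all names are invented for illustration, not the paper's pipeline):

```python
# Toy generator/verifier cycle: a "proof" is a scalar quality in [0, 1],
# the generator's skill rises with the reward it collects, and the
# verifier's threshold tightens toward the generator's new skill level.
import random

random.seed(0)

def run_cycle(rounds: int = 3, n_problems: int = 8) -> None:
    skill, strictness = 0.4, 0.5
    for r in range(rounds):
        # Generator emits proofs whose quality tracks its current skill.
        proofs = [min(1.0, max(0.0, skill + random.uniform(-0.2, 0.2)))
                  for _ in range(n_problems)]
        # Verifier grants full reward only above its strictness threshold.
        scores = [1.0 if p >= strictness else p for p in proofs]
        mean_reward = sum(scores) / len(scores)
        # "The verifier improves the generator": reward drives skill up.
        skill = min(1.0, skill + 0.1 * mean_reward)
        # "...new proofs that challenge the verifier": raise the threshold.
        strictness = min(0.95, max(strictness, skill))
        print(f"round {r}: reward={mean_reward:.2f} "
              f"skill={skill:.2f} strictness={strictness:.2f}")

run_cycle()
```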

All training of verifiers/meta-verifiers/final model is done with RL (which makes sense for very large models, as SFT/midtrain can get quite destructive). Yet, even then, the increasing complexity of RLVR, which cannot be limited to a simple formal "verification", calls for the development of integrated, increasingly self-sufficient synthetic pipelines.

Once more, math provers bring LLM research to the actual frontier and lead to creative and elegant solutions that are likely to irrigate the entire field in the months to come.


Artisanal baker of reasoning models @pleiasfr

Alexander Doria
Thu Nov 27 15:27:41
An imagined conversation between Edna Ullmann-Margalit and Thomas Schelling.

With Charlie Munger summing things up at the end.

https://t.co/qW13PnOMWb


Wonderer. Amor fati. Scaling trust.

Michael Frank Martin
Thu Nov 27 15:26:32
if you cut down your scope to just performance tuning in software engineering then you'd observe that opus 4.5 "gets it".


it can be a coworking intern better than most humans for me, i think i need to assemble a team.

tokenbender
Thu Nov 27 15:19:38