Thread Easy

Your all-in-one partner for Twitter threads

© 2025 Thread Easy. All Rights Reserved.

Explore

Newest first — browse tweet threads


This Deep Research agent from China's Alibaba slipped under everyone's radar 🤯

It autonomously searches the web, plans multi-step reasoning, and synthesizes information like a researcher. Outperforms OpenAI, Gemini, and Kimi Deep Research.

100% open-source.


More such AI tools and projects in https://t.co/BvTc8nQQW5: get access to 100+ AI Agent, RAG, LLM, and MCP tutorials with open-source code, all for FREE.

Unwind AI
Mon Nov 03 03:15:03
OpenAI palace-drama update: Ilya's deposition has leaked, and we finally have an account from someone who was actually there!

PPT really is a useful tool: even Ilya used a PPT to move against Sam. I've collected all the key points 👇:

· Ilya admitted he had been waiting for most of the board's relationships with Sam to sour so that he could propose removing Sam

· Ilya had been considering removing Sam for at least a year, but had no long-term plan

· Ilya prepared a 52-page PPT and sent it to the board's independent directors

· He sent the PPT via self-destructing messages, out of fear it would be leaked to Sam

· He also drafted a PPT criticizing Greg and sent it to the board

· The PPT's core accusation against Sam was that "Sam exhibits a consistent pattern of lying, undermining his executives, and pitting his executives against one another"

· Most of the screenshots in the PPT came from Mira

· One core example of Sam sowing discord in the PPT: Mira told him that Sam had stirred up conflict between Daniela and Mira

· Another example: Sam lied, undermined the relationship between Mira and Ilya, and pitted Jakub against Ilya

· It also covers the Anthropic CEO: back when he had not yet left OpenAI, Dario wanted to run all of OpenAI's research divisions and demanded that Greg be fired

· Ilya felt Sam should never even have considered those terms and should have refused outright; but Sam evidently did consider accepting, and had he accepted, today's Anthropic might never have existed

Reading the content: Ilya had long disagreed with Sam's management style, but he had no detailed plan, relied entirely on second-hand information from Mira, and was nudged into acting by the board, only to bear the worst of the consequences himself. It seems the minds of top scientists really aren't built for palace intrigue.

The content comes from an hour of Ilya's leaked videotaped deposition in the related case heard by the U.S. District Court for the Northern District of California, Oakland Division


It seems top scientists put all their brainpower into research; palace intrigue and office politics really aren't their game, and the blunders just kept coming

歸藏(guizang.ai)
Mon Nov 03 03:14:00
Have all the people insulting me blocked me? If I had seen this tweet, I'd most likely have fired back.
What era are we in, and people are still lobbing verbal broadsides?


Growth Coach | Helping creators build their personal brand on X. WeChat official account: PandaTalk8

Mr Panda
Mon Nov 03 03:13:17
everyone is much more nervous about capitalizing on my recent stroke of luck than i am and i think it’s funny… i have gotten lucky before and in the end all things that happen fast come down equally fast. i am now enjoying it, and have a plan, and not worrying all day.


curious guy creating things @ https://t.co/HXWladhJaA - up and coming wife guy

jack friks
Mon Nov 03 03:10:51
RT @aidanshandle: I built an art installation for a halloween party this year!

Meet Aura Leo: a 1920s lion sculpture reborn for Halloween.…


investing @a16z // curating https://t.co/ssslqn6eo7

Ryan McEntush
Mon Nov 03 03:09:42
[On using Continuous Latent Space Vectors in the context windows of Transformers and LLMs] #SundayHarangue

There is a lot of chatter about how vectors from continuous latent space can make transformers solve problems efficiently. Some of these arguments run counter to conservation of computational complexity, IMHO. 

The arguments/analogies revolve around viewing these tokens as "superposition" (think union) of discrete tokens.

As background, transformers operate in a latent space L s.t. every (linguistic) token corresponds to a vector in L. This mapping is, however, one-sided: not every vector in L corresponds to a unique token.

You could however see these vectors (that don't have unique token mapping) as a linear combination of token-corresponding vectors. In this way, they can be seen as a union/superposition of those tokens. 
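As a toy numeric illustration of this point (hypothetical random embeddings, not any real model's), a superposition vector is just a weighted sum of token-embedding rows, and in general it decodes to no single token:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["cat", "dog", "car"]
E = rng.normal(size=(3, 4))  # toy embedding matrix: one row per token

# A "latent" vector: a superposition (here, an even mix) of "cat" and "dog"
v = 0.5 * E[0] + 0.5 * E[1]

# Nearest-token decoding finds the closest row, but v matches no row exactly:
# it is a perfectly valid point in latent space with no unique token of its own
dists = np.linalg.norm(E - v, axis=1)
nearest = vocab[int(np.argmin(dists))]
assert dists.min() > 0  # v is strictly between tokens, not on one
```

The same arithmetic reads either way: as a linear combination of embedding rows, or as a union/superposition of the tokens those rows stand for.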

It should be rather obvious that the operations of the transformer see entities in the context window as just vectors from the embedding space. In particular, the forward pass operation doesn't really care whether the vectors being processed have unique tokens corresponding to them or not.

This means that, as far as the transformer operation is concerned, the context window can have both "token vectors" (i.e., embedding vectors that correspond to unique tokens) and "latent vectors" (i.e., embedding vectors that don't correspond to unique tokens). As mentioned above, these latent vectors can be seen as linear combinations of the token vectors.

One obvious use of this flexibility is that the intermediate tokens emitted by the transformer can well be these latent vectors; only the solution tokens (that are being passed onto the end users) need to be token vectors. Indeed, as we argue in https://t.co/f6E3c2j4dm (https://t.co/t4uYw5WTmD), as long as intermediate tokens don't seem to have any end-user semantics anyway, allowing for them to be any vector from latent space provides significantly more flexibility for learning appropriate prompt augmentations (c.f. https://t.co/jl0LyWJUys). 
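A minimal sketch of that use, under toy assumptions (the forward pass here is a stand-in function, not a real transformer): intermediate steps feed the last hidden vector straight back in as the next input, staying in latent space, and only the final answer is projected onto the vocabulary:

```python
import numpy as np

rng = np.random.default_rng(1)
d, vocab_size = 4, 10
W = rng.normal(size=(d, d)) / np.sqrt(d)  # stand-in for one forward pass
U = rng.normal(size=(vocab_size, d))      # output (unembedding) matrix
E = rng.normal(size=(vocab_size, d))      # input embedding matrix

def forward(v):
    # toy "transformer step": vector in, vector out
    return np.tanh(W @ v)

h = E[3]                      # start from a real token's embedding
for _ in range(5):            # intermediate steps stay in latent space:
    h = forward(h)            # no projection to a discrete token in between

token = int(np.argmax(U @ h))  # only the final answer is decoded to a token
```

The discrete bottleneck (argmax over the vocabulary) is applied once at the end, so the intermediate "tokens" are free to be arbitrary latent vectors.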

Another argument that has been made about the use of latent vectors in the intermediate tokens is as a way to "improve efficiency of solving the underlying problems." 

Now, I am pretty skeptical about viewing LLMs as solving problems. Our work shows, for example, that there is little connection between the length of the intermediate tokens and the underlying complexity of the problem (c.f. https://t.co/UKgCwgHKeQ), suggesting that it is more indicative of attempts to bridge the training distribution and the test instance. 

Nevertheless, if we are into looking at transformers as ways of "computing solutions" (even if that is not what is actually happening in pre-trained LLMs), then letting transformers operate on latent vectors vs. token vectors seems to correspond to doing computation on disjunctive representations of entities rather than on single entities. 

Now, operating on disjunctive representations can improve average case efficiency over specific distributions, but not the worst case complexity. As a sanity test, abstraction and hierarchy can be viewed as operating on disjunctive representations, and neither changes the worst case computational complexity of the problem; see https://t.co/aXreC5YKPN or https://t.co/UDzu2Qp7WK for arguments on planning.

This is why I am skeptical of claims that transformers with latent tokens can provably increase efficiency in all cases. For example, a recent paper https://t.co/4oQzEUIFPk argues that transformers with latent tokens can solve graph reachability in time proportional to the diameter of the graph (and throws in some citations to quantum superposition to boot!). This doesn't make sense--certainly not in the worst case--without violating conservation of complexity (or changing what it means to "solve" reachability; the paper's empirical results seem to be happy with less than 100% accuracy, for example).

When we were discussing this paper in our group meeting on Friday, I told my students about the analogy with the Graphplan planning algorithm--which speeds up STRIPS planning (which is closely connected to reachability). Many years back, we showed that Graphplan's speedups can be understood in terms of doing projection over sets of states rather than individual states. However, if you operate directly over union representations, you can get to a point where the representation might look like it is reaching the goal state, but it may not be possible to actually extract a valid path! (In the case of Graphplan, this extraction involves a decoding step that is exponential in cost, and if it fails, the projection over disjunctive states continues). This is illustrated in the figure below 👇 and the original paper at https://t.co/s20cFEOfQk (or Figure 3 and the accompanying discussion in https://t.co/YqN0fh7vp6).
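
The pitfall can be shown with a tiny toy domain (purely illustrative; this is not the Graphplan algorithm itself). Suppose one action achieves literal p and another achieves literal q, but they cannot both fire, so after one step the reachable states are {p} and {q}. The union projection pools literals across states and so appears to satisfy the conjunctive goal {p, q}, yet no single reachable state does:

```python
# Actual reachable states after one step, as sets of true literals
frontier = [frozenset({"p"}), frozenset({"q"})]

# Disjunctive (union) projection: pool literals across all frontier states
projection = set().union(*frontier)          # {"p", "q"}

goal = {"p", "q"}
assert goal <= projection                    # projection *claims* the goal is reached...
assert not any(goal <= s for s in frontier)  # ...but no actual state achieves it
```

This is exactly why a separate (and potentially expensive) extraction step is needed to certify that the union-level "solution" corresponds to a real path.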

tldr; I do believe that latent tokens can considerably increase the flexibility of prompt augmentations that LLMs can learn in post-training, but I don't quite agree with the claim that "they reduce the complexity of the problems under consideration".


AI researcher & teacher @SCAI_ASU. Former President of @RealAAAI; Chair of @AAAS Sec T. Here to tweach #AI. YouTube Ch: https://t.co/4beUPOmf6y Bsky: rao2z

Subbarao Kambhampati (కంభంపాటి సుబ్బారావు)
Mon Nov 03 03:09:19