Thread Easy

Your all-in-one Twitter thread assistant


Explore

Newest first; browse threads as cards.

RT @SchmidhuberAI: @kimmonismus No. Wei Zhang et al. applied "modern" backprop-trained 2-dimensional CNNs to character recognition in 1988.…

Invented principles of meta-learning (1987), GANs (1990), Transformers (1991), very deep learning (1991), etc. Our AI is used many billions of times every day.

Jürgen Schmidhuber
Fri Nov 28 20:05:34
We might have to rewrite some characters and a scene or two. It’s clearly going to be much bloodier than I thought.

I’m certain assassination attempts on AI engineers & founders are coming.

Founder @oddtalesgames Directing The Last Night @TLN_Game Art Direction, Cinematography, Tech Art. Atoms, Bits, Memes, Genes. Freedom, Futurism, Humanism.

Tim Soret
Fri Nov 28 20:04:55
Ilya posted a follow-up tweet clarifying and expanding on his interview:
> There is a point from the interview I didn't make clear, so let me add to it:
> Continuing to scale the current approach (more compute, more data, more training environments) will certainly keep bringing improvements. It won't stall; things will keep getting better.
> But something important will always be missing.

This corrects a likely misreading. In the interview he said a lot along the lines of "back to the age of research" and "the current approach will hit a wall", which made it easy to think he was writing off scaling laws, i.e. that continuing to pile on compute, data, and RL training would stop working.

He says that is not what he meant: the current path will keep delivering improvements and won't stall. Models will keep getting stronger, benchmark scores will keep climbing, products will keep iterating, and companies will keep making money.

Note the "but" that follows:

Some things you will never get no matter how much you scale.

It's like training as a sprinter. Keep training and your times will keep improving, from 12 seconds to 11.5, to 11, maybe even 10.9. That is real progress. But if your goal is to learn to fly, it doesn't matter how fast you run; that takes a completely different capability.

So what is missing?

Read together with the interview, this "important missing piece" most likely refers to:

1. Genuine generalization
Not the ability to do many tasks after training on massive data, but the ability to learn new things quickly from very little experience, with what is learned staying reliable in new situations.

2. Efficient learning
A human learns to drive in about 10 hours, and can hold a programming job after a few months of study. That kind of efficiency is not something you get from pretraining on massive data.

The "two students" analogy from the interview captures this well. The student who grinds through 10,000 hours of practice problems really can keep improving his contest results, from top 10% to top 1% to champion; that is real progress. But he will never become the student who shows genuine "insight" after only 100 hours of practice.

Prompt Engineer, dedicated to learning and disseminating knowledge about AI, software engineering, and engineering management.

宝玉
Fri Nov 28 20:03:52
Conventional deep RL (and DL in general) is about learning by practice: tiny but steady local improvements, which build up excellent reflexes.

This often gets stuck learning at the wrong level of abstraction. We learn by reflection and directed experimentation, not just practice.
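
To make "tiny but steady local improvements" concrete, here is a minimal sketch of practice-style learning: a REINFORCE-style policy-gradient update on a hypothetical two-armed bandit. The bandit, payoff probabilities, and step size are illustrative assumptions, not anything from the tweet. Each trial nudges the policy a small step toward whichever action happened to pay off.

```python
import numpy as np

rng = np.random.default_rng(0)
logits = np.zeros(2)   # policy parameters for a 2-armed bandit (illustrative)
lr = 0.05              # tiny step size: each update is a small local nudge

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for step in range(2000):
    probs = softmax(logits)
    action = rng.choice(2, p=probs)                     # act ("practice")
    reward = float(rng.random() < (0.3, 0.7)[action])   # arm 1 pays off more
    # REINFORCE: grad of log pi(action) w.r.t. logits = one_hot(action) - probs
    logits += lr * reward * (np.eye(2)[action] - probs)

print(softmax(logits))   # drifts toward preferring the better arm
```

Nothing in this loop steps outside itself: reflection or directed experimentation in the tweet's sense would mean choosing what to try next based on a model of why rewards occur, which the update above never does.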

Asst professor @MIT EECS & CSAIL (@nlp_mit). Author of https://t.co/VgyLxl0oa1 and https://t.co/ZZaSzaRaZ7 (@DSPyOSS). Prev: CS PhD @StanfordNLP. Research @Databricks.

Omar Khattab
Fri Nov 28 20:01:35
Want to get a weekly curated list of top GitHub repos and similar posts like this?  
Join our newsletter and get them straight to your inbox 👇

https://t.co/fIQKe7W5O3

We're sharing/showcasing best of @github projects/repos. Follow to stay in loop. Promoting Open-Source Contributions. UNOFFICIAL, but followed by github

GitHub Projects Community
Fri Nov 28 20:01:24
Is it just me, or does this do the opposite of what it's intended to?

Who is telling ppl to put this on their profiles?!

📈 Leading Growth @GroqInc 📌 Prev @a16z @HubSpot @TheHustle 💻 Chronically online: https://t.co/AkbwhoTr0K 📘 Wrote https://t.co/w1DBDrOZdI 🎙 Podcast at @sydlis 👇

Steph Smith
Fri Nov 28 20:00:24