Thread Easy

Your all-purpose partner for Twitter threads


Explore

Newest first — browse tweet threads


Fascinating discussion with gork on the nature of “manufacturing” and how American manufacturers are 7x more productive. Basically: because EVERYTHING Americans are involved in is manufacturing. iPhones made in Shenzhen? “Upstream value”. Licensing Shanghai drugs? You guessed it!

We're in a race. It's not USA vs China but humans and AGIs vs ape power centralization. @deepseek_ai stan #1, 2023–Deep Time «C’est la guerre.» ®1

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
Sun Nov 09 08:05:56
i think this is beautiful - doing ViT from raw pixels means you need to jointly train everything - this poor model must independently solve MNIST, and THEN/ALSO learn to be a perfect calculator in its weights. then keep going.... only constrained by the data you give it.

it's why @percyliang's concept of "foundation models" in 2021 was so disruptive/sacrilegious in the Google vs OpenAI sprint to the GPT: instead of 1000 different small models all specialized in their tasks, concentrate all that budget/data/resources in one supermodel that has the capacity to model 1000 tasks; along the way you get 1) transfer learning, 2) capabilities you never explicitly trained for, 3) emergent abilities that only unlock at a given param/depth/data exposure rate.
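To make the "ViT from raw pixels" point concrete, here is a minimal sketch, assuming PyTorch and an MNIST-style 28x28 grayscale input; the TinyViT name and all sizes are illustrative, not anything from the thread. The patch embedding, the transformer blocks, and the classifier head are one set of weights trained jointly end to end, with no hand-engineered vision features in front of them.

```python
# Toy end-to-end ViT on raw pixels (PyTorch assumed; sizes are illustrative).
import torch
import torch.nn as nn

class TinyViT(nn.Module):
    def __init__(self, image_size=28, patch_size=7, dim=64, depth=2, heads=4, num_classes=10):
        super().__init__()
        num_patches = (image_size // patch_size) ** 2
        self.patch_size = patch_size
        # Learned linear projection of flattened raw pixel patches: no CNN stem,
        # no hand-crafted features, everything starts from pixel values.
        self.patch_embed = nn.Linear(patch_size * patch_size, dim)
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, num_classes)  # e.g. 10 MNIST digit classes

    def forward(self, x):                        # x: (B, 1, 28, 28) grayscale images
        p = self.patch_size
        # Cut the image into non-overlapping p x p patches and flatten each one.
        patches = x.unfold(2, p, p).unfold(3, p, p).contiguous().view(x.size(0), -1, p * p)
        tokens = self.patch_embed(patches) + self.pos_embed
        return self.head(self.encoder(tokens).mean(dim=1))

model = TinyViT()
logits = model(torch.randn(8, 1, 28, 28))  # (8, 10); every parameter trains jointly
```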

achieve ambition with intentionality, intensity, & integrity - @dxtipshq - @sveltesociety - @aidotengineer - @latentspacepod - @cognition + @smol_ai

swyx
Sun Nov 09 08:05:14
An open-source vibe coding platform that helps you build your own vibe-coding platform, built entirely on Cloudflare stack

https://t.co/Hg7K1z0bcC

GitHub Projects Community
Sun Nov 09 08:00:06
I've noticed that gray/black-market operations occasionally recruit scraper developers on Boss and V2EX, and the JDs are written to be very targeted.

For example: scraping Pinduoduo, WeChat official accounts, Taobao products and reviews, or Douyin videos...

These black-market gigs all share the same traits: the pay is not high (around 10,000 RMB a month), benefits are never clearly spelled out, and the requirements are sky-high.

If a naive programmer actually buys into this, taking on drug-dealer risk for cabbage-seller pay, they may also end up with the employer running off without paying.

The gray/black-market operators posting these ads really think the target companies' security teams are useless; when things blow up, the blame will land squarely on the programmer. This kind of job is truly "prison-oriented programming" (a riff on object-oriented programming) -_-.

Moved from investing into startups: job hunting, interview questions, resume polishing, mock interviews. Startup (cold start) | AI, AIGC | security tech | RAG | spatiotemporal intelligence | cognitive psychology | agents | life sciences | reinforcement learning. I built open source software at https://t.co/b69DXZhcyR

Y11
Sun Nov 09 07:58:24
The new paid plan is now available on https://t.co/nCmwcvlJIY!

Now, you can access professional product review blogs on the DR 74 site through this plan.

If you previously paid for the Turbo0 Pro plan, you can DM me to get a $69 discount code.

Justin3go
Sun Nov 09 07:54:28
There is more to KIMI-2-Thinking's QAT than meets the eye: it is also about supporting more AI chips, including Chinese ones.

The brilliant short blog below on why Kimi (Moonshot AI) chose QAT (quantization-aware training) is a must-read. Here is my read.

TL;DR: It is not just that QAT reduces inference latency and memory requirements in memory-bound scenarios (which an MoE at Kimi-2's sparsity scale is) and speeds up RL training by 10-20%; the INT4 format also enables alternate hardware ecosystems such as Cambricon and Huawei's Ascend.

Quotes from the blog:
=================
1) Why INT4, not MXFP4?
Kimi chose INT4 over "fancier" MXFP4/NVFP4 to better support non-Blackwell GPUs, with strong existing kernel support (e.g., Marlin). 
2) Kimi-2-Thinking weights are 4-bit and activations are 16-bit (denoted W4A16).
They state further that W4A8 and even W4A4 are on the horizon. As new chips roll out with FP4-native operators, Kimi's quantization path will continue evolving.

INT4 support in China-made chips:
=================
Cambricon GPUs explicitly support INT4 quantization, including for AI inference workloads, as seen across several models such as the MLU270, MLU370-X8, and newer chips, as well as in recent open-source releases with INT4 integration for large models like GLM-4.6. 

Huawei Ascend NPUs also support INT4 quantization for inference, as confirmed by documentation and kernel releases related to GEMM and quantized model deployments.
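To make the W4A16 notation concrete, here is a rough NumPy sketch, not Kimi's or Marlin's actual kernels: weights are rounded to signed INT4 values (range -8 to 7) with a per-output-channel scale, activations stay in 16-bit floats, and the weights are dequantized on the fly inside the matmul. In QAT the same quantize/dequantize step runs during the training forward pass (with a straight-through estimator for gradients), so the model learns weights that survive the INT4 rounding; the post-hoc version below only illustrates the storage format and the error it introduces.

```python
# Illustrative NumPy sketch of W4A16: 4-bit weights, 16-bit activations.
# Not a real kernel; function names and shapes are made up for the example.
import numpy as np

def quantize_int4(w):
    """Symmetric per-row quantization: w ~ q * scale, with q in [-8, 7]."""
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0       # per-output-channel scale
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)  # 4-bit range stored in int8
    return q, scale

def w4a16_matmul(x_fp16, q, scale):
    """Dequantize the INT4 weights on the fly and multiply with fp16 activations."""
    w_hat = q.astype(np.float16) * scale.astype(np.float16)
    return x_fp16 @ w_hat.T

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 128)).astype(np.float32)      # (out_features, in_features)
x = rng.standard_normal((4, 128)).astype(np.float16)        # a small activation batch
q, scale = quantize_int4(w)
y = w4a16_matmul(x, q, scale)
err = np.abs(y - x @ w.T.astype(np.float16)).mean()         # rounding error stays small
print(y.shape, float(err))
```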

AI @amazon. All views personal!

GDP
Sun Nov 09 07:50:46