Thread Easy

Your complete partner for Twitter threads

© 2025 Thread Easy All Rights Reserved.

Explore

Newest first — browse tweet threads

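Cerebras strikes again! It has released pruned versions of DeepSeek-V3.2.

There are two versions, 508B and 345B, with size reduced by 25% and 50% respectively. They suit local deployments where machine resources are tight.

They again use the REAP pruning method, which compresses MoE models by intelligently selecting and removing redundant experts. To be honest, though, a few points deserve discussion. First, Cerebras has not released more test results, only HumanEval and MBPP, and DeepSeek did not report results on these two benchmarks when it officially released v3.2 (it is also possible I just missed them).

Also, the 345B scores higher than the 508B on these two tests? So I suggest that anyone planning large-scale use first test the pruned model's actual performance themselves.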

A coder, road bike rider, server fortune teller, electronic waste collector, co-founder of KCORES, ex-director at IllaSoft, KingsoftOffice, Juejin.

karminski-牙医
Wed Dec 10 00:27:29
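LMArena has announced its Top 10 AI research labs of 2025. Google beats the field; among Chinese labs, Alibaba ranks first, followed by Moonshot AI, Zhipu, DeepSeek, and Baidu.

In terms of ecosystem, Alibaba really is unbeatable; the others have models that perform well in only one or two areas. OpenAI, meanwhile, ranks fourth by the data.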

karminski-牙医
Wed Dec 10 00:25:23
RT @marclou: Instead of adding more features, do this:

1. Launch everywhere (Hacker News, Product Hunt, Reddit, X, LinkedIn, YouTube etc.)…

💻 https://t.co/Y30jsaHwz9 $30K/m ⚡️ https://t.co/vatLDmi9UG $21K/m 📈 https://t.co/3EDxln5mdi $17K/m ⭐️ https://t.co/MZc8tG9xWi $17K/m 🍜 https://t.co/r07EpGSYJ2 $1K/m 🧬 https://t.co/SfrVXVtmdA $0/m 🧾 https://t.co/7olaOzV8Xd $0/m +20 https://t.co/4zCWHGJp1S

Marc Lou
Wed Dec 10 00:20:46
true. Understated, if anything. Chinese models are basically made of holes. In the inimitable Han fashion, they saved face and rebranded it as «fine-grained sparsity».

We're in a race. It's not USA vs China but humans and AGIs vs ape power centralization. @deepseek_ai stan #1, 2023–Deep Time «C’est la guerre.» ®1

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
Wed Dec 10 00:20:13
I understand this well because this «civilizational» thinking is very natural for Russians, and WWII was a similar experience in Russia. WWI was just another, bigger, bloody horror. Had we lost WWII, we would have ceased to exist as a people.
People have collective identities.

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
Wed Dec 10 00:14:28
Pretty astounding achievement to catch up on olympiad math competitiveness with frontier models like Poetiq from our partnership with @hillclimbai!

Cc @JeffDean @AnjneyMidha 😇

Cofounder and Head of Post Training @NousResearch, prev @StabilityAI Github: https://t.co/LZwHTUFwPq HuggingFace: https://t.co/sN2FFU8PVE

Teknium (e/λ)
Wed Dec 10 00:12:51