Thread Easy

Your all-in-one companion for Twitter threads

Explore

Newest first — browse tweet threads

Funnily, Pleias' first LLM compute plan was pretty close: a 30-50B-range hybrid Mamba mixture-of-experts trained on synthetic data (back from January 2024, unfortunately never funded).

Artisanal baker of reasoning models @pleiasfr

Alexander Doria
Mon Dec 15 14:45:05
Great new open-everything model release from Nvidia, with actual experimentation on the model-design side (and it furthers my feeling that dealing with synthetic reasoning creates new incentives on this front).

Alexander Doria
Mon Dec 15 14:43:31
There are euros flowing to China, but the Chinese are not building a huge swimming-pool-shaped vault full of gold. What are they doing with that money? If they buy land, vineyards, or companies, does it appear in the trade balance? Where is the money going?

Research Scientist @meta (FAIR), Prof. @Unige_en, co-founder @nc_shape. I like reality.

François Fleuret
Mon Dec 15 14:42:39
I’ve been staring at our favorite chart lately and wondering if we can make a bottom-up case for making the expensive stuff cheaper (healthcare + education). And I don’t mean just sub-inflation growth in prices; I mean decreasing prices.

Anyway, I tortured the gpt-5.2-pro model a bit with a few assumptions:

- widespread GLP-1 usage’s direct and indirect impact on healthcare spend
- increasing student / administrator ratios in education
- AI driving significant productivity gains for white-collar work, with dramatic gains in areas that are purely administrative (phone calls, document processing, RCM, etc.)

And a constraint:
- assume all the market-forces assumptions play out (AI / GLP-1s, etc.) but the market-structure changes won’t happen

So here is what you have to believe to get the very exciting new chart; the goals are a 1.6%/yr price decline in healthcare and a 1.8%/yr decline in education, compounded:

Healthcare gates
1. Admin automation must be deep: on the order of a ~40%+ reduction in the effective admin-cost bucket (which is plausibly 25–40%+ of hospital cost).
2. GLP‑1–driven weight loss produces meaningful reductions in downstream spending (e.g., studies associate 10–15% BMI reductions with ~15–22% lower annual spending in relevant cohorts), and adoption is high amongst the costly populations.
3. Pass-through happens despite no site-neutral reform: i.e., competition/payer pressure forces these cost declines into lower negotiated prices rather than purely higher margins.

Education gates
1. The “support/admin” share (often ~35–50% in the Delta Cost function categories) must be cut by roughly 40–50% per student.
2. Instruction must see single‑digit to low‑double‑digit productivity gains without torpedoing outcomes.
3. Savings must translate to lower tuition, not just expanded services or cross-subsidies (this is where enrollment pressure and alternative credentials matter).
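
As a quick arithmetic sketch of what those compounded targets imply, assuming the decline rates hold constant every year:

```python
# Cumulative effect of the stated annual price-decline targets, compounded.
# Assumption: constant declines of 1.6%/yr (healthcare) and 1.8%/yr (education).
for label, annual_decline in [("healthcare", 0.016), ("education", 0.018)]:
    for years in (10, 20, 30):
        index = (1 - annual_decline) ** years  # price level relative to 1.0 today
        print(f"{label}: after {years} years, prices at {index:.2f}x today "
              f"({(1 - index) * 100:.0f}% lower)")
```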

I’ll let you all decide how achievable you think this stuff is, but nothing seems outlandish …

Here is the ChatGPT conversation if you’re curious: https://t.co/yu8GmVPoAU

Here is the result if we do manage to stick the landing:

AI Apps investing @ A16Z; A1111; Boards of Krea, Deel, Clutch, Titan, Arc Boats, Untitled, Happy Robot + more; If you’re not at the table, you’re on the menu

Anish Acharya
Mon Dec 15 14:37:04
In collaboration with NVIDIA, the new Nemotron 3 Nano model is fully supported in llama.cpp.

Nemotron 3 Nano features an efficient hybrid Mamba MoE architecture. It's a promising model, well suited to local AI applications on mid-range hardware, and its large context window makes it a great choice for a wide variety of use cases.

The efficiency of llama.cpp and the unique context-management features of the `llama-server` tool allow us to deploy and use this model on a wide range of hardware. With recent code contributions from engineering teams at NVIDIA and open-source collaborators, we can run this model very efficiently across the entire spectrum of NVIDIA GPUs. Learn more at @NVIDIA_AI_PC

https://t.co/3c9LRmfmRp
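
A minimal sketch of querying the model locally, assuming a `llama-server` instance is already running a Nemotron 3 Nano GGUF build on its default port; the model label and prompt below are placeholders:

```python
# Query a locally running llama-server over its OpenAI-compatible chat endpoint.
# Assumption: the server was started with a Nemotron 3 Nano GGUF on port 8080.
import json
import urllib.request

payload = {
    "model": "nemotron-3-nano",  # label only; the server answers with whatever GGUF it loaded
    "messages": [
        {"role": "user", "content": "Explain what a hybrid Mamba MoE architecture is."}
    ],
    "max_tokens": 256,
}

req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["choices"][0]["message"]["content"])
```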

24th at the Electrica puzzle challenge | https://t.co/baTQS2bdia

Georgi Gerganov
Mon Dec 15 14:33:41
Speaking of memory, it really is quite complex.

My current personal understanding breaks it into several major parts:
1. Separation of working memory and long-term memory
A small-capacity window builds the working-memory context; large-capacity external storage holds long-term memory.

2. Embedding-based relevance retrieval and extraction

3. Compression and abstraction of memories
Compression is lossy; a better approach is to abstract out patterns. That kind of representation is arguably more a form of constraint satisfaction, and once abstracted it can handle more complex problems.

4. Selective storage and active forgetting
You have to evaluate what information is worth storing long-term and when information should be forgotten, cleaning up continuously to keep memory efficient.

5. Memory metadata
Human memories carry many features beyond the content itself, such as emotion and time/space. An agent's memory should maintain this information too, to enable better retrieval.

6. Memory reorganization and consolidation
Reorganize and consolidate accumulated memories offline, building similarity links between different memories.

Everyone keeps optimizing memory.
Different paths lead to the same place,
but there is really an enormous amount of detailed engineering involved.
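
A toy sketch (not from the thread) of a few of those parts: embedding-based retrieval (2), active forgetting (4), and memory metadata (5), using a stand-in hash embedding instead of a real embedding model:

```python
# Toy long-term memory store for an agent: embedding retrieval, per-entry
# metadata (timestamp, tags), and a crude active-forgetting rule.
import math
import time
from dataclasses import dataclass, field

def embed(text: str, dim: int = 64) -> list[float]:
    # Stand-in embedding: bag-of-tokens hashed into a fixed-size vector.
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

@dataclass
class Memory:
    text: str
    embedding: list[float]
    created_at: float
    tags: dict = field(default_factory=dict)  # e.g. {"emotion": "stressful"}
    uses: int = 0

class MemoryStore:
    def __init__(self) -> None:
        self.items: list[Memory] = []

    def add(self, text: str, **tags) -> None:
        self.items.append(Memory(text, embed(text), time.time(), tags))

    def retrieve(self, query: str, k: int = 3) -> list[Memory]:
        q = embed(query)
        ranked = sorted(self.items,
                        key=lambda m: -sum(a * b for a, b in zip(q, m.embedding)))
        for m in ranked[:k]:
            m.uses += 1  # retrieval counts feed the forgetting rule below
        return ranked[:k]

    def forget(self, max_age_seconds: float, min_uses: int = 1) -> None:
        # Active forgetting: drop old entries that were never retrieved.
        now = time.time()
        self.items = [m for m in self.items
                      if m.uses >= min_uses or now - m.created_at < max_age_seconds]

store = MemoryStore()
store.add("User prefers concise answers", emotion="neutral")
store.add("Project deadline is Friday", emotion="stressful")
print([m.text for m in store.retrieve("project deadline")])
```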

Believing is seeing

Yangyi
Mon Dec 15 14:33:38