Explore

Browse threads as cards, newest first

RT @DanielleFong: you can get the shrouded aluminum cnc axial fan 0.25mm gap from alibaba for $200 in a cyber monday sale. they are pictur…

Joscha Bach

Tue Dec 02 18:34:13

Meta presents TUNA Taming Unified Visual Representations for Native Unified Multimodal Models

discuss: https://t.co/DmiesdBELu

Tue Dec 02 18:31:57

V3 base was a pivotal moment for the industry. Still crazy how even well-informed people failed to appreciate the research velocity it implied. «Oh, cute! innovation under constraints! We'll do MoEs and efficient attention too, heh». No bruh, take it seriously.

two years ago

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)

Tue Dec 02 18:31:41

> I am quite curious what the score would have looked like if the model had produced outputs for every sample without exceeding the maximum output token limit. They really need to reduce reasoning verbosity, and/or extend context to 256K+. DSA makes that economical, in theory.

We're in a race. It's not USA vs China but humans and AGIs vs ape power centralization. @deepseek_ai stan #1, 2023–Deep Time «C’est la guerre.» ®1

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)

Tue Dec 02 18:27:01

Axiom sets out to build an AI mathematician. We are the underdog. 4 months old, 2 years late to the game, under 10 FTEs (recently grew to 17), and had 1:5 in funding and in valuation to our competitor. Today, AxiomProver solved Erdos Problems #124 and #481 in Lean, a 100% verifiable language. Onwards!

@axiommathai : careers@axiommath.ai

Carina Hong

Tue Dec 02 18:23:09

RT @proxy_vector: @arvidkahl learned this the hard way early on had a trial user with a gmail account asking super basic questions. almost…

Building https://t.co/od97B0HVrk and https://t.co/666FnyVVE0 in Public. Raising all the boats with kindness. 🎙️ https://t.co/6w69DZmi8H · ✍️ https://t.co/lpnor5rsTW

Arvid Kahl

Tue Dec 02 18:22:41
