advice to non-americans that seems obvious but might not be: do not talk business to americans today. black friday sales etc is fine just no biz dev or outreach. take the day off. relax. you can pester them next week.
Marketer, self-taught developer, and founder of @Bento and https://t.co/lcsIohchEv. Designing a quiet family life in Fukuoka, Japan. DMs open if you need email help 🌿


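Gaming GPUs can do FP8 GRPO now? Unsloth just released a new tutorial that lets you try DeepSeek-R1 FP8 GRPO fine-tuning with very little VRAM! They worked with PyTorch to make FP8 RL inference 1.4x faster. Fine-tuning also uses 60% less VRAM, and context length is 12x longer. It is available now in the unsloth framework. This could bring FP8-precision reinforcement learning to everyone, since FP8 GRPO can now run on consumer GPUs (such as the RTX 40 and 50 series). According to the test data, FP8 GRPO on Qwen3-1.7B now needs only 5GB of VRAM to run. Tutorial link: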
A coder, road bike rider, server fortune teller, electronic waste collector, co-founder of KCORES, ex-director at IllaSoft, KingsoftOffice, Juejin.


RT @paulg: I talked to a startup that's not a software company but uses AI quite a lot. They currently have 6 employees. I asked how many m…
Founder: @mixpanel Pizzatarian, programmer, music maker


RT @bigthink: Want to be a better learner? Start by noticing how you think. Anne-Laure Le Cunff @neuranne explains how metacognition — th…
hypercurious :) founder @ness_labs • neuroscientist @KingsIoPPN • author of Tiny Experiments • personal science, systematic curiosity, experimental thinking ꩜⋆✦


a nice short dive on what that may look like below (obv none of this is inconsistent with also having some implicit or explicit scalar rewards!)
Asst professor @MIT EECS & CSAIL (@nlp_mit). Author of https://t.co/VgyLxl0oa1 and https://t.co/ZZaSzaRaZ7 (@DSPyOSS). Prev: CS PhD @StanfordNLP. Research @Databricks.


RT @CloudTrader4: Built with @huggingface deepsite. Result is surprisingly good, model used: DeepSeek V3 0324 Only 2 edits required afte…
Co-founder & CEO @HuggingFace 🤗, the open and collaborative platform for AI builders
