Thread Easy

The all-in-one partner for Twitter threads


Explore

Newest first — browse tweet threads


THIS IS NOT FINANCIAL ADVICE

curious guy creating things @ https://t.co/HXWladhJaA - up and coming wife guy

jack friks
Sat Nov 08 15:29:02
btw these payouts are 100% taxable income... remember that to avoid getting a $1000+ tax bill end of year and being like "wait but i spent that money"

been there, done that!

THIS IS NOT FINANCIAL ADVICE

jack friks
Sat Nov 08 15:28:52
The impressive new Kimi K2 model uses a clever trick called “quantization-aware training,” or QAT.

It’s philosophically similar to dropout. In dropout, you don’t want the model to rely on other neurons co-adapting, since that makes things brittle, so you intentionally blank some of them out during training to avoid that reliance.

Here, you don’t want the model relying at inference on precision that will be lost in the final quantization after training completes, so you intentionally throw that precision away during training to avoid that reliance.

The model is thus forced to never depend on critically important information being stored in the low-order bits of the weights.

But you need that precision to keep the gradients flowing well during optimization, so they fake it: full-precision weights are kept just for gradient computation, while INT4 effects are simulated in the forward pass.
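
A minimal sketch of how that can look in code, assuming PyTorch (illustrative only, not Kimi's actual implementation): the forward pass runs through fake INT4-quantized weights, and a straight-through estimator lets gradients flow back to the full-precision master weights as if the rounding were not there. The fake_quant_int4 helper and QATLinear layer are hypothetical names for this sketch.

import torch
import torch.nn as nn

def fake_quant_int4(w: torch.Tensor) -> torch.Tensor:
    # Simulate symmetric INT4 quantization: round to integer levels in [-7, 7]
    # with a single per-tensor scale, then map back to floating point.
    qmax = 7
    scale = w.abs().max().clamp(min=1e-8) / qmax
    w_q = torch.clamp(torch.round(w / scale), -qmax, qmax) * scale
    # Straight-through estimator: the forward value is w_q, but the rounding
    # error is detached, so the backward pass treats it as identity and the
    # gradients reach the full-precision weights.
    return w + (w_q - w).detach()

class QATLinear(nn.Module):
    # Keeps full-precision "master" weights for the optimizer; the forward
    # pass only ever sees their INT4-rounded version, so the model cannot
    # learn to rely on low-order bits that quantization will destroy.
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x @ fake_quant_int4(self.weight).t() + self.bias

After training, the master weights can be rounded to INT4 once and shipped with their scales; the network has already learned to tolerate exactly that rounding.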

Former Quant Investor, now building @lumera (formerly called Pastel Network) | My Open Source Projects: https://t.co/9qbOCDlaqM

Jeffrey Emanuel
Sat Nov 08 15:24:50
Live RL Research on PufferLib w/ Joseph Suarez

I build sane open-source RL tools. MIT PhD, creator of Neural MMO and founder of PufferAI. DM for business: non-LLM sim engineering, RL R&D, infra & support.

Joseph Suarez 🐡
Sat Nov 08 15:23:30
The video-editing courses on the market all follow this same playbook 🤣 but there's no stopping it, since everyone wants to be a content creator / vlogger / video blogger, at least enough to try it once

🚧 building https://t.co/AJfZ3LMlgq https://t.co/SSdYgVYZsz https://t.co/s0m0tpQMDH https://t.co/Z3WryKZr0l 🐣learning/earning while helping others ❤️making software, storytelling videos 🔙alibaba @thoughtworks

吕立青_JimmyLv (🐣, 🐣) 2𐃏25 | building bibigpt.co
Sat Nov 08 15:15:12
First, they helped me find product-market fit.

Then, they sent me cool t-shirts 100% cotton

You guys are too kind @crisp_im @valeriansaliou 😭

*not sponsored post, just a fan

Installing a customer chat plugin is the best thing I've done for my startup

Marc Lou
Sat Nov 08 15:14:05