Thread Easy

Your all-in-one companion for Twitter threads

© 2025 Thread Easy All Rights Reserved.

Explore

Newest first — browse tweet threads

someone tell me what i'm missing here, because the titled claim seems trivially false to me:

they define an LLM as a function that maps sequence s in V^k to vector in R^d

assume hidden state in n-bit precision.  at some point, there are more inputs possible than hidden states:

|V|^k > 2^{n * d}
k > n d log(2) / log |V|

let's take GPT-2: n=16, d=768, V≈50,000

then collisions *must* happen starting at a context window size of 214 tokens

this seems actually kind of bad, right?
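The pigeonhole bound in the tweet can be checked directly. The sketch below is only an illustration, using the tweet's stated values (n=16 bits per dimension, d=768, |V|=50,000) and a hypothetical helper name, `min_collision_length`, that does not come from the tweet or from any library. It finds the smallest k with |V|^k > 2^{n * d} using exact integer arithmetic. With these inputs the inequality gives k > 787.2, so the smallest such k is 788; the 214 figure quoted above may rest on values not stated in the tweet.

```python
import math


def min_collision_length(bits_per_dim: int, dim: int, vocab_size: int) -> int:
    """Smallest sequence length k with vocab_size**k > 2**(bits_per_dim * dim).

    Hypothetical helper for illustration: by the pigeonhole principle, at this
    length there are more possible input sequences than distinct hidden states,
    so at least two sequences must share a hidden state.
    """
    num_states = 2 ** (bits_per_dim * dim)

    # Floating-point estimate from k > n * d * log(2) / log|V| ...
    estimate = bits_per_dim * dim * math.log(2) / math.log(vocab_size)
    k = max(1, math.floor(estimate))

    # ... then corrected with exact big-integer comparisons.
    while vocab_size ** k <= num_states:
        k += 1
    while k > 1 and vocab_size ** (k - 1) > num_states:
        k -= 1
    return k


if __name__ == "__main__":
    # Values as stated in the tweet for GPT-2.
    print(min_collision_length(bits_per_dim=16, dim=768, vocab_size=50_000))  # 788
```

The bound only says that some pair of length-k sequences must collide once the count of sequences exceeds the count of representable states; it says nothing about which sequences collide or how likely a collision is for typical inputs.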

phd research @cornell // language models, information theory, science of AI

Jack Morris
Thu Oct 30 15:50:00
June - Container launch
Nov 6 - Containers live in prod 

Come to our TECH Talk: https://t.co/Jx2ayskkta

Have questions, or building something cool with Cloudflare's Developer products? We're here to help. For help with your account please try @CloudflareHelp

Cloudflare Developers
Thu Oct 30 15:46:53
RT @bigeagle_xd: i am honored to have witnessed this great work over the past year.  
linear attn has great potential in expressiveness but…

We're in a race. It's not USA vs China but humans and AGIs vs ape power centralization. @deepseek_ai stan #1, 2023–Deep Time «C’est la guerre.» ®1

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
Thu Oct 30 15:46:25
Wow. Feel seen.

AI @amazon. All views personal!

GDP
Thu Oct 30 15:44:30
RT @RobbieSap90: Did you know Workers Builds are built on top of  Containers?

We're able to build the best product for developers when we…

Have questions, or building something cool with Cloudflare's Developer products? We're here to help. For help with your account please try @CloudflareHelp

Cloudflare Developers
Thu Oct 30 15:43:53
Me and my 9yr old daughter have a new hobby now.

~20 yrs in web-dev, now mostly Laravel. My Laravel courses: https://t.co/HRUAJdMRZL My Youtube channel: https://t.co/qPQAkaov2F

Povilas Korop | Laravel Courses Creator & Youtuber
Thu Oct 30 15:43:37