incredibly based. Now that we have fast long contexts and an abundance of compute, it's about damn time to explore byte-level models again. Meta has disappointed me with MegaByte, but Meta generally is bad at execution. This path is not yet closed…
