Thread Easy

Your one-stop companion for Twitter threads

© 2025 Thread Easy. All Rights Reserved.

Explore

Newest first — browse tweet threads


gonna miss this model when they ruthlessly murder it in like 8 months b/c it is really something and im amazed it was even shipped

image source: https://t.co/KhmwNdvxop

i make things and do research https://t.co/jZh799yfRw / https://t.co/IdaJwZJ57O

near
Thu Dec 18 22:28:17
Smart
I worked two jobs in my final year. One for a software company and the second one for Microsoft. Microsoft compensated us with vouchers as they were not allowed to pay directly at that time.

Cost me a first class, but that Microsoft opened doors ooo. Choose your hard.

Also, the first job ended up leading to me writing three books and being the fella y'all wanna take pics with?

Founder | Author | Speaker Building @beltstripe. HealthTech/EdTech/Agric I'm Not The Man Of Your Dreams. Your Imagination Wasn't This Great.

Sani Yusuf
Thu Dec 18 22:25:20
I've translated the whole thing, you may find it curious in the current context.

We're in a race. It's not USA vs China but humans and AGIs vs ape power centralization. @deepseek_ai stan #1, 2023–Deep Time «C’est la guerre.» ®1

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
Thu Dec 18 22:23:11
RT @coffeebreak_YT: Galaxy gas is often dismissed as a “drug”—just as prediction markets were once dismissed as “gambling.”

Like any new i…

i make things and do research https://t.co/jZh799yfRw / https://t.co/IdaJwZJ57O

avatar for near
near
Thu Dec 18 22:18:16
In 2008, Krylov characterized American Hegemony as the natural extension of the American project of a nation becoming God, whose *only* moral concern is reconciling and satisfying the preferences of individual Americans.
I think they don't get how bleak it looked from the outside.

I've translated the whole thing, you may find it curious in the current context.

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
Thu Dec 18 22:15:38
So, I tried yesterday's prompts on Opus 4.5 & Codex 5.2.

Below are my conclusions (including receipts):

1. My prompts from yesterday were ill-defined. I was impatient, lazy, mean to the model, and basically expected Opus to read my mind. I don't have any evidence the model degraded in performance.

2. After patiently cleaning the prompt, both models succeeded at this (monster) task. They nailed the initial tests, took the same time (~30 mins / ~150k tokens), and somehow asked nearly identical follow-up questions. (!)

3. GPT 5.2 produced better code where it mattered the most. Opus 4.5 made mistakes on de Bruijn index calculations, which is a serious logic error that it had to fix later. It also duplicated a massive function for no reason. GPT 5.2 got these right, and was more careful about edge cases that went over Opus's head.

I'll share the logs in the comments, including:
- the initial prompt
- the full chat
- the final results

It may be helpful to study how I constructed this prompt, because that's a hell of a task that was (finally) implemented successfully by the AI. I had to be super precise about certain details that confused Opus yesterday, and I'll now move these things to documentation. The lesson is: AIs are a great tool, but they are still limited by *you*. If your instructions are poor, they WILL fail.

Finally, I must be honest here: if I coded this manually, this would've taken a few hours, not two days. AI here was a net loss this time.

Also: you all put too much weight on my words, and I feel like my posts caused unnecessary trouble. Please, don't do that.
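For context on the de Bruijn index mistake mentioned in the tweet: de Bruijn indices replace variable names with the number of binders between a variable occurrence and its binding lambda, and the shift/substitute bookkeeping is a classic source of off-by-one logic errors. The sketch below is purely illustrative and not from Taelin's codebase; the term representation and function names are my own.

```python
# Minimal untyped lambda calculus with de Bruijn indices.
# Var(0) refers to the nearest enclosing binder, Var(1) to the next, etc.
from dataclasses import dataclass

@dataclass
class Var:
    index: int          # de Bruijn index

@dataclass
class Lam:
    body: object        # the binder itself is implicit

@dataclass
class App:
    fn: object
    arg: object

def shift(term, by, cutoff=0):
    """Add `by` to every free variable (index >= cutoff) in `term`."""
    if isinstance(term, Var):
        return Var(term.index + by) if term.index >= cutoff else term
    if isinstance(term, Lam):
        # Crossing a binder raises the cutoff: indices below it are bound.
        return Lam(shift(term.body, by, cutoff + 1))
    return App(shift(term.fn, by, cutoff), shift(term.arg, by, cutoff))

def subst(term, value, depth=0):
    """Substitute `value` for index `depth` in `term`, lowering freer vars."""
    if isinstance(term, Var):
        if term.index == depth:
            return shift(value, depth)      # adjust for binders crossed
        # A binder is being removed, so free variables above it drop by one.
        return Var(term.index - 1) if term.index > depth else term
    if isinstance(term, Lam):
        return Lam(subst(term.body, value, depth + 1))
    return App(subst(term.fn, value, depth), subst(term.arg, value, depth))

def beta(term):
    """One beta-reduction step at the root: (λ. body) arg -> body[0 := arg]."""
    if isinstance(term, App) and isinstance(term.fn, Lam):
        return subst(term.fn.body, term.arg)
    return term
```

The cutoff in `shift` and the three-way index comparison in `subst` are exactly the kind of arithmetic that is easy to get subtly wrong, which is why a mistake there is a serious logic error rather than a cosmetic one.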

Chat logs and results: https://t.co/VvtOkovKTY

Taelin
Thu Dec 18 22:12:44