Thread Easy
  • Explore
  • Write a thread

Your all-in-one partner for Twitter threads


Explore

Newest first — browse tweet threads


Happy Monday 🪶. We're sharing a former employee's account of how the publisher @beetruvian abuses its workers, and of how it scams readers by selling books translated with Google Translate.


Some of you may know that a few months ago certain rumors emerged about the translations put out by the publisher Beetruvian. Today I'd like to share my experience as a former employee of the publisher and confirm some of those rumors.

SEGAP🐦
Mon May 13 07:13:47
I *WAS* WRONG - $10K CLAIMED!

## The Claim

Two days ago, I confidently claimed that "GPTs will NEVER solve the A::B problem". I believed that: (1) GPTs can't truly learn new problems outside of their training set, and (2) GPTs can't perform long-term reasoning, no matter how simple it is. I argued both of these are necessary to invent new science; after all, some math problems take years to solve. If you can't beat a 15-year-old at any given intellectual task, you're not going to prove the Riemann Hypothesis. To isolate these issues and make my point, I designed the A::B problem and posted it here - full definition in the quoted tweet.
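The quoted tweet with the full definition isn't reproduced on this page. As a reader's aid, here is a minimal sketch of the A::B rewrite system as it was publicly posted alongside the challenge; treat the exact rule table as an assumption, and the names (`TOKENS`, `RULES`, `reduce_program`) as purely illustrative, not the author's code.

```python
# A minimal reference reducer for the A::B system, assuming the rules as they
# were publicly posted with the challenge (the quoted tweet isn't shown above):
#   A# #A -> (nothing)      A# #B -> #B A#
#   B# #B -> (nothing)      B# #A -> #A B#
TOKENS = ["A#", "#A", "B#", "#B"]

RULES = {
    ("A#", "#A"): [],
    ("A#", "#B"): ["#B", "A#"],
    ("B#", "#A"): ["#A", "B#"],
    ("B#", "#B"): [],
}

def reduce_program(program):
    """Rewrite neighboring token pairs until no rule applies (normal form)."""
    program = list(program)
    changed = True
    while changed:
        changed = False
        for i in range(len(program) - 1):
            pair = (program[i], program[i + 1])
            if pair in RULES:
                program[i:i + 2] = RULES[pair]
                changed = True
                break
    return program

# Small illustrative instance (not the 7-token example mentioned in the thread):
print(reduce_program(["B#", "A#", "#B", "#A", "B#"]))  # -> ['B#']
```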

## Reception, Clarification and Challenge

Shortly after posting it, some users provided a solution to a specific 7-token example I listed. I quickly pointed out that this wasn't what I meant; that this example was merely illustrative, and that answering one instance isn't the same as solving a problem (and can be easily cheated by prompt manipulation).

So, to make my statement clear, and to put my money where my mouth is, I offered a $10k prize to whoever could design a prompt that solved the A::B problem for *random* 12-token instances with a 90%+ success rate. That's still an easy task that takes an average of 6 swaps to solve; literally simpler than 3rd-grade arithmetic. Yet, I firmly believed no GPT would be able to learn and solve it on-prompt, even for these small instances.
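For concreteness, here is one way that success-rate criterion could be scored, as a sketch only, reusing `TOKENS` and `reduce_program` from the sketch above. `ask_model` is a hypothetical stand-in for whatever runs a candidate prompt against a model API; the actual evaluator's source had not been published at this point (see the Gist note at the end of the thread).

```python
# Sketch of a scoring harness for the challenge, under stated assumptions:
# draw random 12-token instances, query the candidate prompt, and compare the
# model's answer with the reference reducer defined above.
import random

def random_instance(n=12):
    """A uniformly random sequence of n A::B tokens."""
    return [random.choice(TOKENS) for _ in range(n)]

def success_rate(ask_model, trials=50):
    """Fraction of random instances the candidate prompt reduces correctly."""
    wins = 0
    for _ in range(trials):
        instance = random_instance()
        if ask_model(instance) == reduce_program(instance):
            wins += 1
    return wins / trials

# The prize required 90%+; the winning run's 47/50 works out to 94%.
# passed = success_rate(my_prompt_runner) >= 0.90
```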

## Solutions and Winner

Hours later, many solutions were submitted. Initially, all failed, barely reaching 10% success rates. I was getting fairly confident, until, later that day, @ptrschmdtnlsn and @SardonicSydney submitted a solution that humbled me. Under their prompt, Claude-3 Opus was able to generalize from a few examples to arbitrary random instances, AND stick to the rules, carrying long computations with almost zero errors. On my run, it achieved a 56% success rate.

Throughout the day, users @dontoverfit (Opus), @hubertyuan_ (GPT-4), @JeremyKritz (Opus), @parth007_96 (Opus) and @ptrschmdtnlsn (Opus) reached similar success rates, and @reissbaker made a pretty successful GPT-3.5 fine-tune. But it was only late that night that @futuristfrog posted a tweet claiming to have achieved a near-100% success rate by prompting alone. And he was right. On my first run, it scored 47/50, granting him the prize and completing the challenge.

## How it works!?

The secret to his prompt is... going to remain a secret! That's because he kindly agreed to give 25% of the prize to the most efficient solution. This prompt costs $1+ per inference, so, if you think you can improve on that, you have until next Wednesday to submit your solution in the link below, and compete for the remaining $2.5k! Thanks, Bob.

## Where do I stand?

Corrected! My initial claim was absolutely WRONG - for which I apologize. I doubted that the GPT architecture would be able to solve certain problems which it, beyond any doubt, solved. Does that prove GPTs will cure cancer? No. But it does prove me wrong!

Note there is still a small problem with this: it isn't clear whether Opus is based on the original GPT architecture or not. All GPT-4 versions failed. If Opus turns out to be a new architecture... well, this whole thing would have, ironically, just proven my whole point 😅 But, for the sake of the competition, and in all fairness, Opus WAS listed as an option, so, the prize is warranted.

## Who am I and what am I trying to sell?

Wrong! I won't turn this into an ad. But, yes, if you're new here, I AM building some stuff, and, yes, just like today, I constantly validate my claims to make sure I can deliver on my promises. But that's all I'm gonna say, so, if you're curious, you'll have to find out for yourself (:

---

That's all. Thanks to all who participated, and, again - sorry for being a wrong guy on the internet today! See you.

Gist: https://t.co/qpSlUMXOTU


(The winning prompt will be published Wednesday, as well as the source code for the evaluator itself. Its hash is on the Gist.)
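Since the evaluator's hash is pre-committed on the Gist, a reader could later check the published source against it. A tiny sketch, assuming SHA-256 (the algorithm isn't stated above) and a hypothetical file name:

```python
# Verify a published file against a pre-committed hash (SHA-256 assumed here;
# "evaluator.js" is a hypothetical file name used purely for illustration).
import hashlib

def file_digest(path):
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

# print(file_digest("evaluator.js"))  # compare with the hash on the Gist
```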

Taelin
Sun Apr 07 19:01:58
Style Lessons from George Costanza

In this thread, I will talk about some dos and don'ts of men's tailoring using one of my favorite style icons, George Costanza. 🧵


1. Don't wear suit jackets by themselves. If you do, make sure the jacket can convincingly pass as a sport coat. DeSantis is obvs wearing a suit jacket bc the fabric is smooth and shiny. George is wearing a slightly textured jacket with brass buttons (making it a blazer).

derek guy
Fri Apr 05 04:52:53
The "Ozero" Cooperative

How Putin, through corruption, made his gang members the most powerful people and billionaires in Russia as early as the 1990s.

🧵1/18


The "Ozero" cooperative is a dacha consumer cooperative on the shore of Lake Komsomolskoye. It was founded in 1996, when #Putin and seven of his friends decided to build dachas near St. Petersburg. 🧵2/18

Vic
Thu Apr 04 08:53:17
FOMC, the trickiest day of the month to trade... my advice is not to trade at all, but if you do, I recommend you follow this guide in thread format 🧵


FOMC is a different kind of news event... it plays out in two very specific time windows: 14:00, the interest rate decision; 14:30, the press conference that follows.

Jordi Marti trading
Wed Mar 20 10:19:50
In Bram Stoker’s Dracula (1897), Dr Seward’s journals are famously ‘Kept in Phonograph’. Join me for a nerdy deep dive in which I, brandishing back-of-an-envelope math, calculate the cost of Seward's phonograph habit, & encounter a creature more uncanny than Dracula himself! 1/13


Seward keeps his notes on a phonograph. Edison invented the tech in 1877, but getting to market for home use took 20 years & various legal cases & bankruptcies. The spring phonograph appeared in 1895 (2 yrs before Dracula) and prices dropped from $150 in 1891 to $20 by 1899. 2/13

Dr Laura Eastlake
Mon Feb 19 10:18:44