My vibe-coding story: Claude Code vs Codex CLI. For upcoming demo, I asked Claude Code (Sonnet 4.5) to generate 4 simple CRUDs, in one prompt, because I thought they were "easy and standard". 1. It worked for 10 minutes and generated code *without automated tests*, although I specifically require tests in CLAUDE(dot)md guidelines file. 2. I asked it to generate the tests, it did in 3 minutes, but when running `php artisan test`, encountered many issues and started running in circles fixing them. 3. After 10 more minutes, it returned "nah we have failures but still all cool, here's the result". 4. I asked Codex (GPT-5.1-Codex) to fix tests. It worked for 15 minutes, slowly, but delivered. Tests all green. Takeaways: 1. Don't give too big scope of work to Sonnet. Or, in fact, any LLM. 2. Codex with GPT-5.1-Codex is slower but much more thorough than Sonnet. Probably Opus 4.5 would have done better but would be more expensive. Will try next time when I have a similar task.
Loading thread detail
Fetching the original tweets from X for a clean reading view.
Hang tight—this usually only takes a few seconds.
