The bitter lesson for web agents

The past year has taught us a new bitter lesson that we think others are not yet grokking: agents that look at the web like humans (using screenshots of sites) navigate and generalize better than agents that read code (HTML, DOM).
Why? Because the web is a mess. And websites are fundamentally built for human consumption. If you want to automate everything that a human can do with a browser, then you have to perceive like a human.

Stay tuned for more in the coming weeks! Wonderfully written by @0xjasper in collaboration with the Yutori team. https://t.co/fwWYvPNuU2
