OfficeQA is neat because we believe any new grad can do the tasks reliably, but it highlights the challenges enterprises have with AI. Elaborate agents with our latest document AI tools do a bit better, but there is still plenty of headroom. We hope researchers find this useful!


CTO at @Databricks and CS prof at @UCBerkeley. Working on data+AI, including @ApacheSpark, @DeltaLakeOSS, @MLflow, @DSPyOSS. https://t.co/nmRYAKFsWr


OfficeQA stands in contrast to "superintelligence" benchmarks that test esoteric or abstract knowledge but do not necessarily translate into better performance on real work. One way to view it is "can ASI make it through one day at the office?"


