Explore | Thread Easy - Unroll Twitter Threads | Read, Summarize, and Create

We’re releasing Bloom, an open-source tool for generating behavioral misalignment evals for frontier AI models. Bloom lets researchers specify a behavior and then quantify its frequency and severity across automatically generated scenarios. Learn more: https://t.co/TwKstpLSy3

We're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant @claudeai on https://t.co/FhDI3KQh0n.

Anthropic

Sat Dec 20 17:04:36

As part of our partnership with @ENERGY on the Genesis Mission, we're providing Claude to the DOE ecosystem, along with a dedicated engineering team. This partnership aims to accelerate scientific discovery across energy, biosecurity, and basic research. https://t.co/cCywLCjK2w

Anthropic

Thu Dec 18 22:41:09

People use AI for a wide variety of reasons, including emotional support. Below, we share the efforts we’ve taken to ensure that Claude handles these conversations both empathetically and honestly. https://t.co/P2BmTDEDge

Anthropic

Thu Dec 18 20:31:54

Designing ways to account for the quirks of AI models’ behavior is becoming ever-more important: as the models’ capabilities on real-world tasks get better, there’ll be a lot of value in setting them up for success.

For much more about phase two of Project Vend, read our blog post: https://t.co/PvGerLmmQd

Anthropic

Thu Dec 18 16:11:34

You might remember Project Vend: an experiment where we (and our partners at @andonlabs) had Claude run a shop in our San Francisco office. After a rough start, the business is doing better. Mostly.

Where we left off, shopkeeper Claude (named “Claudius”) was losing money, having weird hallucinations, and giving away heavy discounts with minimal persuasion. Here’s what happened in phase two: https://t.co/PvGerLlP0F

Anthropic

Thu Dec 18 16:11:24

RT @HomelandGOP: "Sophisticated actors will attempt to use AI models to enable cyberattacks at an unprecedented scale." @AnthropicAI’s Dr.…

Anthropic

Thu Dec 18 00:10:54
