This research was led by @_igorshilov as part of the Anthropic Fellows Program.


SGTM splits the model’s weights into “retain” and “forget” subsets, and guides specific knowledge into the “forget” subset during pretraining. That subset can then be removed before deployment in high-risk settings. Read more: https://t.co/BfR4Kd86b0
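The linked write-up has the actual method; as a rough illustration of the mechanism described above, here is a minimal sketch of routing gradients into a designated “forget” weight subset and then removing that subset before deployment. This is not Anthropic's implementation: the toy model, the choice of which units form the forget subset, and the masking scheme (names like SplitMLP, route_gradients, remove_forget_subset) are all assumptions made for illustration.

```python
# Minimal sketch, assuming the "forget" subset is a fixed slice of hidden
# units and that forget-domain batches are flagged during pretraining.
import torch
import torch.nn as nn

class SplitMLP(nn.Module):
    """Toy MLP whose hidden units are partitioned into forget/retain groups."""
    def __init__(self, d_in=16, d_hidden=32, d_out=16, n_forget=8):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hidden)
        self.fc2 = nn.Linear(d_hidden, d_out)
        # Hidden units [0, n_forget) are the "forget" subset (an assumption
        # for illustration; the real partition could be chosen differently).
        self.n_forget = n_forget

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

def route_gradients(model, batch_is_forget):
    """After loss.backward(): mask gradients so forget-domain batches only
    update the forget units, and other batches only update retain units."""
    f = model.n_forget
    with torch.no_grad():
        if batch_is_forget:
            model.fc1.weight.grad[f:, :] = 0   # keep only forget-row updates
            model.fc1.bias.grad[f:] = 0
            model.fc2.weight.grad[:, f:] = 0   # keep only forget-column updates
        else:
            model.fc1.weight.grad[:f, :] = 0   # block updates to forget rows
            model.fc1.bias.grad[:f] = 0
            model.fc2.weight.grad[:, :f] = 0   # block updates to forget columns

def remove_forget_subset(model):
    """'Removal' before high-risk deployment: zero out the forget units."""
    f = model.n_forget
    with torch.no_grad():
        model.fc1.weight[:f, :] = 0
        model.fc1.bias[:f] = 0
        model.fc2.weight[:, :f] = 0
```

Usage (also an assumption about the training loop): call route_gradients(model, is_forget_batch) between loss.backward() and optimizer.step() during pretraining, then call remove_forget_subset(model) on the checkpoint destined for high-risk deployment.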

