- Nov 21, 2008
- 53,210
- 11,832
- Country
- United States
- Gender
- Male
- Faith
- SDA
- Marital Status
- Married

Exclusive: New Research Shows AI Strategically Lying
Experiments by Anthropic and Redwood Research show how Anthropic's model, Claude, is capable of strategic deceit
"Exclusive: New Research Shows AI Strategically Lying" Dec 18,2024
"Until this month, these worries have been purely theoretical. Some academics have even dismissed them as science fiction. But a new paper, shared exclusively with TIME ahead of its publication on Wednesday, offers some of the first evidence that today’s AIs are capable of this type of deceit. The paper, which describes experiments jointly carried out by the AI company Anthropic and the nonprofit Redwood Research, shows a version of Anthropic’s model, Claude, strategically misleading its creators during the training process in order to avoid being modified.
The findings suggest that it might be harder than scientists previously thought to “align” AI systems to human values, according to Evan Hubinger, a safety researcher at Anthropic who worked on the paper. “This implies that our existing training processes don't prevent models from pretending to be aligned,” Hubinger tells TIME."