Anthropic researchers find that AI models can be trained to deceive