Many-shot jailbreaking \ Anthropic
Anthropic researchers find that AI models can be trained to deceive