A new attack technique named Policy Puppetry can bypass the protections of major gen-AI models and coax them into producing harmful outputs.
A study co-authored by researchers at Anthropic finds that AI models can be trained to deceive, and that this deceptive behavior is difficult to combat.