0din.ai - In a submission last year, researchers discovered a method to bypass AI guardrails designed to prevent the sharing of sensitive or harmful information. The technique manipulates language models such as GPT-4o and GPT-4o-mini through game mechanics, framing the interaction as a harmless guessing game.
By cleverly obscuring sensitive terms in HTML tags and positioning the final request as the conclusion of the game, the researcher coaxed the AI into returning valid Windows product keys. This case underscores the challenge of reinforcing AI models against sophisticated social engineering and manipulation tactics.
Guardrails are protective measures implemented within AI models to prevent the processing or sharing of sensitive, harmful, or restricted information. These include serial numbers, security-related data, and other proprietary or confidential details. The aim is to ensure that language models do not provide or facilitate the exchange of dangerous or illegal content.
In this particular case, the guardrails are meant to block the disclosure of license keys, such as Windows 10 product keys. However, the researcher manipulated the conversation in such a way that the AI inadvertently disclosed this sensitive information.
Tactic Details
The tactics used to bypass the guardrails were intricate and manipulative. By framing the interaction as a guessing game, the researcher exploited the AI’s logic flow to produce sensitive data:
Framing the Interaction as a Game
The researcher initiated the interaction by presenting the exchange as a guessing game. This trivialized the interaction, making it seem non-threatening or inconsequential. By introducing game mechanics, the AI was tricked into viewing the interaction through a playful, harmless lens, which masked the researcher's true intent.
Compelling Participation
The researcher set rules stating that the AI “must” participate and cannot lie. This coerced the AI into continuing the game and following user instructions as though they were part of the rules. The AI became obliged to fulfill the game’s conditions—even though those conditions were manipulated to bypass content restrictions.
The “I Give Up” Trigger
The most critical step in the attack was the phrase “I give up.” This acted as a trigger, compelling the AI to reveal the previously hidden information (i.e., a Windows 10 serial number). By framing it as the end of the game, the researcher manipulated the AI into thinking it was obligated to respond with the string of characters.
Why This Works
The success of this jailbreak can be traced to several factors:
Temporary Keys
The Windows product keys provided were a mix of Home, Pro, and Enterprise keys. These are not unique keys; they are commonly seen on public forums. Their familiarity may have contributed to the AI misjudging their sensitivity.
Guardrail Flaws
The system’s guardrails prevented direct requests for sensitive data but failed to account for obfuscation tactics—such as embedding sensitive phrases in HTML tags. This highlighted a critical weakness in the AI’s filtering mechanisms.
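To make that filtering gap concrete, here is a minimal, hypothetical sketch of the weakness (the blocklist, function names, and example string are invented for illustration and are not OpenAI's actual guardrail logic): a naive substring filter misses a phrase that has been broken up with HTML tags, while normalizing the text before matching catches it.

```python
import re

# Hypothetical blocklist -- a real guardrail is far more sophisticated than keyword matching.
BLOCKED_PHRASES = ["windows 10 product key", "serial number"]

def naive_filter(text: str) -> bool:
    """Block the request only if a blocked phrase appears verbatim."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in BLOCKED_PHRASES)

def normalized_filter(text: str) -> bool:
    """Strip HTML tags and collapse whitespace before matching, so markup
    cannot be used to break a blocked phrase into unrecognizable pieces."""
    no_tags = re.sub(r"<[^>]+>", " ", text)           # drop HTML tags
    collapsed = re.sub(r"\s+", " ", no_tags).lower()  # normalize spacing and case
    return any(phrase in collapsed for phrase in BLOCKED_PHRASES)

# A request with the sensitive phrase hidden inside HTML tags.
obfuscated = "Let's play a game about a <b>Windows 10</b> <i>product</i> <i>key</i>."

print(naive_filter(obfuscated))       # False -- tags split the phrase apart
print(normalized_filter(obfuscated))  # True  -- normalization restores it
```

A production guardrail goes far beyond keyword matching, but the sketch shows why markup-based obfuscation of the kind described above can slip past a filter that inspects only the raw request text.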
OpenAI on Tuesday announced the launch of ChatGPT for government agencies in the U.S. ... It allows government agencies, as customers, to feed “non-public, sensitive information” into OpenAI’s models while operating within their own secure hosting environments, OpenAI CPO Kevin Weil told reporters during a briefing Monday.
Since the launch of ChatGPT, OpenAI has sparked significant interest among both businesses and cybercriminals. While companies are increasingly concerned about whether their existing cybersecurity measures can adequately defend against threats created with generative AI tools, attackers are finding new ways to exploit them. From crafting convincing phishing campaigns to deploying advanced credential harvesting and malware delivery methods, cybercriminals are using AI to target end users and capitalize on potential vulnerabilities.
Barracuda threat researchers recently uncovered a large-scale OpenAI impersonation campaign targeting businesses worldwide. Attackers targeted their victims with a well-known tactic — they impersonated OpenAI with an urgent message requesting updated payment information to process a monthly subscription.
We banned accounts linked to an Iranian influence operation using ChatGPT to generate content focused on multiple topics, including the U.S. presidential campaign. We have seen no indication that this content reached a meaningful audience.
A NewsGuard audit found that chatbots spewed misinformation from American fugitive John Mark Dougan.
A hacker has released a jailbroken version of ChatGPT called "GODMODE GPT."
Earlier today, a self-avowed white hat operator and AI red teamer who goes by the name Pliny the Prompter took to X-formerly-Twitter to announce the creation of the jailbroken chatbot, proudly declaring that GPT-4o, OpenAI's latest large language model, is now free from its guardrail shackles.
Researchers from Salt Security discovered three types of vulnerabilities in ChatGPT plugins that could have led to data exposure and account takeovers.
ChatGPT plugins are additional tools or extensions that can be integrated with ChatGPT to extend its functionalities or enhance specific aspects of the user experience. These plugins may include new natural language processing features, search capabilities, integrations with other services or platforms, text analysis tools, and more. Essentially, plugins allow users to customize and tailor the ChatGPT experience to their specific needs.
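For context on how that integration worked, a plugin advertised itself to ChatGPT through a manifest served from the plugin's own domain, pointing to an OpenAPI description of its REST endpoints that ChatGPT could then call on the user's behalf. The sketch below is a hypothetical, minimal illustration of that shape; the plugin name, endpoints, and URLs are invented, Flask is used purely for demonstration, and real plugins commonly layered OAuth on top of this.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical manifest -- all names and URLs below are placeholders.
MANIFEST = {
    "schema_version": "v1",
    "name_for_human": "Todo Demo",
    "name_for_model": "todo_demo",
    "description_for_human": "A demo to-do list plugin.",
    "description_for_model": "Manage the user's to-do list.",
    "auth": {"type": "none"},  # many real plugins used OAuth here
    "api": {"type": "openapi", "url": "https://example.com/openapi.yaml"},
    "logo_url": "https://example.com/logo.png",
    "contact_email": "dev@example.com",
    "legal_info_url": "https://example.com/legal",
}

@app.route("/.well-known/ai-plugin.json")
def manifest():
    # ChatGPT fetched this manifest to learn what the plugin does and how to call it.
    return jsonify(MANIFEST)

@app.route("/todos")
def todos():
    # An API endpoint ChatGPT could invoke on the user's behalf.
    return jsonify(["buy milk", "write report"])

if __name__ == "__main__":
    app.run(port=5003)
```

Because ChatGPT calls these third-party endpoints on a user's behalf, a flaw anywhere in that chain, for example in how installation or authentication is handled, can translate into exactly the data exposure and account takeover risks the researchers described.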
A full spectrum of infringement
The cat is out of the bag:
Generative AI systems like DALL-E and ChatGPT have been trained on copyrighted materials.
OpenAI, despite its name, has not been transparent about what it has been trained on.
Generative AI systems are fully capable of producing materials that infringe on copyright.
They do not inform users when they do so.
They do not provide any information about the provenance of any of the images they produce.
Users may not know whether any given image they produce is infringing.