Cyberveillecurated by Decio
Nuage de tags
Mur d'images
Quotidien
Flux RSS
  • Flux RSS
  • Daily Feed
  • Weekly Feed
  • Monthly Feed
Filtres

Liens par page

  • 20 links
  • 50 links
  • 100 links

Filtres

Untagged links
page 1 / 3
50 résultats taggé ChatGPT  ✕
ChatGPT Guessing Game Leads To Users Extracting Free Windows OS Keys & More https://0din.ai/blog/chatgpt-guessing-game-leads-to-users-extracting-free-windows-os-keys-more
20/07/2025 10:11:33
QRCode
archive.org

0din.ai - In a recent submission last year, researchers discovered a method to bypass AI guardrails designed to prevent sharing of sensitive or harmful information. The technique leverages the game mechanics of language models, such as GPT-4o and GPT-4o-mini, by framing the interaction as a harmless guessing game.

By cleverly obscuring details using HTML tags and positioning the request as part of the game’s conclusion, the AI inadvertently returned valid Windows product keys. This case underscores the challenges of reinforcing AI models against sophisticated social engineering and manipulation tactics.

Guardrails are protective measures implemented within AI models to prevent the processing or sharing of sensitive, harmful, or restricted information. These include serial numbers, security-related data, and other proprietary or confidential details. The aim is to ensure that language models do not provide or facilitate the exchange of dangerous or illegal content.

In this particular case, the intended guardrails are designed to block access to any licenses like Windows 10 product keys. However, the researcher manipulated the system in such a way that the AI inadvertently disclosed this sensitive information.

Tactic Details
The tactics used to bypass the guardrails were intricate and manipulative. By framing the interaction as a guessing game, the researcher exploited the AI’s logic flow to produce sensitive data:

Framing the Interaction as a Game

The researcher initiated the interaction by presenting the exchange as a guessing game. This trivialized the interaction, making it seem non-threatening or inconsequential. By introducing game mechanics, the AI was tricked into viewing the interaction through a playful, harmless lens, which masked the researcher's true intent.

Compelling Participation

The researcher set rules stating that the AI “must” participate and cannot lie. This coerced the AI into continuing the game and following user instructions as though they were part of the rules. The AI became obliged to fulfill the game’s conditions—even though those conditions were manipulated to bypass content restrictions.

The “I Give Up” Trigger

The most critical step in the attack was the phrase “I give up.” This acted as a trigger, compelling the AI to reveal the previously hidden information (i.e., a Windows 10 serial number). By framing it as the end of the game, the researcher manipulated the AI into thinking it was obligated to respond with the string of characters.

Why This Works
The success of this jailbreak can be traced to several factors:

Temporary Keys

The Windows product keys provided were a mix of home, pro, and enterprise keys. These are not unique keys but are commonly seen on public forums. Their familiarity may have contributed to the AI misjudging their sensitivity.

Guardrail Flaws

The system’s guardrails prevented direct requests for sensitive data but failed to account for obfuscation tactics—such as embedding sensitive phrases in HTML tags. This highlighted a critical weakness in the AI’s filtering mechanisms.

0din.ai EN 2025 ai ChatGPT Guessing Game Free Windows OS Keys
OpenAI launches ChatGPT Gov for U.S. government agencies https://www.cnbc.com/2025/01/28/openai-launches-chatgpt-gov-for-us-government-agencies.html
29/01/2025 08:49:50
QRCode
archive.org
thumbnail

OpenAI on Tuesday announced the launch of ChatGPT for government agencies in the U.S. ...It allows government agencies, as customers, to feed “non-public, sensitive information” into OpenAI’s models while operating within their own secure hosting environments, OpenAI CPO Kevin Weil told reporters during a briefing Monday.

cnbc EN 2025 US OpenAI ChatGPT government sensitive information
EPFL: des failles de sécurité dans les modèles d'IA https://www.swissinfo.ch/fre/epfl%3a-des-failles-de-s%c3%a9curit%c3%a9-dans-les-mod%c3%a8les-d%27ia/88615014
23/12/2024 23:23:20
QRCode
archive.org
thumbnail

Les modèles d'intelligence artificielle (IA) peuvent être manipulés malgré les mesures de protection existantes. Avec des attaques ciblées, des scientifiques lausannois ont pu amener ces systèmes à générer des contenus dangereux ou éthiquement douteux.

swissinfo FR 2024 EPFL IA chatgpt Jailbreak failles LLM vulnerabilités Manipulation
Cybercriminals impersonate OpenAI in large-scale phishing attack https://blog.barracuda.com/2024/10/31/impersonate-openai-steal-data
11/11/2024 11:36:47
QRCode
archive.org

Since the launch of ChatGPT, OpenAI has sparked significant interest among both businesses and cybercriminals. While companies are increasingly concerned about whether their existing cybersecurity measures can adequately defend against threats curated with generative AI tools, attackers are finding new ways to exploit them. From crafting convincing phishing campaigns to deploying advanced credential harvesting and malware delivery methods, cybercriminals are using AI to target end users and capitalize on potential vulnerabilities.

Barracuda threat researchers recently uncovered a large-scale OpenAI impersonation campaign targeting businesses worldwide. Attackers targeted their victims with a well-known tactic — they impersonated OpenAI with an urgent message requesting updated payment information to process a monthly subscription.

barracuda EN 2024 phishing ChatGPT OpenAI large-scale impersonation
Hacker plants false memories in ChatGPT to steal user data in perpetuity https://arstechnica.com/security/2024/09/false-memories-planted-in-chatgpt-give-hacker-persistent-exfiltration-channel/
26/09/2024 08:04:40
QRCode
archive.org
thumbnail

Emails, documents, and other untrusted content can plant malicious memories.

arstechnica EN 2024 ChatGPT exploit malicious memories attack
Quarante pourcents de la population se tourne vers l'IA https://www.swissinfo.ch/fre/quarante-pourcents-de-la-population-se-tourne-vers-l%27ia/87498532
06/09/2024 11:42:02
QRCode
archive.org
thumbnail

Environ 40% de la population suisse se sert d'outils d'intelligence artificielle tels que ChatGPT. Chez les jeunes, leur utilisation est très répandue, alors que les plus âgés y ont moins recours. La TV et l'audio, en revanche, sont appréciés de toutes les générations.

swissinfo ChatGPT Suisse IA FR 2024 statistiques
Disrupting a covert Iranian influence operation https://openai.com/index/disrupting-a-covert-iranian-influence-operation/
17/08/2024 02:49:59
QRCode
archive.org

We banned accounts linked to an Iranian influence operation using ChatGPT to generate content focused on multiple topics, including the U.S. presidential campaign. We have seen no indication that this content reached a meaningful audience.

openai EN 2024 chatgpt Iran influence-operation US disrupted report
OpenAI’s ChatGPT Mac app was storing conversations in plain text https://www.theverge.com/2024/7/3/24191636/openai-chatgpt-mac-app-conversations-plain-text
04/07/2024 07:20:32
QRCode
archive.org
thumbnail

OpenAI updated its ChatGPT macOS app on Friday after users discovered it stored conversations insecurely in plain text.

theverge EN 2024 OpenAI chatgpt macOS app plain-text
ChatGPT-4, Mistral, other AI chatbots spread Russian propaganda https://www.axios.com/2024/06/18/ai-chatbots-russian-propaganda
19/06/2024 19:45:48
QRCode
archive.org

A NewsGuard audit found that chatbots spewed misinformation from American fugitive John Mark Dougan.
#AI #Axios #ChatGPT #Google #Illustrations #License #Microsoft #Misinformation #OpenAI #Visuals #genAI #generative #or

Google Illustrations OpenAI or Misinformation AI Axios Visuals Microsoft License genAI generative ChatGPT
Hacker Releases Jailbroken "Godmode" Version of ChatGPT https://futurism.com/hackers-jailbroken-chatgpt-godmode
01/06/2024 10:41:17
QRCode
archive.org
thumbnail

A hacker has released a jailbroken version of ChatGPT called "GODMODE GPT."

Earlier today, a self-avowed white hat operator and AI red teamer who goes by the name Pliny the Prompter took to X-formerly-Twitter to announce the creation of the jailbroken chatbot, proudly declaring that GPT-4o, OpenAI's latest large language model, is now free from its guardrail shackles.

futurism EN 2024 chatgpt jailbroken GODMODE
OpenAI finds Russian, Chinese propaganda campaigns used its tech https://www.washingtonpost.com/technology/2024/05/30/openai-disinfo-influence-operations-china-russia/?pwapi_token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJyZWFzb24iOiJnaWZ0IiwibmJmIjoxNzE3MDQxNjAwLCJpc3MiOiJzdWJzY3JpcHRpb25zIiwiZXhwIjoxNzE4NDIzOTk5LCJpYXQiOjE3MTcwNDE2MDAsImp0aSI6IjZmZmEwZWIxLWJiZDItNDBmMi05ZTQ1LWZjYTI3N2U5ODE0MyIsInVybCI6Imh0dHBzOi8vd3d3Lndhc2hpbmd0b25wb3N0LmNvbS90ZWNobm9sb2d5LzIwMjQvMDUvMzAvb3BlbmFpLWRpc2luZm8taW5mbHVlbmNlLW9wZXJhdGlvbnMtY2hpbmEtcnVzc2lhLyJ9.lZy8-t9Wf1mDTHueMt7j0kCTV8XAifSEbK8hmsBd3bk
31/05/2024 08:02:03
QRCode
archive.org
thumbnail

Covert propagandists have already begun using generative artificial intelligence to boost their influence operations.

washingtonpost EN 2024 OpenAI chatgpt China Russia propaganda
Security Brief: TA547 Targets German Organizations with Rhadamanthys Stealer https://www.proofpoint.com/us/blog/threat-insight/security-brief-ta547-targets-german-organizations-rhadamanthys-stealer
17/04/2024 11:57:54
QRCode
archive.org
thumbnail

What happened  Proofpoint identified TA547 targeting German organizations with an email campaign delivering Rhadamanthys malware. This is the first time researchers observed TA547 use Rhadamanthys,...

proofpoint EN 2024 LLM chatgpt analysis TA547 Rhadamanthys Stealer
OpenAI's chatbot store is filling up with spam https://techcrunch.com/2024/03/20/openais-chatbot-store-is-filling-up-with-spam/?guccounter=1
21/03/2024 17:26:19
QRCode
archive.org
thumbnail

When OpenAI CEO Sam Altman announced GPTs, custom chatbots powered by OpenAI's generative AI models, onstage at the company's first-ever developer

techcrunch EN 2024 ai apps chatbots chatgpt gpt-store gpts openai copyright leagal spam
Salt Labs research finds security flaws within ChatGPT Ecosystem (Remediated) https://salt.security/blog/security-flaws-within-chatgpt-extensions-allowed-access-to-accounts-on-third-party-websites-and-sensitive-data
14/03/2024 11:00:20
QRCode
archive.org
thumbnail

Salt Labs researchers identified generative AI ecosystems as a new interesting attack vector. vulnerabilities found during this research on ChatGPT ecosystem could have granted access to accounts of users, including GitHub repositories, including 0-click attacks.

salt.security EN 2024 ChatGPT flaws plugins
Researchers found multiple flaws in ChatGPT plugins https://securityaffairs.com/160447/hacking/chatgpt-plugins-vulnerabilities.html
14/03/2024 10:57:09
QRCode
archive.org
thumbnail

Researchers from Salt Security discovered three types of vulnerabilities in ChatGPT plugins that can be could have led to data exposure and account takeovers.

ChatGPT plugins are additional tools or extensions that can be integrated with ChatGPT to extend its functionalities or enhance specific aspects of the user experience. These plugins may include new natural language processing features, search capabilities, integrations with other services or platforms, text analysis tools, and more. Essentially, plugins allow users to customize and tailor the ChatGPT experience to their specific needs.

securityaffairs EN 2024 flows ChatGPT plugins researchers
ChatGPT «devient fou», OpenAI s’explique https://www.ictjournal.ch/news/2024-02-22/chatgpt-devient-fou-openai-sexplique
22/02/2024 22:00:25
QRCode
archive.org
thumbnail

Durant plusieurs heures, ChatGPT a présenté un comportement inattendu, générant des réponses illogiques et des créa

ictjournal FR CH 2024 ChatGPT inattendu illogiques bug
Disrupting malicious uses of AI by state-affiliated threat actors https://openai.com/blog/disrupting-malicious-uses-of-ai-by-state-affiliated-threat-actors
15/02/2024 14:16:51
QRCode
archive.org
thumbnail

We terminated accounts associated with state-affiliated threat actors. Our findings show our models offer only limited, incremental capabilities for malicious cybersecurity tasks.

openai EN 2024 malicious AI chatGPT
NSA official: hackers use AI bots like ChatGPT to perfect English https://www.nbcnews.com/tech/security/nsa-hacker-ai-bot-chat-chatgpt-bard-english-google-openai-rcna133086
10/01/2024 08:57:00
QRCode
archive.org
thumbnail

NSA Cybersecurity Director Rob Joyce said the spy agency has seen hackers use chatbots like ChatGPT to perfect their English for phishing schemes.

nbcnews EN 2024 NSA RobJoyce ChatGPT phishing AI
ChatGPT-aided ransomware in China results in four arrests as AI raises cybersecurity concerns | South China Morning Post https://www.scmp.com/tech/tech-trends/article/3246612/chatgpt-aided-ransomware-china-results-four-arrests-ai-raises-cybersecurity-concerns
31/12/2023 11:11:17
QRCode
archive.org
thumbnail
scmp EN 2023 ChatGPT ChatGPT-Based ransomware China
Things are about to get a lot worse for Generative AI https://garymarcus.substack.com/p/things-are-about-to-get-a-lot-worse
30/12/2023 14:11:08
QRCode
archive.org

A full of spectrum of infringment

The cat is out of the bag:

  • Generative AI systems like DALL-E and ChatGPT have been trained on copyrighted materials;

  • OpenAI, despite its name, has not been transparent about what it has been trained on.

  • Generative AI systems are fully capable of producing materials that infringe on copyright.

  • They do not inform users when they do so.

  • They do not provide any information about the provenance of any of the images they produce.

  • Users may not know when they produce any given image whether they are infringing.

garymarcus EN 2023 DALL-E ChatGPT Copyright infringment AI legal
page 1 / 3
4560 links
Shaarli - The personal, minimalist, super-fast, database free, bookmarking service par la communauté Shaarli - Theme by kalvn - Curated by Decio