South China Morning Post scmp.com Published: 5:00pm, 12 Aug 2025 - Chinese tech firms are leveraging software improvements to compensate for limited access to advanced hardware.
Huawei Technologies has unveiled a software tool designed to accelerate inference in large artificial intelligence models, an advancement that could help China reduce its reliance on expensive high-bandwidth memory (HBM) chips.
Unified Cache Manager (UCM) is an algorithm that allocates data according to varying latency requirements across different types of memory – including ultra-fast HBM, standard dynamic random access memory and solid-state drives – thereby enhancing inference efficiency, according to Huawei executives at the Financial AI Reasoning Application Landing and Development Forum in Shanghai on Tuesday.
Zhou Yuefeng, vice-president and head of Huawei’s data storage product line, said UCM demonstrated its effectiveness during tests, reducing inference latency by up to 90 per cent and increasing system throughput as much as 22-fold.
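Huawei has not published UCM's internals, so the following is only a minimal sketch of the general idea described above: placing cached data across memory tiers according to how quickly each item must be served. The tier latencies and capacities, the `Tier` class, the `place_blocks` function and the greedy policy are all hypothetical and chosen for illustration. They are not Huawei's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    name: str
    latency_us: float  # assumed typical access latency, illustrative only
    capacity: int      # number of cache blocks the tier can hold
    blocks: list = field(default_factory=list)

    def has_room(self) -> bool:
        return len(self.blocks) < self.capacity


def place_blocks(blocks, tiers):
    """Assign each (block_id, latency_budget_us) pair to a memory tier.

    Policy (an assumption, not UCM's): pick the slowest tier that still
    meets the block's latency budget, so scarce fast memory is kept for
    data that really needs it.
    """
    slow_first = sorted(tiers, key=lambda t: t.latency_us, reverse=True)
    placement = {}
    # Most latency-sensitive blocks choose first.
    for block_id, budget_us in sorted(blocks, key=lambda b: b[1]):
        chosen = next(
            (t for t in slow_first if t.latency_us <= budget_us and t.has_room()),
            None,
        )
        if chosen is None:
            # No tier meets the budget: fall back to the fastest tier with space.
            chosen = next((t for t in reversed(slow_first) if t.has_room()), None)
        if chosen is not None:
            chosen.blocks.append(block_id)
        placement[block_id] = chosen.name if chosen else None
    return placement


if __name__ == "__main__":
    tiers = [Tier("HBM", 1, 2), Tier("DRAM", 10, 2), Tier("SSD", 100, 100)]
    demo = [("a", 2), ("b", 2), ("c", 2), ("d", 50), ("e", 500)]
    print(place_blocks(demo, tiers))
```

In this toy run only two blocks fit in the small HBM tier. The third latency-sensitive block spills to DRAM, and data with a relaxed budget goes straight to SSD, which is the kind of trade-off that lets software stretch a limited supply of fast memory.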
The move exemplifies how Chinese tech firms are leveraging software improvements to compensate for limited access to advanced hardware. Earlier this year, Chinese start-up DeepSeek captured global attention by developing powerful AI models with constrained chip resources.
Huawei plans to open-source UCM in September, first in its online developer community and later to the broader industry. The initiative could help China lessen its dependence on foreign-made HBM chips, a market mostly controlled by South Korea’s SK Hynix and Samsung Electronics, as well as the US supplier Micron Technology.
HBM is a stacked, high-speed, low-latency memory that provides substantial data throughput to AI chips, enabling optimal performance. The global HBM market is projected to nearly double in revenue this year, reaching US$34 billion, and is expected to hit US$98 billion by 2030, largely driven by the AI boom, according to consulting firm Yole Group.
The Wiz Research team has discovered a chain of critical vulnerabilities in NVIDIA's Triton Inference Server, a popular open-source platform for running AI models at scale. When chained together, these flaws can potentially allow a remote, unauthenticated attacker to gain complete control of the server, achieving remote code execution (RCE).
This attack path originates in the server's Python backend and starts with a minor information leak that cleverly escalates into a full system compromise. This poses a critical risk to organizations using Triton for AI/ML, as a successful attack could lead to the theft of valuable AI models, exposure of sensitive data, manipulation of the AI model's responses, and a foothold for attackers to move deeper into a network.
Wiz Research responsibly disclosed these findings to NVIDIA, and a patch has been released. We would like to thank the NVIDIA security team for their excellent collaboration and swift response. NVIDIA has assigned the following identifiers to this vulnerability chain: CVE-2025-23319, CVE-2025-23320, and CVE-2025-23334. We strongly recommend all Triton Inference Server users update to the latest version. This post provides a high-level overview of these new vulnerabilities and their potential impact.
The enclosed work is the latest in a series of NVIDIA vulnerabilities we’ve disclosed, including two container escapes: CVE-2025-23266 and CVE-2024-0132.
Mitigations
Update Immediately: The primary mitigation is to upgrade both the NVIDIA Triton Inference Server and the Python backend to version 25.07 as advised in the NVIDIA security bulletin. 
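As a rough illustration of the check implied by this advice, the sketch below compares the release labels of the server and the Python backend against 25.07. It assumes both components report a two-part label in the same form as "25.07". The function names are hypothetical, and how the labels are collected from a deployment is left out.

```python
import re

FIXED_RELEASE = (25, 7)  # 25.07, the release named in the advisory text above


def parse_release(version: str):
    """Parse a two-part release label such as '25.07' into (25, 7)."""
    match = re.fullmatch(r"(\d{2})\.(\d{2})", version.strip())
    if match is None:
        raise ValueError(f"unrecognised release string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def needs_upgrade(server_release: str, backend_release: str) -> bool:
    """Return True if either component is older than the fixed release."""
    oldest = min(parse_release(server_release), parse_release(backend_release))
    return oldest < FIXED_RELEASE


if __name__ == "__main__":
    print(needs_upgrade("25.06", "25.07"))
    print(needs_upgrade("25.07", "25.07"))
```

Both components are checked because the advisory asks for the server and the Python backend to be upgraded together. A patched server paired with an older backend still counts as needing an upgrade here.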
Wiz customers can use the following to detect vulnerable instances in their cloud environment:
Wiz customers can use the Vulnerability Findings page to find all instances of these vulnerabilities in their environment, or filter results to instances related to critical issues. Alternatively, you can use the Security Graph to identify publicly exposed vulnerable VMs/serverless or containers.
Wiz Advanced customers can filter on findings identified or validated by the Dynamic Scanner, and Wiz Sensor customers can filter on runtime validated findings.
Wiz Code customers can filter on findings with one-click remediation to generate a pull request with fixes for vulnerable instances detected in code repositories.
france24.com - Chinese authorities summoned Nvidia representatives on Thursday to discuss "serious security issues" over some of its artificial intelligence chips, as the US tech giant finds itself entangled in trade tensions between Beijing and Washington.
Nvidia is a world-leading producer of AI semiconductors, but the United States effectively restricts which chips it can export to China on national security grounds.
A key issue has been Chinese access to the "H20", a less powerful version of Nvidia's AI processing units that the company developed specifically for export to China.
The California-based firm said this month it would resume H20 sales to China after Washington pledged to remove licensing curbs that had halted exports.
But the firm still faces obstacles -- US lawmakers have proposed plans to require Nvidia and other manufacturers of advanced AI chips to include built-in location tracking capabilities.
And Beijing's top internet regulator said Thursday it had summoned Nvidia representatives to discuss recently discovered "serious security issues" involving the H20.
The Cyberspace Administration of China said it had asked Nvidia to "explain the security risks of vulnerabilities and backdoors in its H20 chips sold to China and submit relevant supporting materials".
The statement posted on social media noted that, according to US experts, location tracking and remote shutdown technologies for Nvidia chips "are already matured".
The announcement marked the latest complication for Nvidia in selling its advanced products in the key Chinese market, where it is in increasingly fierce competition with homegrown technology firms.
Nvidia committed
CEO Jensen Huang said during a closely watched visit to Beijing this month that his firm remained committed to serving local customers.
Huang said he had been assured during talks with top Chinese officials during the trip that the country was "open and stable".
"They want to know that Nvidia continues to invest here, that we are still doing our best to serve the market here," he said.
Nvidia this month became the first company to hit $4 trillion in market value -- a new milestone in Wall Street's bet that AI will transform the global economy.
Jost Wubbeke of the Sinolytics consultancy told AFP the move by China to summon Nvidia was "not surprising in the sense that targeting individual US companies has become a common tool in the context of US-China tensions".
"What is surprising, however, is the timing," he noted, after the two countries agreed to further talks to extend their trade truce.
"China's action may signal a shift toward a more assertive stance," Wubbeke said.
Beijing is also aiming to reduce reliance on foreign tech by promoting Huawei's domestically developed 910C chip as an alternative to the H20, he added.
"From that perspective, the US decision to allow renewed exports of the H20 to China could be seen as counterproductive, as it might tempt Chinese hyperscalers to revert to the H20, potentially undermining momentum behind the 910C and other domestic alternatives."
New hurdles to Nvidia's operation in China come as the country's economy wavers, beset by a years-long property sector crisis and heightened trade headwinds under US President Donald Trump.
Chinese President Xi Jinping has called for the country to enhance self-reliance in certain areas deemed vital for national security -- including AI and semiconductors -- as tensions with Washington mount.
The country's firms have made great strides in recent years, with Huang praising their "super-fast" innovation during his visit to Beijing this month.
www.scmp.com - Heightened US chip export controls have prompted Chinese AI and chip companies to collaborate.
Chinese chipmaker Sophgo has adapted its compute card to power DeepSeek’s reasoning model, underscoring growing efforts by local firms to develop home-grown artificial intelligence (AI) infrastructure and reduce dependence on foreign chips amid tightening US export controls.
Sophgo’s SC11 FP300 compute card successfully passed verification, showing stable and effective performance in executing the reasoning tasks of DeepSeek’s R1 model in tests conducted by the China Telecommunication Technology Labs (CTTL), the company said in a statement on Monday.
A compute card is a compact module that integrates a processor, memory and other essential components needed for computing tasks, often used in applications like AI.
CTTL is a research laboratory under the China Academy of Information and Communications Technology, an organisation affiliated with the Ministry of Industry and Information Technology.
During the second day of Pwn2Own Berlin 2025, competitors earned $435,000 after exploiting zero-day bugs in multiple products, including Microsoft SharePoint, VMware ESXi, Oracle VirtualBox, Red Hat Enterprise Linux, and Mozilla Firefox.
The highlight was a successful attempt from Nguyen Hoang Thach of STAR Labs SG against VMware ESXi, which earned him $150,000 for an integer overflow exploit.
Dinh Ho Anh Khoa of Viettel Cyber Security was awarded $100,000 for hacking Microsoft SharePoint by leveraging an exploit chain combining an auth bypass and an insecure deserialization flaw.
Palo Alto Networks' Edouard Bochin and Tao Yan also demoed an out-of-bounds write zero-day in Mozilla Firefox, while Gerrard Tai of STAR Labs SG escalated privileges to root on Red Hat Enterprise Linux using a use-after-free bug, and Viettel Cyber Security used another out-of-bounds write for an Oracle VirtualBox guest-to-host escape.
In the AI category, Wiz Research security researchers used a use-after-free zero-day to exploit Redis, and Qrious Secure chained four security flaws to hack Nvidia's Triton Inference Server.
On the first day, competitors were awarded $260,000 after successfully exploiting zero-day vulnerabilities in Windows 11, Red Hat Linux, and Oracle VirtualBox, reaching a total of $695,000 earned over the first two days of the contest after demonstrating 20 unique 0-days.
The Pwn2Own Berlin 2025 hacking competition focuses on enterprise technologies, introduces an AI category for the first time, and takes place during the OffensiveCon conference between May 15 and May 17.
Making Software
I am a programmer by nature. I now had root access to a cool new Linux box, so now I must develop software for it.
The Goal
While looking through many of the IVI’s files, I found tons of really cool C++ header files relating to ccOS in /usr/include. ccOS is the Connected Car Operating System, an OS developed by Nvidia and Hyundai which is supposed to power all Hyundai vehicles from 2022 onwards, but I guess some of the underlying system was in previous Hyundai vehicles for quite some time.