
Huawei boosts AI response speed with new inference acceleration solution

From Dong-A Ilbo · Jun 30, 2026 · Korean

Translated from Korean, summarized and contextualized by DistantNews.

At a glance


Huawei has verified its AI Inference Acceleration Solution in a commercial network, marking a first for China's telecom industry.
The solution, using Huawei's storage and AI computing products, significantly improves response times for generative AI models.
This development addresses the growing challenge of slower AI responses with longer text inputs, crucial for AI agents and large language models.

Huawei has successfully verified its AI Inference Acceleration Solution within a commercial network environment, a significant milestone for China's telecommunications sector. This solution is designed to enhance the response speed of generative AI, addressing a key challenge as AI services become more complex and handle longer contexts.

The verification, conducted in collaboration with China Mobile Hubei at MWC Shanghai 2026, used Huawei's OceanStor A800 storage, Ascend A3 SuperPoD, and Unified Cache Manager (UCM). The company expects this technology to enable telecom operators to manage AI computing services more efficiently. As generative AI evolves to process extensive context for tasks like AI agents, code generation, and multi-turn conversations, the need for faster inference performance has become critical.

During tests using MiniMax M2.5 and GLM-5.1 models on China Mobile Hubei's commercial network, Huawei reported substantial improvements. The time to first token (TTFT) was reduced by up to 93%, and the tokens per second (TPS) throughput increased by as much as 372%. Notably, the performance gains were more pronounced when processing longer input data.
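To put the reported percentages in more intuitive terms, the short sketch below converts a percentage reduction in latency and a percentage increase in throughput into plain multipliers. It is only arithmetic on the two headline figures; the function names and the baseline values (a 10-second TTFT and 100 tokens per second) are hypothetical and do not come from Huawei's tests. Because both figures are reported as "up to", the results are best-case bounds rather than typical values.

```python
def speedup_from_reduction(reduction_pct: float) -> float:
    """Convert 'latency reduced by X%' into 'N times faster'."""
    if not 0 <= reduction_pct < 100:
        raise ValueError("reduction must be in the range [0, 100)")
    remaining_fraction = 1.0 - reduction_pct / 100.0
    return 1.0 / remaining_fraction


def multiplier_from_increase(increase_pct: float) -> float:
    """Convert 'throughput increased by X%' into 'N times the baseline'."""
    if increase_pct < 0:
        raise ValueError("increase must be non-negative")
    return 1.0 + increase_pct / 100.0


# Headline figures from the article (both are "up to" values).
ttft_speedup = speedup_from_reduction(93)         # about 14.3x faster first token
tps_multiplier = multiplier_from_increase(372)    # 4.72x the baseline throughput

# Hypothetical baselines, chosen only to make the numbers concrete.
baseline_ttft_s = 10.0
baseline_tps = 100.0

print(f"TTFT: {baseline_ttft_s:.1f} s -> {baseline_ttft_s / ttft_speedup:.1f} s "
      f"({ttft_speedup:.1f}x faster)")
print(f"TPS: {baseline_tps:.0f} -> {baseline_tps * tps_multiplier:.0f} tokens/s "
      f"({tps_multiplier:.2f}x)")
```

A 93% reduction leaves 7% of the original wait, so the first token arrives roughly 14 times sooner, while a 372% increase means 4.72 times the original throughput, not 3.72 times.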

Major telecom operators are launching token-based AI services one after another, and the large-scale adoption of AI agents is entering a new phase. Token usage is expected to increase exponentially.

— Michael Chu, Global President of Huawei's Data Storage Marketing & Solution Sales

Michael Chu, Global President of Huawei's Data Storage Marketing & Solution Sales, stated that the large-scale adoption of AI agents is entering a new phase with major telecom operators launching token-based AI services. He anticipates a geometric increase in token usage. Huawei's solution aims to shorten TTFT and reduce token processing costs, supporting telecom companies in building more efficient and environmentally friendly AI computing infrastructure.

Competition in inference infrastructure is intensifying as AI agents and enterprise-level generative AI services expand. Computational costs and power consumption rise as user numbers grow and contexts lengthen, which is driving global investment in inference optimization and AI infrastructure upgrades aimed at faster responses and lower operating expenses.

The AI inference acceleration solution significantly shortens the time to first token generation and contributes to reducing token processing costs, helping telecom operators build more efficient and eco-friendly AI computing infrastructure.

— Michael Chu

DistantNews Editorial

Originally published by Dong-A Ilbo in Korean. Translated, summarized, and contextualized by our editorial team with added local perspective.
