Alibaba Cloud Upgrades Core AI Infrastructure

  • March 1, 2025

The much-anticipated annual convention centered on artificial intelligence and cloud services, known as the Yunqi Conference, has officially kicked off. A flagship event for Alibaba Cloud to showcase its technological prowess, this year's gathering has attracted significant attention.

On September 19, 2024, despite the dark clouds and torrential rain brought by two back-to-back typhoons, the conference proceeded as scheduled. Established in 2009 as the inaugural "China Website Development Forum," the event has been transformed over the years, rebranding under its current title in 2015. Nearly a decade on, the Yunqi Conference stands as a premier venue for sharing and advancing ideas in cloud computing and artificial intelligence.

What set this year's conference apart from previous editions was the introduction of three thematic exhibition zones focusing on AI+ innovation, advances in computing, and cutting-edge applications. These zones showcased groundbreaking models, advanced computational power, and innovative applications, offering an easily digestible feast of technology for enthusiasts and professionals working in AI and cloud computing.

Among the standout presentations, He Xiaopeng, founder of XPeng Motors, unveiled what he described as the world's first fully AI-driven autonomous vehicle. He highlighted how large-model technology significantly enhances the vehicle's autonomous driving capabilities, enabling smoother steering and lane changes than human drivers. He further predicted that autonomous driving will reach the proficiency of seasoned drivers within the next 36 months.

In addition, several domestic robotics companies showcased their latest innovations and industry trends. Jinshi Robotics introduced an independent high-power distributed control system and spatial logistics robots, overcoming significant challenges in unmanned logistics yards.

Yushu Technology displayed its Unitree H1 robot, known for its adaptability, stability, and intelligence, while Zhujidian's CL-2 was presented as the first humanoid robot capable of real-time terrain perception and dynamic stair climbing.

At the heart of the conference remained Alibaba Cloud's profound insights into AI technology.

Wu Yongming, CEO of Alibaba Group and Chairman of Alibaba Cloud Intelligence, delivered a keynote address in which he discussed how generative AI would reconstruct both the digital and physical worlds, igniting a fundamental revolution in computing architecture. The traditional computing paradigm, centered on CPUs for decades, is swiftly shifting toward one dominated by GPUs, in line with AI computational frameworks. Wu envisioned a future in which nearly all hardware and software would possess inference capabilities, evolving into a computing model built primarily on GPU-powered AI, with traditional CPU computation serving a supplementary role.

As lively as the technological showcases were, a deeper transformation is under way within Alibaba Cloud, one poised to reshape the broader AI landscape in China.

The most significant discussions took place during the afternoon main forum.

Alibaba Cloud CTO Zhou Jingren summarized the strategic and technological advancements made in the AI domain over the past year, hinting at a significant pivot for the future of Alibaba Cloud.

His pledge of all-out investment in foundational AI infrastructure signals a deep transformation within Alibaba Cloud, one focused on tackling the multifaceted challenges that arise in AI applications.

Above all, the top concern among model developers and users is how to effectively overcome the challenges and risks posed by heterogeneous computing.

The myriad chips from different computing-hardware manufacturers make it difficult to pool resources for training a single model simultaneously.

For instance, it is infeasible to mix AMD or Chinese-made GPUs with NVIDIA's A100 when training a single model, which leaves many enterprises with substantial wasted computational resources. A key component of Alibaba Cloud's architectural upgrade is addressing these heterogeneous computing challenges.
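To make the obstacle concrete, here is a minimal sketch, not Alibaba Cloud's implementation, of why mixed accelerators are hard to pool: each vendor exposes a different runtime, so a single training job must detect what it is running on and choose a matching collective-communication backend before distributed training can even start.

```python
# Minimal illustration of heterogeneous-backend detection (an assumption-laden
# sketch, not Alibaba Cloud's system). Each accelerator family needs its own
# runtime and collective backend, which is what makes mixing vendors hard.
import torch

def pick_backend() -> tuple[str, str]:
    """Return (device, collective backend) for whatever accelerator is present."""
    if torch.cuda.is_available():      # NVIDIA CUDA, or AMD GPUs via ROCm builds of PyTorch
        return "cuda", "nccl"
    return "cpu", "gloo"               # CPU-only fallback

device, backend = pick_backend()
model = torch.nn.Linear(128, 10).to(device)
print(f"training would run on {device} with the {backend} backend")
```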

Alibaba Cloud's newly introduced Panjiu AI server provides 16 GPU cards with 1.5 TB of VRAM and features an AI algorithm that predicts GPU failures with 92% accuracy. Most importantly, the server can combine products from different manufacturers to carry out the training tasks of the same model. Through internal development and system alignment, the challenges posed by heterogeneous computing have been effectively resolved.

Moreover, Alibaba Cloud unveiled GPU container computing that leverages topology-aware scheduling to improve computational affinity and performance. The innovation offers a fault-tolerant solution suited to a range of model training processes, enabling businesses to reduce investment while still completing training successfully; a simplified sketch of the idea follows.
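The sketch below is a hypothetical, heavily simplified illustration of topology-aware placement, not Alibaba Cloud's scheduler: it prefers packing a job's GPUs onto one host so collective traffic stays on the fastest local links, and only spreads across hosts when no single host has enough free cards. The host names and data structure are invented for illustration.

```python
# Hypothetical topology-aware placement sketch (not Alibaba Cloud's scheduler).
def place_job(num_gpus, free_gpus):
    """free_gpus maps host name -> list of free GPU indices on that host."""
    # Best affinity: the whole job fits on a single host.
    for host, gpus in free_gpus.items():
        if len(gpus) >= num_gpus:
            return [(host, g) for g in gpus[:num_gpus]]
    # Otherwise spread across as few hosts as possible, largest hosts first.
    picks = []
    for host, gpus in sorted(free_gpus.items(), key=lambda kv: -len(kv[1])):
        take = min(num_gpus - len(picks), len(gpus))
        picks += [(host, g) for g in gpus[:take]]
        if len(picks) == num_gpus:
            return picks
    return None  # not enough free capacity anywhere

# Example: a 4-GPU job lands entirely on node-b rather than straddling hosts.
print(place_job(4, {"node-a": [0, 1], "node-b": [0, 1, 2, 3]}))
```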

Another crucial aspect of model development is the speed of data transmission, which directly affects training results. Delays in data transfer and processing can interrupt model training, extending the timeline for deploying applications.

In this respect, Alibaba Cloud launched its high-performance network architecture HPN 7.0, capable of rapidly transmitting large volumes of data and establishing stable connections to over 100,000 GPUs, enhancing end-to-end training performance by over 10%.

Additionally, model training requires data in a multitude of formats, ranging from video files to short music clips.

Fast storage and retrieval of data across these different formats is therefore a critical requirement for any foundational AI platform.

To this end, Alibaba Cloud introduced CPFS file storage, which delivers data throughput of 20 TB/s and allows seamless storage and retrieval across multiple formats, supporting the exponential expansion of AI computing capacity.

Moreover, model training frequently encounters a variety of errors. Delays in data processing or indecision in selecting training methodologies can lead to mismatches that cause training to fail. Once errors arise, projects must roll back and restart training from the beginning. This not only wastes time for model builders but also inflates their training costs.

To address these issues, Alibaba Cloud built the PAI platform, which supports model training, deployment, and research by analyzing a range of factors. The platform improves training fault tolerance, using AI to identify the sources of errors and how they can be corrected or supplemented, helping users carry training runs through to completion.
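For readers unfamiliar with why this matters, the following is a minimal sketch of the general checkpoint-and-resume technique that underlies training fault tolerance, not PAI's actual mechanism: state is persisted periodically so a failed run resumes from the last checkpoint instead of restarting from scratch. The file path and toy model are placeholders.

```python
# Generic checkpoint-and-resume sketch (illustrative only, not PAI internals).
import os
import torch

CKPT = "checkpoint.pt"   # hypothetical local path for illustration

model = torch.nn.Linear(16, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
start_step = 0

if os.path.exists(CKPT):                         # resume after a failure
    state = torch.load(CKPT)
    model.load_state_dict(state["model"])
    opt.load_state_dict(state["opt"])
    start_step = state["step"] + 1

for step in range(start_step, 1000):
    x = torch.randn(32, 16)
    loss = model(x).pow(2).mean()                # dummy objective
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 100 == 0:                          # periodic checkpoint
        torch.save({"model": model.state_dict(),
                    "opt": opt.state_dict(),
                    "step": step}, CKPT)
```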

Today, Alibaba Cloud's AI platform PAI supports elastic scheduling of training and inference workloads at a scale exceeding 10,000 units, with an effective utilization rate of over 90% for AI computational power.

Together with a series of upgrades to the capabilities of its Tongyi Qianwen models, released at lower prices, these foundational innovations make clear that Alibaba Cloud's strategic pivot is decidedly toward AI.

In other words, Alibaba Cloud is building a new AI infrastructure centered on the needs of AI development.

This repositioning could potentially transform the current state of AI and internet development within China.

During this year’s Yunqi Conference, another significant trend emerged: the growing advantages of multimodal capabilities in China.

On one hand, whether it is Alibaba Cloud’s Tongyi Qianwen, Zhipu Qinyuan, or Kimi, these leading Chinese models displayed remarkable multimodal capabilities throughout the event.

Zhou Jingren announced that Tongyi Wanxiang underwent a complete upgrade, introducing a new video generation model capable of creating high-definition videos suitable for applications in film, animation, and advertising.

Starting now, all users can experience the functionality free of charge via the Tongyi App and its official website.

The first functions introduced by Tongyi Wanxiang are text-to-video and image-to-video. With the text-to-video feature, users can input any keywords to generate a segment of high-definition video, with support for both Chinese and English prompts, along with an inspiration-expansion feature to enhance the video's expressiveness; notably, it can generate videos in various aspect ratios such as 16:9 and 9:16.

In the image-to-video segment, Tongyi Wanxiang allows users to convert any image into a dynamic video according to predefined or uploaded aspect ratios, with user prompts governing video motion.
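To summarize the parameters the article mentions for the two modes, here is a purely hypothetical request sketch; it is not the real Tongyi Wanxiang API, and the field names, values, and paths are invented for illustration.

```python
# Hypothetical illustration of the parameters described above (prompt language,
# aspect ratio, inspiration expansion, and the two generation modes).
# These field names are assumptions, not the real Tongyi Wanxiang API.
import json

text_to_video = {
    "mode": "text-to-video",
    "prompt": "a lantern festival over a river at dusk",   # Chinese or English prompts
    "aspect_ratio": "16:9",                                 # 9:16 is also mentioned
    "expand_inspiration": True,                             # the "inspiration expansion" feature
}

image_to_video = {
    "mode": "image-to-video",
    "image": "path/to/source_image.png",                    # placeholder path
    "prompt": "slow camera pan across the scene",           # prompt governs video motion
}

print(json.dumps([text_to_video, image_to_video], indent=2))
```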

In fact, all of the much-praised capabilities of OpenAI's Sora, heralded earlier this year as the leading text-to-video model, now have their equivalents in Alibaba Cloud's Tongyi Wanxiang.

Moreover, similar capabilities are now found across numerous domestic models, with comparably strong performance.

In contrast, OpenAI, despite proposing Sora first, has yet to move the model beyond research previews into a commercially available product.

In the realm of musical creation, the developments are even more compelling.

During this Yunqi Conference, Zhipu Qinyuan introduced a model with enhanced text-to-music generation capabilities. By inputting personal thoughts and musical preferences, users can quickly create related music.

Unlike the mainstream Western text-to-music models that have emerged, the core strength of Chinese large models in this domain lies in understanding user requirements effectively, delivering music that satisfies those needs with minimal prompting.

International applications such as Suno, while popular, tend to be overly intricate, coming with volumes of detailed usage guidelines.

This raises the barrier to user interaction and hinders broader adoption of text-to-music models.

This aspect significantly contributes to the exceptional success of several Chinese text-to-video and text-to-music models in international markets.

Regarding the broader framework, a notable observation from the Yunqi Conference is the contrasting development philosophies of AI in China compared to the United States.

Whether it is Alibaba Cloud's pivot toward foundational AI infrastructure or the advances in the Tongyi Qianwen and Tongyi Wanxiang models, China's systematic approach to model training has started down a route distinctly different from that of the U.S. internet giants.

At last month's ISC 2024 event, Chinese Academy of Sciences academician Zhang Bo noted that American AI research institutes are overly focused on investment in computing power, whereas Chinese enterprises, facing constrained access to computational resources, are compelled to refine knowledge frameworks and algorithms. This has spurred improvements in training effectiveness per unit of compute.

This shift has produced a better grasp of models' internal dynamics. Coupled with the breadth of data within the Chinese internet ecosystem, especially in multimodal areas, this now gives China a notable advantage over the U.S. internet landscape.

The challenges faced by OpenAI, particularly in the video and music modalities, are being addressed with remarkable efficacy by Chinese model developers.

China's trajectory for future model development seems firmly aimed at industrial integration and practical application. This contrasts with the approach embodied by OpenAI's latest o1 model, which aims to create a "digital deity" with omnipotent knowledge to resolve all manner of problems.

In comparison, Chinese strategies towards large model development are explicit and realistic.

The diverse array of robots showcased at this Yunqi Conference fits squarely within this context.
