
Intelligent Computing Technical Insights | Why Do We Need More Open and Decoupled Intelligent Computing Center Networks? (Part 1)
Traditional Data Centers vs. Intelligent Computing Centers
Before the rise of AI, users generally built data centers with a simpler, more direct architecture that emphasized connectivity. Over time, virtualization technology emerged and paved the way for cloud computing. Throughout this evolution, however, traditional data centers have always relied on CPUs for serial computing, ultimately delivering deterministic computational results to users.
With the rapid development of AI technology, GPUs have become an irreplaceable core component in intelligent computing scenarios. The key difference between CPUs and GPUs lies in how they process data and draw conclusions: CPUs derive precise results from predefined rules and deterministic data, while GPUs process massive amounts of raw data and, through training and inference, provide users with probabilistic predictions, a process users cannot fully control.
In this context, the network of an intelligent computing center must handle the transmission of vast amounts of data and frequent internal data interactions. Therefore, compared to traditional data centers, intelligent computing networks face greater challenges. These challenges not only involve ensuring network stability during high-speed computations and interactions but also require careful planning before construction to maximize the value of the investment.
Why Are Openness and Decoupling Necessary?
When building an intelligent computing network, the focus usually falls on GPUs and supporting hardware. Because the choice of GPUs on the market is limited, builders tend to opt for well-known manufacturers. These manufacturers offer comprehensive product ecosystems covering almost every aspect of an intelligent computing center, including switches, specialized optical modules, GPUs, and servers, and they provide integrated solutions built on these products.
This creates a misconception: many users assume that an intelligent computing center must come from a single vendor as an all-in-one solution. In reality, network and computing infrastructure can be procured separately. Just as with a general-purpose data center, users can purchase servers from one vendor and switches from another, selecting the most advanced products in each category to maximize value.
The Value of Decoupling
1. Leveraging Leading Innovations Across Fields
First, computing and networking are highly complex domains involving components such as GPUs, NICs, optical modules, and switches, each with dozens or even hundreds of manufacturers forming a vast ecosystem. By adopting a decoupled approach, customers can combine cutting-edge AI computing platforms with high-quality network connectivity, producing a superior overall solution for intelligent computing centers. Introducing more suppliers also prevents vendor lock-in and preserves bargaining power, thereby reducing procurement costs.
2. Flexibility and Scalability
Choosing an open architecture when building an intelligent computing network lays a flexible foundation for future development.
Take Ethernet as an example: its ability to integrate with all intelligent computing platforms, combined with its open-standard nature, allows for phased construction of intelligent computing center networks. This ensures seamless interoperability with existing infrastructure while enabling flexible expansion and upgrades to meet new business demands. As technology evolves, this architecture can adapt to future needs, whether by switching vendors or upgrading hardware like CPUs and GPUs, ensuring smooth scalability.
Thus, an open and decoupled intelligent computing network is crucial for the construction of intelligent computing centers and a key driver for the continued advancement of intelligent computing technology.
Deployability
When transitioning from traditional data centers to intelligent computing centers, builders often fall into another misconception: they assume that simply purchasing some hardware boxes, pairing them with a fixed-architecture framework, and connecting servers to the leaf layer is sufficient. However, intelligent computing centers differ fundamentally from general-purpose data centers. They not only require high-performance hardware but also demand careful consideration of future-proof architecture design.
Take hardware as an example: intelligent computing centers often incorporate cutting-edge equipment such as high-performance GPUs, high-speed optical modules, and switches supporting 200G/400G ports. The complexity of technology selection increases significantly, leading some users to prioritize performance while overlooking the deployability of the intelligent computing center itself.
Extensive experience and numerous cases demonstrate that merely stacking equipment does not create an efficient intelligent computing network; it requires in-depth planning, design, and implementation.
1. Network Scale and Equipment Selection
Given their unique traffic patterns, intelligent computing networks typically adopt a 1:1 non-blocking fat-tree architecture, which differs markedly from the oversubscribed architectures used in traditional data centers. The core characteristic of this architecture is that its capacity is tied to the port count of individual devices: the more ports a switch has, the larger the network can scale, with total capacity growing cubically in port count. A widely adopted approach is to use uniformly configured box switches (e.g., 64-port 400G products) to build two- or three-tier network structures for large-scale deployments. In such three-tier architectures, Core, Spine, and Leaf layers form multiple PODs, with Spine and Leaf serving as intra-POD nodes and Core handling inter-POD connectivity. According to the fat-tree scaling model, a single POD can connect K²/4 servers, while the full network can connect K³/4 (where K is the number of ports per switch).
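As a quick illustration of this scaling model, the Python sketch below (illustrative only) computes POD and total capacities for a few common port counts; with 64-port switches, it gives 1,024 servers per POD and 65,536 servers in total.

```python
# Capacity of a 1:1 non-blocking fat-tree built from uniform k-port switches,
# following the scaling model above: k^2/4 servers per POD, k^3/4 in total.

def fat_tree_capacity(k: int) -> tuple[int, int]:
    """Return (servers per POD, total servers) for k-port switches."""
    if k % 2:
        raise ValueError("k must be even: half the ports face down, half up")
    per_pod = (k // 2) ** 2   # k/2 leaf switches per POD x k/2 server ports each
    total = k * per_pod       # k PODs in a full three-tier fat-tree = k**3 / 4
    return per_pod, total

for k in (32, 64, 128):
    per_pod, total = fat_tree_capacity(k)
    print(f"{k:>3}-port switches: {per_pod:>6} servers/POD, {total:>9} total")
```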
During the initial planning phase, intelligent computing networks must account for future scaling needs and select network equipment accordingly to ensure the architecture remains flexible and upgradable, adapting to evolving business demands and technological innovations.
2. Installation Environment and Power Consumption
Following the principle that port count correlates with scalability, chassis switches, which can support hundreds of ports, offer better scalability than box switches at the same cost. However, power consumption is a critical concern. When a chassis switch is fully populated with 400G ports, its peak power draw can reach 20 kilowatts, far exceeding the roughly 10 kilowatts per cabinet that traditional data center racks are designed for. By contrast, even a high-performance server equipped with eight GPUs typically consumes less than 10 kilowatts. If large-scale power infrastructure upgrades are required just to accommodate network equipment, the cost advantage vanishes.
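As a back-of-envelope check on these figures, the sketch below compares each device's peak draw against a 10 kW rack budget; the chassis and server numbers come from the text above, while the box-switch figure is a hypothetical placeholder for comparison.

```python
# Back-of-envelope rack power check using the illustrative figures above.
RACK_BUDGET_KW = 10.0  # typical per-cabinet limit in a traditional data center

devices = {
    "fully populated 400G chassis switch": 20.0,  # peak draw cited above
    "eight-GPU server": 9.5,                      # "typically less than 10 kW"
    "64-port 400G box switch": 2.0,               # hypothetical figure
}

for name, kw in devices.items():
    verdict = "fits the rack budget" if kw <= RACK_BUDGET_KW else "needs a power retrofit"
    print(f"{name}: {kw:.1f} kW -> {verdict}")
```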
On the other hand, while an all-box architecture can theoretically support larger intelligent computing networks, not all customers require such massive deployments; enterprise-built intelligent computing centers, for example, are often smaller than those of internet companies. For smaller-scale projects, a one-tier architecture built from one or a few chassis switches may be more cost-effective and efficient, meeting current business needs without unnecessary energy waste or costly retrofits.
Therefore, selecting the appropriate switch type (chassis or box) based on actual network scale is crucial. When designing the network, a balance must be struck between scalability and power constraints.
3. Optical Module Interoperability
In some intelligent computing center layouts, a single rack may house only one GPU server, so the distance between switches and the servers they connect can vary significantly. Multiple types of optical modules may therefore be needed to accommodate different cabling requirements. In addition, 400G ports may be broken out into multiple lower-rate links, and interoperability issues can arise between the QSFP112 and QSFP-DD form factors; these complexities must be carefully considered during the planning phase.
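During planning, it can help to tabulate candidate link types by reach. The sketch below is purely illustrative: the reach values are typical class figures for common 400G options (passive DAC copper, SR8 on multimode fiber, DR4/FR4 on single-mode fiber) and must be confirmed against vendor datasheets, as must QSFP112-to-QSFP-DD interoperability for any breakout combination.

```python
# Illustrative picker: choose a 400G link type by switch-to-server distance.
# Reach values are typical class figures, not guarantees; always confirm
# against vendor datasheets and validated interoperability matrices.
LINK_OPTIONS = [
    (3,    "passive DAC (copper)"),
    (100,  "400G-SR8 over multimode fiber"),
    (500,  "400G-DR4 over single-mode fiber"),
    (2000, "400G-FR4 over single-mode fiber"),
]

def pick_link(distance_m: float) -> str:
    for max_reach_m, option in LINK_OPTIONS:
        if distance_m <= max_reach_m:
            return option
    return "longer-reach optics required (e.g., LR4)"

for d in (2, 40, 300, 1500):
    print(f"{d:>5} m -> {pick_link(d)}")
```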
For decisions regarding scale and architecture alignment, customers can seek assistance from professional network vendors for detailed planning. The key lies in selecting equipment with diverse form factors and support for open protocols, enabling flexible network architectures that ensure both openness and high deployability.
Take H3C’s switch products as an example: their diversity of form factors, protocol openness, and architectural flexibility have set industry benchmarks, demonstrating exceptional adaptability in demanding intelligent computing scenarios. H3C offers a comprehensive portfolio of box and chassis products supporting rates from 100G and 200G to 400G and 800G, as well as innovative Diversified Dynamic-connectivity architecture solutions, catering to intelligent computing centers of all scales and deployment environments.
H3C adheres to a philosophy of open ecosystems and collaborative development, integrating the strengths of major switch chip manufacturers and leveraging the standardized RoCE protocol to deliver lossless network solutions. Its products also provide standard NETCONF interfaces for seamless integration with third-party management systems such as SDN controllers and cloud platforms, broadening the application scenarios and customer environments they can serve. In terms of hardware compatibility, H3C has pre-validated end-to-end connectivity with mainstream GPUs, NICs, and optical modules, so customers can deploy with confidence, free from compatibility concerns.
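As one illustration of that integration path, the minimal sketch below uses the open-source Python ncclient library to open a NETCONF session and retrieve the running configuration. The address and credentials are placeholders, and this is a generic NETCONF example rather than an H3C-specific procedure.

```python
# Minimal NETCONF sketch using the open-source ncclient library:
# open a session and pull the running configuration from a switch.
# Host and credentials below are placeholders; adjust per device.
from ncclient import manager

with manager.connect(
    host="192.0.2.1",      # placeholder management address
    port=830,              # standard NETCONF-over-SSH port
    username="admin",
    password="your-password",
    hostkey_verify=False,  # acceptable for a lab sketch, not for production
) as m:
    # The server advertises its capabilities during the NETCONF hello exchange.
    print("\n".join(c for c in m.server_capabilities if "netconf" in c))
    config = m.get_config(source="running")
    print(config.data_xml[:500])  # first part of the running configuration
```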