Spectrum-X Networking Platform Administration
Private Instructor-led Remote training
Course Duration: 3 sessions of 4 hours each
Our hands-on training course explores the architecture, deployment, configuration, operation, and management of NVIDIA Spectrum-X networking platforms for AI factories.
Participants will gain practical experience provisioning and monitoring AI clusters using Spectrum-X, NetQ, and Cumulus Linux through instructor-led sessions and labs in the NVIDIA Air environment.
• Explain the fundamentals of NVIDIA Spectrum-X Networking Platform, including its
architecture, key components, and reference design for AI environments.
• Gain hands-on experience with the NVIDIA Air environment for simulating and testing
Spectrum-X deployments.
• Deploy the Spectrum-X platform, including IP addressing, QoS configurations, routing
policies, and virtualized network setup for multi-tenancy.
• Apply advanced networking concepts such as RoCE, Adaptive Routing, and
Congestion Control in the context of AI workloads.
• Monitor and troubleshoot the Spectrum-X fabric using NVIDIA NetQ and the Cumulus Linux
CLI.
Day 1
Introduction to Spectrum-X Networking Platform
• Unit 1 - Spectrum-X Networking Platform Overview
• Unit 2 - Architecture Overview
• Unit 3 - Reference Architecture
• Unit 4 - NVIDIA Digital Twins with Air environment
• Practice 1: Accessing the Air environment
Day 2
Spectrum-X Platform Deployment
• Unit 5 - Deployment Guide
o IP Addressing Overview
o QoS: RoCE, Adaptive Routing, and Congestion Control
o Routing Policies
o Underlay Network
o Virtualized Network and Multitenancy
• Practice 2: Deploying the Spectrum-X Platform
Day 3
Monitoring and Troubleshooting
• Unit 6 - Spectrum-X Fabric Telemetry with NetQ
o NetQ features
o Installing and configuring the NetQ agent
o Validation checks for network health
o Fabric Monitoring Methods:
▪ ASIC monitoring tools
▪ OTLP (OpenTelemetry Protocol)
▪ DTS – DOCA Telemetry Service
• Practice 3: Managing fabric telemetry with NetQ
• Practice 4: Troubleshooting Spectrum-X platform deployment
Refer to the course outline or Learning Objectives for more details.
The course is designed for network administrators, DevOps professionals, and others in
IT-related roles who want to gain the knowledge and skills necessary to deploy and maintain
AI data centers built on the Spectrum-X networking platform.
• Knowledge of networking concepts and principles, including technologies used in
data centers and high-performance computing environments.
• Basic understanding of artificial intelligence (AI) concepts and terminology. This may
include knowledge of topics such as machine learning, deep learning, neural
networks, and common AI applications.
• Practical experience in configuring and managing Cumulus Linux-based network
environments.
• Knowledge equivalent to the “Cumulus Linux Professional” course.
• Familiarity with installing DOCA OFED on the host.
• Knowledge equivalent to the “AI for All: From Basics to GenAI Practice” course.