Colette Kress
Executive Vice President and Chief Financial Officer at NVIDIA
Thanks, Simona. Q3 was another record quarter. Revenue of $18.1 billion was up 34% sequentially and up more than 200% year-on-year and well above our outlook for $16 billion.
Starting with Data Center. The continued ramp of the NVIDIA HGX platform based on our Hopper Tensor Core GPU architecture, along with InfiniBand end-to-end networking, drove record revenue of $14.5 billion, up 41% sequentially and up 279% year-on-year. NVIDIA HGX with InfiniBand together are essentially the reference architecture for AI supercomputers and data center infrastructures. Some of the most exciting generative AI applications are built and run on NVIDIA, including Adobe Firefly, ChatGPT, Microsoft 365 Copilot, CoAssist, Now Assist with ServiceNow and Zoom AI Companion. Our Data Center compute revenue quadrupled from last year and networking revenue nearly tripled.
Investments in infrastructure for training and inferencing large language models, deep learning recommender systems and generative AI applications are fueling strong broad-based demand for NVIDIA accelerated computing. Inferencing is now a major workload for NVIDIA AI computing. Consumer Internet companies and enterprises drove exceptional sequential growth in Q3, comprising approximately half of our Data Center revenue, and outpacing total growth. Companies like Meta are in full production with deep learning recommender systems and also investing in generative AI to help advertisers optimize images and text.
Most major consumer Internet companies are racing to ramp up generative AI deployment. The enterprise wave of AI adoption is now beginning. Enterprise software companies such as Adobe, Databricks, Snowflake and ServiceNow are adding AI copilots and assistants to their platforms. And broader enterprises are developing custom AI for vertical industry applications such as Tesla in autonomous driving.
Cloud service providers drove roughly the other half of our Data Center revenue in the quarter. Demand was strong from all hyperscale CSPs, as well as from a broadening set of GPU-specialized CSPs globally that are rapidly growing to address the new market opportunities in AI. NVIDIA H100 Tensor Core GPU instances are now generally available in virtually every cloud with instances in high demand. We have significantly increased supply every quarter this year to meet strong demand and expect to continue to do so next year. We will also have a broader and faster product launch cadence to meet the growing and diverse set of AI opportunities.
Towards the end of the quarter, the US government announced a new set of export control regulations for China and other markets, including Vietnam and certain countries in the Middle East. These regulations require licenses for the export of a number of our products, including our Hopper and Ampere 100 and 800 series and several others. Our sales to China and other affected destinations derived from products that are now subject to licensing requirements have consistently contributed approximately 20% to 25% of Data Center revenue over the past few quarters. We expect that our sales to these destinations will decline significantly in the fourth quarter, though we believe the decline will be more than offset by strong growth in other regions.
The US government designed the regulation to allow the US industry to provide data center compute products to markets worldwide, including China. Continuing to compete worldwide, as the regulations encourage, promotes US technology leadership, spurs economic growth and supports US jobs. For the highest performance levels, the government requires licenses. For lower performance levels, the government requires a streamlined prior notification process. And for products at even lower performance levels, the government does not require any notice at all.
Following the government's clear guidelines, we are working to expand our Data Center product portfolio to offer compliant solutions for each regulatory category, including products for which the US government does not wish to have advance notice before each shipment. We are working with some customers in China and the Middle East to pursue licenses from the US government. It is too early to know whether these will be granted for any significant amount of revenue.
Many countries are awakening to the need to invest in sovereign AI infrastructure to support economic growth and industrial innovation. With investments in domestic compute capacity, nations can use their own data to train LLMs and support their local generative AI ecosystems. For example, we are working with India's government and largest tech companies including Infosys, Reliance and Tata to boost their sovereign AI infrastructure. And French private cloud provider Scaleway is building a regional AI cloud based on NVIDIA H100, InfiniBand and NVIDIA AI Enterprise software to fuel advancement across France and Europe. National investment in compute capacity is a new economic imperative and serving the sovereign AI infrastructure market represents a multi-billion dollar opportunity over the next few years.
Now, from a product perspective, the vast majority of revenue in Q3 was driven by the NVIDIA HGX platform based on our Hopper GPU architecture with lower contribution from the prior generation Ampere GPU architecture. The new L40S GPU built for industry-standard servers began to ship, supporting training and inference workloads across a variety of customers. This was also the first revenue quarter of our GH200 Grace Hopper Superchip, which combines our Arm-based Grace CPU with a Hopper GPU. Grace and Grace Hopper are ramping into a new multi-billion dollar product line. Grace Hopper instances are now available at GPU-specialized cloud providers, and coming soon to Oracle Cloud.
Grace Hopper is also getting significant traction with supercomputing customers. Initial shipments to Los Alamos National Lab and the Swiss National Supercomputing Center took place in the third quarter. The UK government announced it will build one of the world's fastest AI supercomputers called Isambard-AI with almost 5,500 Grace Hopper Superchips. German supercomputing center, Julich, also announced that it will build its next-generation AI supercomputer with close to 24,000 Grace Hopper Superchips and Quantum-2 InfiniBand, making it the world's most powerful AI supercomputer with over 90 exaflops of AI performance. All-in, we estimate that the combined AI compute capacity of all the supercomputers built on Grace Hopper across the US, Europe and Japan next year will exceed 200 exaflops with more wins to come.
Inference is contributing significantly to our data center demand, as AI is now in full production for deep learning recommenders, chatbots, copilots and text-to-image generation. And this is just the beginning. NVIDIA AI offers the best inference performance and versatility, and thus the lower power and cost of ownership. We are also driving a fast cost reduction curve. With the release of TensorRT-LLM, we have now achieved more than 2x the inference performance, or half the cost, of inferencing LLMs on NVIDIA GPUs.
We also announced the latest member of the Hopper family, the H200, which will be the first GPU to offer HBM3e, faster, larger memory to further accelerate generative AI and LLMs. It boosts inference speed up to another 2x compared to H100 GPUs for running LLMs like Llama 2. Combined, TensorRT-LLM and H200 increase performance or reduce cost by 4x in just one year, without our customers changing their stack. This is a benefit of CUDA and our architecture compatibility.
Compared to the A100, H200 delivers an 18x performance increase for inferencing models like GPT-3, allowing customers to move to larger models with no increase in latency. Amazon Web Services, Google Cloud, Microsoft Azure and Oracle Cloud will be among the first CSPs to offer H200-based instances starting next year.
At last week's Microsoft Ignite, we deepened and expanded our collaboration with Microsoft across the entire stack. We introduced an AI foundry service for the development and tuning of custom generative AI enterprise applications running on Azure. Customers can bring their domain knowledge and proprietary data and we help them build their AI models using our AI expertise and software stack in our DGX Cloud, all with enterprise-grade security and support. SAP and Amdocs are the first customers of the NVIDIA AI foundry service on Microsoft Azure. In addition, Microsoft will launch new confidential computing instances based on the H100.
The H100 remains the top-performing and most versatile platform for AI training, by a wide margin, as shown in the latest MLPerf industry benchmark results. Our training cluster included more than 10,000 H100 GPUs or 3x more than in June, reflecting very efficient scaling. Efficient scaling is a key requirement in generative AI, because LLMs are growing by an order of magnitude every year. Microsoft Azure achieved similar results on a nearly identical cluster, demonstrating the efficiency of NVIDIA AI in public cloud deployments.
Networking now exceeds a $10 billion annualized revenue run rate. Strong growth was driven by exceptional demand for InfiniBand, which grew fivefold year-on-year. InfiniBand is critical to gaining the scale and performance needed for training LLMs. Microsoft made this very point last week, highlighting that Azure uses over 29,000 miles of InfiniBand cabling, enough to circle the globe.
We are expanding NVIDIA networking into the Ethernet space. Our new Spectrum-X end-to-end Ethernet offering with technologies purpose-built for AI will be available in Q1 next year with support from leading OEMs, including Dell, HPE and Lenovo. Spectrum-X can achieve 1.6x higher networking performance for AI communication compared to traditional Ethernet offerings.
Let me also provide an update on our software and services offerings, where we are starting to see excellent adoption. We are on track to exit the year at an annualized revenue run rate of $1 billion for our recurring software, support and services offerings. We see two primary opportunities for growth over the intermediate term, with our DGX Cloud service and with our NVIDIA AI Enterprise software. Each reflects the growth of enterprise AI training and enterprise AI inference, respectively.
Our latest DGX Cloud customer announcement was this morning as part of an AI research collaboration with Genentech. The biotechnology pioneer also plans to use our BioNeMo LLM framework to help accelerate and optimize their AI drug discovery platform. We now have enterprise AI partnerships with Adobe, Dropbox, Getty, SAP, ServiceNow, Snowflake and others to come.
Okay. Moving to Gaming. Gaming revenue of $2.86 billion was up 15% sequentially and up more than 80% year-on-year with strong demand in the important back-to-school shopping season, with NVIDIA RTX ray tracing and AI technology now available at price points as low as $299. We entered the holidays with the best-ever line-up for gamers and creators.
Gaming has doubled relative to pre-COVID levels even against the backdrop of lackluster PC market performance. This reflects the significant value we've brought to the gaming ecosystem with innovations like RTX and DLSS. The number of games and applications supporting these technologies has exploded in that period, driving upgrades and attracting new buyers. The RTX ecosystem continues to grow. There are now over 475 RTX-enabled games and applications.
Generative AI is quickly emerging as the new killer app for high-performance PCs. NVIDIA RTX GPUs define the most performant AI PCs and workstations. We just released TensorRT-LLM for Windows, which speeds on-device LLM inference up by 4x. With an installed base of over 100 million, NVIDIA RTX is the natural platform for AI application developers.
Finally, our GeForce NOW cloud gaming service continues to build momentum. Its library of PC games surpassed 1,700 titles, including the launches of Alan Wake 2, Baldur's Gate 3, Cyberpunk 2077: Phantom Liberty, and Starfield.
Moving to ProViz. Revenue of $416 million was up 10% sequentially and up 108% year-on-year. NVIDIA RTX is the workstation platform of choice for professional design, engineering and simulation use cases and AI is emerging as a powerful demand driver. Early applications include inference for AI imaging in healthcare and edge AI in smart spaces and the public sector.
We launched a new line of desktop workstations based on NVIDIA RTX Ada Lovelace generation GPUs and ConnectX SmartNICs, offering up to 2x the AI processing, ray tracing and graphics performance of the previous generations. These powerful new workstations are optimized for AI workloads such as fine-tuning AI models, training smaller models and running inference locally.
We continue to make progress on Omniverse, our software platform for designing, building and operating 3D virtual worlds. Mercedes-Benz is using Omniverse-powered digital twins to plan, design, build and operate its manufacturing and assembly facilities, helping it increase efficiency and reduce defects. Foxconn is also incorporating Omniverse into its manufacturing process, including end-to-end simulation for the entire robotics and automation pipeline, saving time and cost. We announced two new Omniverse Cloud services for automotive digitalization available on Microsoft Azure, a virtual factory simulation engine and an autonomous vehicle simulation engine.
Moving to Automotive. Revenue was $261 million, up 3% sequentially and up 4% year-on-year, primarily driven by continued growth in self-driving platforms based on NVIDIA DRIVE Orin SOC and the ramp of AI cockpit solutions with global OEM customers. We extended our automotive partnership with Foxconn to include NVIDIA DRIVE Thor, our next-generation automotive SOC. Foxconn has become the ODM for EVs. Our partnership provides Foxconn with a standard AV sensor and computing platform for their customers to easily build a state-of-the-art safe and secure software-defined car.
Now we're going to move to the rest of the P&L. GAAP gross margin expanded to 74% and non-GAAP gross margin to 75%, driven by higher Data Center sales and lower net inventory reserve, including a 1 percentage point benefit from the release of previously-reserved inventory related to the Ampere GPU architecture products. Sequentially, GAAP operating expenses were up 12% and non-GAAP operating expenses were up 10%, primarily reflecting increased compensation and benefits.
Let me turn to the fourth quarter of fiscal 2024. Total revenue is expected to be $20 billion, plus or minus 2%. We expect strong sequential growth to be driven by Data Center, with continued strong demand for both compute and networking. Gaming will likely decline sequentially, as it is now more aligned with notebook seasonality. GAAP and non-GAAP gross margins are expected to be 74.5% and 75.5%, respectively, plus or minus 50 basis points. GAAP and non-GAAP operating expenses are expected to be approximately $3.17 billion and $2.2 billion, respectively. GAAP and non-GAAP other income and expenses are expected to be an income of approximately $200 million, excluding gains and losses from non-affiliated investments. GAAP and non-GAAP tax rates are expected to be 15%, plus or minus 1%, excluding any discrete items.
Further financial information is included in the CFO commentary and other information available on our IR website.
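[Editor's note: the guidance figures above imply ranges that can be worked out directly. The following is a minimal Python sketch, not part of the prepared remarks. The helper name `band` and the variable names are hypothetical, and it assumes "plus or minus 2%" is taken relative to the $20 billion midpoint and that 50 basis points equals 0.5 percentage points.]

```python
def band(midpoint, tolerance):
    """Return the (low, high) range around a guidance midpoint."""
    return (midpoint - tolerance, midpoint + tolerance)

# Revenue guidance: $20 billion, plus or minus 2% of the midpoint (assumption).
revenue_mid_bn = 20.0
revenue_range_bn = band(revenue_mid_bn, revenue_mid_bn * 0.02)

# Gross margin guidance: plus or minus 50 basis points = 0.5 percentage points.
gaap_gm_range = band(74.5, 0.5)
non_gaap_gm_range = band(75.5, 0.5)

print(f"Revenue: ${revenue_range_bn[0]:.1f}B to ${revenue_range_bn[1]:.1f}B")
print(f"GAAP gross margin: {gaap_gm_range[0]:.1f}% to {gaap_gm_range[1]:.1f}%")
print(f"Non-GAAP gross margin: {non_gaap_gm_range[0]:.1f}% to {non_gaap_gm_range[1]:.1f}%")
```

[Under these assumptions, the revenue range is $19.6 billion to $20.4 billion, and the gross margin ranges are 74.0% to 75.0% (GAAP) and 75.0% to 76.0% (non-GAAP).]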
In closing, let me highlight some upcoming events for the financial community. We will attend the UBS Global Technology Conference in Scottsdale, Arizona, on November 28th; the Wells Fargo TMT Summit in Rancho Palos Verdes, California on November 29th; the Arete Virtual Tech Conference on December 7th; and the J.P. Morgan Health Care Conference in San Francisco on January 8th. Our earnings call to discuss the results of our fourth quarter and fiscal 2024 is scheduled for Wednesday, February 21st.
We will now open the call for questions. Operator, will you please poll for questions.