NVIDIA announced the Physical AI Data Factory Blueprint at its GTC conference. The blueprint is an open reference architecture that automates data generation and evaluation. Through modular workflows, it helps developers turn raw data into high-quality training sets, reducing the cost and technical barriers of developing robots, visual AI agents, and autonomous vehicles at scale.
What is the Physical AI Data Factory Blueprint?
Physical AI depends on data volume and model capacity growing together. The blueprint provides a common standard that consolidates previously fragmented data-processing steps into dedicated automated pipelines. Cosmos Curator ingests, refines, and annotates large volumes of real-world and synthetic data. Cosmos Transfer then multiplies and diversifies the curated data, simulating different environments, lighting conditions, and rare edge cases that are difficult to capture in reality. Finally, evaluators driven by Cosmos Reason automatically score and verify the generated data for physical accuracy and training readiness, replacing the inefficient manual filtering that was previously required.
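The curate, augment, and evaluate stages described above can be sketched as a simple three-step pipeline. All class, function, and parameter names below are hypothetical illustrations of the flow, not actual Cosmos Curator, Cosmos Transfer, or Cosmos Reason APIs:

```python
# Hypothetical sketch of a curate -> augment -> evaluate data pipeline.
# None of these names are real Cosmos APIs; they only illustrate the flow.
from dataclasses import dataclass, field

@dataclass
class Clip:
    source: str                                   # raw real-world or synthetic recording
    annotations: dict = field(default_factory=dict)
    variants: list = field(default_factory=list)  # augmented versions of the clip
    score: float = 0.0                            # evaluator score

def curate(raw_clips):
    """Stage 1: filter and annotate raw data (the Cosmos Curator role)."""
    kept = []
    for clip in raw_clips:
        if clip.source:                           # stand-in for real quality checks
            clip.annotations["labeled"] = True
            kept.append(clip)
    return kept

def augment(clips, conditions):
    """Stage 2: expand each clip across environments, lighting, and
    edge cases (the Cosmos Transfer role)."""
    for clip in clips:
        clip.variants = [f"{clip.source}@{c}" for c in conditions]
    return clips

def evaluate(clips, threshold=0.8):
    """Stage 3: score generated data and keep what passes
    (the Cosmos Reason evaluator role)."""
    for clip in clips:
        # Placeholder scorer; a real evaluator would check physical accuracy.
        clip.score = 0.9 if clip.annotations.get("labeled") else 0.1
    return [c for c in clips if c.score >= threshold]

raw = [Clip("drive_001.mp4"), Clip("")]           # second clip fails curation
training_set = evaluate(augment(curate(raw), ["night", "rain", "fog"]))
print(len(training_set))                          # -> 1
print(training_set[0].variants)
```

The key design point the blueprint automates is the final evaluation gate: only data that passes a physical-accuracy score reaches the training set, so the pipeline can expand data aggressively without degrading quality.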
To support large-scale computing needs, NVIDIA is collaborating with cloud service providers including Microsoft Azure and Nebius to integrate the blueprint into existing cloud infrastructure. Microsoft Azure incorporates the architecture into its open physical AI toolchain on GitHub, combining IoT operations and real-time intelligence services to offer enterprise-grade agent-driven workflows. Nebius has integrated the OSMO orchestration framework into its AI cloud platform with support for RTX PRO 6000 Blackwell Server Edition GPUs, giving developers an end-to-end stack from data management and annotation to serverless inference and hosting. This lets developers convert accelerated computing power directly into large volumes of training data, speeding up autonomous system development.
NVIDIA OSMO: Open-Source Framework for Workflow Orchestration and Coding Agents
For development teams that lack large-scale AI infrastructure management capabilities, the Physical AI Data Factory introduces NVIDIA's open-source OSMO orchestration framework. OSMO's core function is managing complex workflows across computing environments, reducing manual intervention so technical teams can focus on model optimization. OSMO has already integrated with coding agents such as Claude, OpenAI Codex, and Cursor, enabling agent-native operation: agents proactively manage computing resources, identify and eliminate system bottlenecks, and shorten the cycle from model development to deployment.
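At its core, managing a workflow like this means resolving task dependencies and dispatching each step to the right compute pool. The sketch below illustrates that idea with a minimal dependency resolver (Kahn's topological sort); the Task and run_order names are illustrative assumptions, not OSMO's actual API:

```python
# Hypothetical sketch of multi-stage workflow orchestration across compute
# pools, in the spirit of what an orchestrator like OSMO automates.
# The Task class and run_order function are illustrative, not OSMO's API.
from collections import deque

class Task:
    def __init__(self, name, pool, deps=()):
        self.name = name          # workflow step, e.g. "curate" or "train"
        self.pool = pool          # target environment, e.g. "gpu-cluster"
        self.deps = list(deps)    # tasks that must finish first

def run_order(tasks):
    """Resolve dependencies into an execution order (Kahn's algorithm)."""
    pending = {t.name: set(t.deps) for t in tasks}
    ready = deque(name for name, deps in pending.items() if not deps)
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for other, deps in pending.items():
            if name in deps:
                deps.remove(name)
                if not deps and other not in order and other not in ready:
                    ready.append(other)
    if len(order) != len(tasks):
        raise ValueError("cyclic dependency in workflow")
    return order

workflow = [
    Task("curate", pool="cpu-cluster"),
    Task("augment", pool="gpu-cluster", deps=["curate"]),
    Task("evaluate", pool="gpu-cluster", deps=["augment"]),
    Task("train", pool="gpu-cluster", deps=["evaluate"]),
]
print(run_order(workflow))  # -> ['curate', 'augment', 'evaluate', 'train']
```

In a real orchestrator, each resolved task would be dispatched to its target pool and monitored; the coding-agent integration the article describes layers on top of this, with agents submitting and tuning such workflows rather than humans doing it by hand.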
Several physical AI developers have already begun adopting the blueprint across a range of fields. Skild AI uses the architecture to develop general-purpose robot foundation models; Uber applies it to autonomous vehicle research and validation. Companies including FieldAI, Hexagon Robotics, BMW, Linker Vision, Milestone Systems, and Teradyne Robotics are also leveraging the blueprint to strengthen data production in their perception, mobility, and reinforcement learning pipelines.
This article, GTC 2026: NVIDIA Unveils Physical AI Data Factory to Accelerate Robot and Self-Driving Car Development, first appeared on Chain News ABMedia.