NVIDIA has announced its collaboration with OpenAI to bring the new gpt-oss family of open models to consumers, allowing state-of-the-art AI that was once exclusive to cloud data centers to run with incredible speed on RTX-powered PCs and workstations.
The launch ushers in a new generation of faster, smarter on-device AI supercharged by the horsepower of GeForce RTX and NVIDIA RTX PRO GPUs. Two new variants are available, designed to serve the entire ecosystem:
The gpt-oss-20b model is optimized to run at peak performance on NVIDIA RTX AI PCs with at least 16GB of VRAM, delivering up to 250 tokens per second on an RTX 5090 GPU. The larger gpt-oss-120b model is supported on professional workstations accelerated by NVIDIA RTX PRO GPUs.
Anyone can use the models to develop breakthrough applications in generative, reasoning and physical AI, as well as in fields such as healthcare and manufacturing, or even to unlock new industries as the next industrial revolution driven by AI continues to unfold.
OpenAI’s new flexible, open-weight text-reasoning large language models (LLMs) were trained on NVIDIA H100 GPUs and run inference best on the hundreds of millions of GPUs running the NVIDIA CUDA platform across the globe.
The models are now available as NVIDIA NIM microservices, offering easy deployment on any GPU-accelerated infrastructure with flexibility, data privacy and enterprise-grade security.
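NIM microservices expose an OpenAI-compatible HTTP API, so applications can talk to a locally deployed gpt-oss model with standard tooling. The sketch below is a minimal illustration, assuming a NIM container is serving the chat-completions endpoint on localhost port 8000; the URL and the model identifier are placeholders, and the exact name served should be confirmed via the deployment's model-listing endpoint.

```python
import json
import urllib.request

# Hypothetical local endpoint: NIM containers expose an OpenAI-compatible
# API; the port and path here are assumptions for illustration.
NIM_URL = "http://localhost:8000/v1/chat/completions"


def build_request(prompt: str, model: str = "openai/gpt-oss-20b") -> dict:
    """Build an OpenAI-compatible chat-completions payload.

    The model identifier is an assumption; check the deployed
    microservice for the exact name it serves.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
        "temperature": 0.7,
    }


def send_request(payload: dict) -> dict:
    """POST the payload to the local endpoint and return the parsed reply."""
    req = urllib.request.Request(
        NIM_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    payload = build_request("Summarize mixture-of-experts models in one sentence.")
    # Requires a running NIM container; uncomment to send:
    # reply = send_request(payload)
    # print(reply["choices"][0]["message"]["content"])
    print(json.dumps(payload, indent=2))
```

Because the interface follows the OpenAI chat-completions shape, existing client libraries can typically be pointed at the local endpoint by overriding their base URL, keeping application code unchanged between cloud and on-premises deployments.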
With software optimizations for the NVIDIA Blackwell platform, the models offer optimal inference on NVIDIA GB200 NVL72 systems, achieving 1.5 million tokens per second and driving massive efficiency for inference at scale.