
Cerebras and Core42 Launch OpenAI’s gpt-oss-120B at Record Speeds in 2025

August 29, 2025

Cerebras and Core42 announced the global availability of OpenAI’s gpt-oss-120B on August 28, 2025, delivering record-breaking inference speeds of 3,000 tokens per second via Core42’s AI Cloud and Compass API. The collaboration gives enterprises, researchers, and governments scalable, high-performance AI for real-time reasoning and agentic workloads.

Quick Intel

  • Cerebras and Core42 launch OpenAI’s gpt-oss-120B globally on August 28, 2025.

  • Achieves 3,000 tokens/second, verified by Artificial Analysis, surpassing GPU providers.

  • Pricing: $0.25/M input tokens, $0.69/M output tokens, with 128K context support.

  • Powered by Cerebras’ CS-3 system and wafer-scale engine (WSE) for ultra-low latency.

  • Core42’s AI Cloud enables seamless integration for enterprise-scale AI applications.

  • Supports semantic search, code execution, automation, and decision intelligence.
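Given the published rates, a quick back-of-envelope estimate shows what a workload would cost. The rates below come from the announcement; the workload sizes are illustrative assumptions, not figures from the article:

```python
# Published rates from the announcement: $0.25 per million input tokens,
# $0.69 per million output tokens.
INPUT_RATE = 0.25 / 1_000_000   # USD per input token
OUTPUT_RATE = 0.69 / 1_000_000  # USD per output token

def inference_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost for one workload at the published rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Illustrative workload (sizes are made up): 2M input tokens, 500K output tokens.
cost = inference_cost(2_000_000, 500_000)  # roughly $0.85
```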

Unprecedented AI Performance

Cerebras, the world’s fastest AI provider, and Core42, a G42 company specializing in sovereign cloud and AI infrastructure, have partnered to deliver OpenAI’s gpt-oss-120B at unmatched speeds. “Together with Cerebras and Core42, we’re making our best and most usable open model available at unprecedented speed and scale,” said Trevor Cai, Head of Infrastructure at OpenAI. The collaboration leverages Cerebras’ CS-3 system and wafer-scale engine (WSE), achieving 3,000 tokens per second—outpacing NVIDIA’s Blackwell DGX B200 (900 tokens/second) by over 3x in single-user tests.
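The “over 3x” claim can be sanity-checked from the two throughput figures quoted above (the 10,000-token response length is an illustrative assumption):

```python
# Throughput figures quoted in the article (single-user benchmarks).
cerebras_tps = 3000    # Cerebras CS-3, tokens/second
blackwell_tps = 900    # NVIDIA Blackwell DGX B200, tokens/second

speedup = cerebras_tps / blackwell_tps   # ~3.33x, i.e. "over 3x"

# Wall-clock time to stream a 10,000-token response at each rate:
t_cerebras = 10_000 / cerebras_tps       # ~3.3 seconds
t_blackwell = 10_000 / blackwell_tps     # ~11.1 seconds
```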

Key Features and Capabilities

  • Industry-Leading Speed: gpt-oss-120B delivers 3,000 tokens/second, enabling real-time applications like live coding assistants and instant document Q&A.

  • Long-Context Understanding: Supports 128K token context for complex, multi-turn reasoning.

  • Cost Efficiency: Priced at $0.25/M input tokens and $0.69/M output tokens, offering an 8.4x price-performance advantage over the median GPU cloud.

  • Seamless Integration: Core42’s Compass API allows instant API access, with no refactoring needed for OpenAI endpoint users.

  • Enterprise Scalability: Supports agentic AI for semantic search, code execution, and decision intelligence, scalable from experimentation to production.
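Because the Compass API exposes an OpenAI-compatible endpoint, existing client code needs only a new base URL and key. The sketch below builds such a request with the standard library; the endpoint URL and model identifier are assumptions for illustration, so consult Core42’s Compass API documentation for the actual values:

```python
import json
import urllib.request

# Hypothetical endpoint and model name -- check Core42's docs for real values.
API_URL = "https://api.core42.ai/v1/chat/completions"
API_KEY = "YOUR_API_KEY"

payload = {
    "model": "gpt-oss-120b",
    "messages": [
        {"role": "user", "content": "Summarize wafer-scale inference in two sentences."}
    ],
}

req = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)
# urllib.request.urlopen(req) would send the request; since the endpoint is
# OpenAI-compatible, code written against OpenAI's chat-completions schema
# needs no refactoring beyond the URL and key swap.
```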

Strategic Partnership Impact

“The latest chapter in our ongoing strategic partnership with Core42 now delivers the world’s most capable open-weight models directly into the hands of enterprises,” said Andrew Feldman, CEO of Cerebras. Core42’s AI Cloud ensures compliance and flexibility, while Cerebras’ WSE eliminates GPU bottlenecks, delivering ultra-low latency and deterministic performance. “By running OpenAI gpt-oss on Cerebras hardware within Core42’s AI Cloud, we are setting a new benchmark for performance, flexibility, and compliance,” said Kiril Evtimov, CEO of Core42.

Industry Significance

The gpt-oss-120B model, a 120B-parameter mixture-of-experts with 128 experts across 36 layers, rivals proprietary models like Gemini 2.5 Flash and Claude Opus 4 in math, science, and coding tasks. Its Apache 2.0 license enables fine-tuning and on-premises deployment, critical for sensitive data. The partnership aligns with the growing demand for agentic AI, projected to drive $1 trillion in economic impact by 2030, offering enterprises unmatched speed and affordability.

Availability and Access

Developers and enterprises can access gpt-oss-120B via Core42’s AI Cloud at https://aicloud.core42.ai or Cerebras Cloud with a free API key at cerebras.ai/openai. The platform supports high-throughput inference for workloads like reasoning and long-context generation, with pricing at $0.25/M input and $0.69/M output tokens.

About Cerebras Systems

Cerebras Systems, powered by its Wafer-Scale Engine-3 and CS-3 system, delivers the world’s fastest AI inference. Trusted by leading corporations and governments, Cerebras supports open-source models with millions of downloads, simplifying large-scale AI deployments.

About Core42

Core42, a G42 company, provides sovereign cloud and AI infrastructure, empowering enterprises with scalable, compliant solutions. Its Compass API delivers high-performance AI capabilities globally.

  • Tags: Cerebras, gpt-oss-120B, OpenAI, Agentic AI, AI Innovation