
Z.ai Releases GLM-4.5, Redefining Open-Source AI with High Performance and Affordability


July 29, 2025

Z.ai, formerly Zhipu, announced the release of its next-generation open-source model, GLM-4.5, on July 28, 2025, marking a significant advancement in China’s AI landscape. Built on a self-developed Mixture of Experts (MoE) architecture, GLM-4.5 achieves state-of-the-art (SOTA) performance among open-source models, offering unparalleled efficiency, accessibility, and affordability for developers and enterprises globally.

Quick Intel

  • Z.ai releases GLM-4.5 and GLM-4.5-Air on July 28, 2025.

  • Flagship GLM-4.5: 355B total parameters (32B active); GLM-4.5-Air: 106B (12B active).

  • SOTA open-source MoE model with native agentic capabilities.

  • Ranks third globally across 12 benchmarks, first among open-source models.

  • API pricing: $0.11/M input tokens, $0.28/M output tokens; >100 tokens/sec generation.

  • Available under MIT license on Hugging Face and Z.ai platforms.

Breakthrough Performance and Efficiency

GLM-4.5, with 355 billion total parameters (32B active), and its lighter counterpart, GLM-4.5-Air (106B total, 12B active), leverage a deep, narrow MoE architecture optimized with loss-free balance routing, sigmoid gating, and Grouped-Query Attention. Pre-trained on 22 trillion tokens, the models excel in reasoning, coding, and agentic tasks, ranking third globally across 12 benchmarks, behind only xAI’s Grok 4 and OpenAI’s o3. GLM-4.5-Air leads among ~100B parameter models with a 59.8 average benchmark score. Notable results include a 90.6% tool-calling success rate, surpassing Claude 3.5 Sonnet, and an 80.8% win rate in coding tasks against Alibaba’s Qwen3-Coder.
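The article names sigmoid gating and loss-free balance routing without describing them. The sketch below shows one common way such a router can work, assuming the balancing is done with a per-expert bias that affects only which experts are selected, not how their outputs are weighted. The function names, the expert count, the top-k value, and the step size are all hypothetical and are not taken from Z.ai's implementation.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def route_token(logits, bias, top_k):
    """Choose top_k experts for one token.

    logits: per-expert router scores for this token.
    bias:   per-expert balance bias, used for selection only (assumption).
    Returns (expert_indices, gate_weights); the weights sum to 1.
    """
    # Sigmoid gating: each expert gets an independent score in (0, 1).
    scores = [sigmoid(z) for z in logits]
    # Selection uses score + bias, so underused experts can be favored.
    ranked = sorted(range(len(scores)),
                    key=lambda i: scores[i] + bias[i], reverse=True)
    chosen = ranked[:top_k]
    # Gate weights use the unbiased scores, normalized over chosen experts.
    total = sum(scores[i] for i in chosen)
    weights = [scores[i] / total for i in chosen]
    return chosen, weights


def update_bias(bias, load, step=0.01):
    """Lower the bias of overloaded experts and raise it for underloaded ones."""
    mean_load = sum(load) / len(load)
    new_bias = []
    for b, n in zip(bias, load):
        if n > mean_load:
            new_bias.append(b - step)
        elif n < mean_load:
            new_bias.append(b + step)
        else:
            new_bias.append(b)
    return new_bias


if __name__ == "__main__":
    experts, weights = route_token([2.0, 1.0, 0.0, -1.0], [0.0] * 4, top_k=2)
    print(experts, weights)
    print(update_bias([0.0] * 4, load=[3, 1, 0, 0]))
```

Because the bias is adjusted from observed load rather than through an extra loss term, balancing in this sketch does not add a competing gradient to the training objective, which is the usual motivation for "loss-free" schemes.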

The models support a 128K token context window and achieve generation speeds exceeding 100 tokens/sec (up to 200 tokens/sec in high-speed mode), making them suitable for latency-sensitive applications. GLM-4.5 runs on just eight NVIDIA H20 chips, the export-compliant GPUs NVIDIA sells in China, so it can be deployed within U.S. export restrictions while delivering GPT-4-level performance.
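A rough weight-memory estimate helps put the eight-H20 figure in context. In an MoE model every expert must be resident in memory, so the total parameter count, not the active count, sets the footprint. The sketch below counts weights only and ignores the KV cache, activations, and framework overhead; the 96 GB per-GPU figure is an assumption about the H20 and is not stated in the article.

```python
def weight_memory_gb(total_params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    return total_params_billion * bits_per_param / 8


H20_MEMORY_GB = 96           # assumption: per-GPU memory, not from the article
NODE_GB = 8 * H20_MEMORY_GB  # eight GPUs, as described in the article

if __name__ == "__main__":
    for name, params in [("GLM-4.5", 355), ("GLM-4.5-Air", 106)]:
        for bits in (16, 8, 4):
            gb = weight_memory_gb(params, bits)
            fits = gb < NODE_GB
            print(f"{name} at {bits}-bit: {gb:.1f} GB of weights, "
                  f"fits in 8 GPUs: {fits}")
```

Under these assumptions the 16-bit weights of GLM-4.5 come to about 710 GB, which leaves limited headroom for a long-context KV cache on a 768 GB node; lower-precision weights leave considerably more.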

Agent-Native Design for Complex Applications

GLM-4.5 introduces an “agent-native” architecture, integrating reasoning, perception, and action into its core. This enables autonomous multi-step task planning, complex data visualization, and end-to-end workflow management, outperforming competitors like Claude 4 Sonnet on benchmarks such as BrowseComp and AIME24. The hybrid “thinking” and “non-thinking” modes optimize compute usage, balancing deep reasoning with fast responses. Features like Multi-Token Prediction (MTP) enhance inference speed by 2.5–8×, making GLM-4.5 a robust choice for agentic applications like web browsing, coding, and game development.
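The article does not say how Multi-Token Prediction produces its speedup. One plausible reading, assumed here, is that the extra prediction heads draft several tokens that the main model then verifies in a single pass, as in speculative decoding. Under the simplifying assumptions that each drafted token is accepted independently with the same probability and that drafting is free, the expected number of tokens per verification pass gives an upper bound on the speedup. The function name and the parameter values are illustrative.

```python
def expected_tokens_per_step(accept_rate: float, draft_tokens: int) -> float:
    """Expected tokens emitted per verification pass.

    Assumes each draft token is accepted independently with probability
    accept_rate and that generation stops at the first rejection, where
    the verifier still contributes one token of its own.
    """
    if not 0.0 <= accept_rate <= 1.0:
        raise ValueError("accept_rate must be in [0, 1]")
    if accept_rate == 1.0:
        return float(draft_tokens + 1)
    return (1 - accept_rate ** (draft_tokens + 1)) / (1 - accept_rate)


if __name__ == "__main__":
    for k in (1, 3, 7):
        for p in (0.6, 0.8, 0.95):
            print(f"drafts={k} accept={p}: "
                  f"{expected_tokens_per_step(p, k):.2f} tokens/pass")
```

In this model the ceiling is `draft_tokens + 1`, so the upper end of an 8× range would require at least seven drafted tokens with near-perfect acceptance; real speedups are lower once drafting cost is included.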

Accessibility and Affordability

Released under the MIT license, GLM-4.5 and GLM-4.5-Air are fully open-source, with weights available on Hugging Face and ModelScope. Developers can test the models for free via Z.ai’s platforms (chatglm.cn, z.ai) or integrate them through APIs on the BigModel platform. Priced at $0.11 per million input tokens and $0.28 per million output tokens, GLM-4.5 undercuts competitors like Anthropic’s Claude 3 Opus by up to 268× on output costs. Support for vLLM, SGLang, and mixed-precision inference ensures compatibility with existing frameworks, while GLM-4.5-Air’s 12B active design runs on consumer GPUs with 32–64GB VRAM, enabling on-premise and edge deployments.
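The quoted prices translate into per-request costs with simple arithmetic. The sketch below uses the two rates from the article; the Claude 3 Opus output price of $75 per million tokens is an assumption brought in to reproduce the comparison and does not appear in the article. The helper name and the example token counts are illustrative.

```python
INPUT_USD_PER_M = 0.11   # from the article
OUTPUT_USD_PER_M = 0.28  # from the article

# Assumption: Claude 3 Opus list price for output tokens, not from the article.
OPUS_OUTPUT_USD_PER_M = 75.0


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one request at the quoted GLM-4.5 API rates."""
    return (input_tokens * INPUT_USD_PER_M
            + output_tokens * OUTPUT_USD_PER_M) / 1_000_000


def output_price_ratio() -> float:
    """How many times cheaper GLM-4.5 output tokens are than the assumed Opus rate."""
    return OPUS_OUTPUT_USD_PER_M / OUTPUT_USD_PER_M


if __name__ == "__main__":
    # A long-context request: 128,000 prompt tokens and 2,000 generated tokens.
    print(f"request cost: ${request_cost_usd(128_000, 2_000):.5f}")
    print(f"output price ratio: {output_price_ratio():.1f}x")
```

With the assumed Opus price, the ratio rounds to 268, which matches the figure quoted above.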

Z.ai’s Global Impact and Responsible AI

Founded in 2019 as a Tsinghua University spin-off, Z.ai has amassed over 40 million downloads of its models, including GLM-130B and ChatGLM, since 2020. Backed by $1.5B from Alibaba, Tencent, and others, Z.ai is a key player in China’s AI ecosystem, which had released 1,509 LLMs as of July 2025, the most of any country. Recognized by OpenAI and Stanford’s 2025 AI Index Report, Z.ai was the first Chinese firm to sign the Frontier AI Safety Commitments, which emphasize ethical AI development. Despite U.S. Entity List sanctions, Z.ai’s focus on open-source models fosters global collaboration and reduces reliance on proprietary systems.

“This technology represents a major leap forward in our industry,” said Zhang Peng, CEO of Z.ai. “GLM-4.5 demonstrates that cutting-edge performance can be open, efficient, and affordable.” The launch, showcased at the World AI Conference 2025 in Shanghai, positions Z.ai to challenge Western AI giants, with plans for an IPO and GLM-5 in development.

 

About Z.ai

Founded in 2019, Z.ai (formerly Zhipu) is a leading Chinese AI company focused on developing next-generation cognitive models. Since launching the GLM pre-training framework in 2020, Z.ai has released several industry-leading models, including GLM-130B and ChatGLM, with over 40 million global downloads. Its product suite includes Z.ai, QingYan, CodeGeeX, CogVLM, and CogView, as well as an innovative Model-as-a-Service (MaaS) platform for developers and enterprises. Recognized with numerous industry awards, Z.ai is driving progress towards the era of artificial general intelligence.
