Released Apr 30, 2025Knowledge cutoff Mar 31, 2025128,000 context
Qwen3-4B is a 4 billion parameter dense language model from the Qwen3 series, designed to support both general-purpose and reasoning-intensive tasks. It introduces a dual-mode architecture—thinking and non-thinking—allowing dynamic switching between high-precision logical reasoning and efficient dialogue generation. This makes it well-suited for multi-turn chat, instruction following, and complex agent workflows.
Qwen3 4B - API Pricing & Providers | OpenRouter
Recent activity on Qwen3 4B
Total usage per day on OpenRouter
Reasoning
81M
Prompt
33.7M
Completion
4.74M
Prompt tokens measure input size. Reasoning tokens show internal thinking before a response. Completion tokens reflect total output length.