Commit 4c9d30b

feat: Enhance Claude Sonnet 4.5 support with 1M context window and tiered pricing (#209)
* feat: Enhance Claude Sonnet 4.5 support with 1M context window and tiered pricing
* doc: updated changeset

Co-authored-by: Magesh <mageshmscss@gmail.com>
1 parent f50841e commit 4c9d30b

15 files changed: +197 -87 lines changed

.changeset/kind-games-remain.md

Lines changed: 5 additions & 0 deletions
```diff
@@ -0,0 +1,5 @@
+---
+"hai-build-code-generator": patch
+---
+
+Enhanced support for Claude Sonnet 4.5, extending its maximum context window to 1 million tokens and enabling tiered pricing for more flexible usage models.
```
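The "tiered pricing" the changeset mentions means per-token rates that step up once a request's context exceeds a tier boundary. A hypothetical sketch of that idea, using the $3-6 input / $15-22.50 output range and the 200K boundary implied elsewhere in this commit; the actual `CLAUDE_SONNET_1M_TIERS` shape in `@shared/api` may differ:

```typescript
// Hypothetical tiered-pricing sketch: rates step up above a 200K-token tier.
// Prices mirror the $3-6 / $15-22.50 range quoted in the model-selection
// guide; the real CLAUDE_SONNET_1M_TIERS structure may differ.
interface PricingTier {
	contextWindow: number // requests at or below this size use these rates
	inputPrice: number // USD per million input tokens
	outputPrice: number // USD per million output tokens
}

const tiers: PricingTier[] = [
	{ contextWindow: 200_000, inputPrice: 3, outputPrice: 15 },
	{ contextWindow: 1_000_000, inputPrice: 6, outputPrice: 22.5 },
]

function ratesFor(promptTokens: number): PricingTier {
	// pick the first tier whose ceiling covers the prompt size
	return tiers.find((t) => promptTokens <= t.contextWindow) ?? tiers[tiers.length - 1]
}
```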

.github/ISSUE_TEMPLATE/bug_report.yml

Lines changed: 2 additions & 2 deletions
```diff
@@ -5,7 +5,7 @@ body:
   - type: markdown
     attributes:
       value: |
-        **Important:** All bug reports must be reproducible using Claude 4 Sonnet. HAI uses complex prompts so less capable models may not work as expected.
+        **Important:** All bug reports must be reproducible using Claude Sonnet 4.5. HAI uses complex prompts so less capable models may not work as expected.
   - type: textarea
     id: what-happened
     attributes:
@@ -36,7 +36,7 @@ body:
     attributes:
       label: Provider/Model
       description: What provider and model were you using when the issue occurred?
-      placeholder: "e.g., anthropic/claude-3.7-sonnet, gemini:gemini-2.5-pro-exp-03-25"
+      placeholder: "e.g., anthropic/claude-sonnet-4.5, gemini:gemini-2.5-pro-exp-03-25"
     validations:
       required: true
   - type: textarea
```

hai-docs/getting-started/model-selection-guide.mdx

Lines changed: 4 additions & 4 deletions
```diff
@@ -9,7 +9,7 @@ New models drop constantly, so this guide focuses on what's working well with HA
 
 | Model | Context Window | Input Price* | Output Price* | Best For |
 |-------|---------------|--------------|---------------|----------|
-| **Claude Sonnet 4** | 1M tokens | $3-6 | $15-22.50 | Reliable tool usage, complex codebases |
+| **Claude Sonnet 4.5** | 1M tokens | $3-6 | $15-22.50 | Reliable tool usage, complex codebases |
 | **Qwen3 Coder** | 256K tokens | $0.20 | $0.80 | Coding tasks, open source flexibility |
 | **Gemini 2.5 Pro** | 1M+ tokens | TBD | TBD | Large codebases, document analysis |
 | **GPT-5** | 400K tokens | $1.25 | $10 | Latest OpenAI tech, three modes |
@@ -57,9 +57,9 @@ New models drop constantly, so this guide focuses on what's working well with HA
 
 | If you want... | Use this |
 |----------------|----------|
-| Something that just works | Claude Sonnet 4 |
+| Something that just works | Claude Sonnet 4.5 |
 | To save money | DeepSeek V3 or Qwen3 variants |
-| Huge context windows | Gemini 2.5 Pro or Claude Sonnet 4 |
+| Huge context windows | Gemini 2.5 Pro or Claude Sonnet 4.5 |
 | Open source | Qwen3 Coder, Z AI GLM 4.5, or Kimi K2 |
 | Latest tech | GPT-5 |
 | Speed | Qwen3 Coder on Cerebras (fastest available) |
@@ -71,6 +71,6 @@ HAI automatically handles context limits with [auto-compact](/features/auto-comp
 
 ## The Bottom Line
 
-Start with **Claude Sonnet 4** if you want reliability. Experiment with **open source options** once you're comfortable to find the best fit for your workflow and budget.
+Start with **Claude Sonnet 4.5** if you want reliability. Experiment with **open source options** once you're comfortable to find the best fit for your workflow and budget.
 
 The landscape moves fast - these recommendations reflect what's working now, but keep an eye on new releases.
```

hai-docs/getting-started/understanding-context-management.mdx

Lines changed: 2 additions & 2 deletions
```diff
@@ -53,7 +53,7 @@ Think of context like a whiteboard you and HAI share:
 - **Context Window** is the size of the whiteboard itself:
   - Measured in tokens (1 token ≈ 3/4 of an English word)
   - Each model has a fixed size:
-    - Claude Sonnet 4: 1,000,000 tokens
+    - Claude Sonnet 4.5: 1,000,000 tokens
     - Qwen3 Coder: 256,000 tokens
     - Gemini 2.5 Pro: 1,000,000+ tokens
     - GPT-5: 400,000 tokens
@@ -77,7 +77,7 @@ HAI provides a visual way to monitor your context window usage through a progres
 - ↑ shows input tokens (what you've sent to the LLM)
 - ↓ shows output tokens (what the LLM has generated)
 - The progress bar visualizes how much of your context window you've used
-- The total shows your model's maximum capacity (e.g., 1M for Claude Sonnet 4)
+- The total shows your model's maximum capacity (e.g., 1M for Claude Sonnet 4.5)
 
 ### When to Watch the Bar
 
```

hai-docs/provider-config/anthropic.mdx

Lines changed: 3 additions & 6 deletions
```diff
@@ -17,11 +17,8 @@ description: "Learn how to configure and use Anthropic Claude models with HAI. C
 HAI supports the following Anthropic Claude models:
 
 - `claude-opus-4-20250514`
-- `claude-opus-4-20250514:thinking` (Extended Thinking variant)
-- `claude-sonnet-4-20250514` (Recommended)
-- `claude-sonnet-4-20250514:thinking` (Extended Thinking variant)
+- `anthropic/claude-sonnet-4.5` (Recommended)
 - `claude-3-7-sonnet-20250219`
-- `claude-3-7-sonnet-20250219:thinking` (Extended Thinking variant)
 - `claude-3-5-sonnet-20241022`
 - `claude-3-5-haiku-20241022`
 - `claude-3-opus-20240229`
@@ -46,8 +43,8 @@ HAI users can leverage this by checking the `Enable Extended Thinking` box below
 
 **Key Aspects of Extended Thinking:**
 
-- **Supported Models:** This feature is available for select models, including variants of Claude Opus 4, Claude Sonnet 4, and Claude Sonnet 3.7. The specific models listed in the "Supported Models" section above with the `:thinking` suffix are pre-configured in HAI to utilize this.
-- **Summarized Thinking (Claude 4):** For Claude 4 models, the API returns a summary of the full thinking process to balance insight with efficiency and prevent misuse. You are billed for the full thinking tokens, not just the summary.
+- **Supported Models:** This feature is available for select models, including Claude Opus 4, Claude Sonnet 4.5, and Claude Sonnet 3.7.
+- **Summarized Thinking (Claude 4):** For Claude 4 and 4.5 models, the API returns a summary of the full thinking process to balance insight with efficiency and prevent misuse. You are billed for the full thinking tokens, not just the summary.
 - **Streaming:** Extended thinking responses, including the `thinking` blocks, can be streamed.
 - **Tool Use & Prompt Caching:** Extended thinking interacts with tool use (requiring thinking blocks to be passed back) and prompt caching (with specific behaviors around cache invalidation and context).
```

src/core/api/providers/anthropic.ts

Lines changed: 8 additions & 6 deletions
```diff
@@ -1,6 +1,6 @@
 import { Anthropic } from "@anthropic-ai/sdk"
 import { Stream as AnthropicStream } from "@anthropic-ai/sdk/streaming"
-import { AnthropicModelId, anthropicDefaultModelId, anthropicModels, CLAUDE_SONNET_4_1M_SUFFIX, ModelInfo } from "@shared/api"
+import { AnthropicModelId, anthropicDefaultModelId, anthropicModels, CLAUDE_SONNET_1M_SUFFIX, ModelInfo } from "@shared/api"
 import { ApiHandler, CommonApiHandlerOptions } from "../index"
 import { withRetry } from "../retry"
 import { ApiStream } from "../transform/stream"
@@ -45,16 +45,18 @@ export class AnthropicHandler implements ApiHandler {
 		const model = this.getModel()
 		let stream: AnthropicStream<Anthropic.RawMessageStreamEvent>
 
-		const modelId = model.id.endsWith(CLAUDE_SONNET_4_1M_SUFFIX)
-			? model.id.slice(0, -CLAUDE_SONNET_4_1M_SUFFIX.length)
-			: model.id
-		const enable1mContextWindow = model.id.endsWith(CLAUDE_SONNET_4_1M_SUFFIX)
+		const modelId = model.id.endsWith(CLAUDE_SONNET_1M_SUFFIX) ? model.id.slice(0, -CLAUDE_SONNET_1M_SUFFIX.length) : model.id
+		const enable1mContextWindow = model.id.endsWith(CLAUDE_SONNET_1M_SUFFIX)
 
 		const budget_tokens = this.options.thinkingBudgetTokens || 0
-		const reasoningOn = !!((modelId.includes("3-7") || modelId.includes("4-")) && budget_tokens !== 0)
+		const reasoningOn = !!(
+			(modelId.includes("3-7") || modelId.includes("4-") || modelId.includes("4-5")) &&
+			budget_tokens !== 0
+		)
 
 		switch (modelId) {
 			// 'latest' alias does not support cache_control
+			case "claude-sonnet-4-5-20250929":
 			case "claude-sonnet-4-20250514":
 			case "claude-3-7-sonnet-20250219":
 			case "claude-3-5-sonnet-20241022":
```
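The renamed `CLAUDE_SONNET_1M_SUFFIX` drives a strip-and-flag pattern that recurs in this commit: the custom suffix is removed so the provider API receives a model id it recognizes, while its presence toggles the 1M context window. A self-contained sketch (the `:1m` suffix value is an assumption taken from the "remove the custom :1m suffix" comment in openrouter-stream.ts):

```typescript
// Sketch of the shared suffix-stripping pattern. The ":1m" value is assumed
// from the openrouter-stream.ts comment; the real constant lives in @shared/api.
const CLAUDE_SONNET_1M_SUFFIX = ":1m"

function resolveModelId(modelId: string): { baseId: string; enable1mContextWindow: boolean } {
	const has1m = modelId.endsWith(CLAUDE_SONNET_1M_SUFFIX)
	return {
		// strip the suffix so the provider API receives the id it expects
		baseId: has1m ? modelId.slice(0, -CLAUDE_SONNET_1M_SUFFIX.length) : modelId,
		enable1mContextWindow: has1m,
	}
}
```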

src/core/api/providers/bedrock.ts

Lines changed: 8 additions & 5 deletions
```diff
@@ -9,7 +9,7 @@ import {
 	InvokeModelWithResponseStreamCommand,
 } from "@aws-sdk/client-bedrock-runtime"
 import { fromNodeProviderChain } from "@aws-sdk/credential-providers"
-import { BedrockModelId, bedrockDefaultModelId, bedrockModels, CLAUDE_SONNET_4_1M_SUFFIX, ModelInfo } from "@shared/api"
+import { BedrockModelId, bedrockDefaultModelId, bedrockModels, CLAUDE_SONNET_1M_SUFFIX, ModelInfo } from "@shared/api"
 import { calculateApiCostOpenAI } from "@utils/cost"
 import { ApiHandler, CommonApiHandlerOptions } from "../"
 import { withRetry } from "../retry"
@@ -119,11 +119,11 @@ export class AwsBedrockHandler implements ApiHandler {
 		// cross region inference requires prefixing the model id with the region
 		const rawModelId = await this.getModelId()
 
-		const modelId = rawModelId.endsWith(CLAUDE_SONNET_4_1M_SUFFIX)
-			? rawModelId.slice(0, -CLAUDE_SONNET_4_1M_SUFFIX.length)
+		const modelId = rawModelId.endsWith(CLAUDE_SONNET_1M_SUFFIX)
+			? rawModelId.slice(0, -CLAUDE_SONNET_1M_SUFFIX.length)
 			: rawModelId
 
-		const enable1mContextWindow = rawModelId.endsWith(CLAUDE_SONNET_4_1M_SUFFIX)
+		const enable1mContextWindow = rawModelId.endsWith(CLAUDE_SONNET_1M_SUFFIX)
 
 		const model = this.getModel()
 
@@ -741,7 +741,10 @@ export class AwsBedrockHandler implements ApiHandler {
 	 */
 	private shouldEnableReasoning(baseModelId: string, budgetTokens: number): boolean {
 		return (
-			(baseModelId.includes("3-7") || baseModelId.includes("sonnet-4") || baseModelId.includes("opus-4")) &&
+			(baseModelId.includes("3-7") ||
+				baseModelId.includes("sonnet-4") ||
+				baseModelId.includes("opus-4") ||
+				baseModelId.includes("sonnet-4-5")) &&
 			budgetTokens !== 0
 		)
 	}
```
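One observation on the predicate above: the new `sonnet-4-5` clause is subsumed by the existing `sonnet-4` substring check, since any id containing `sonnet-4-5` necessarily contains `sonnet-4`. The addition is defensive rather than behavior-changing, which a quick check confirms (the Bedrock-style model id below is hypothetical, for illustration only):

```typescript
// The pre-existing "sonnet-4" substring check already matches "sonnet-4-5"
// ids, so the new clause does not change the predicate's result.
function matchesReasoningFamily(baseModelId: string): boolean {
	return baseModelId.includes("3-7") || baseModelId.includes("sonnet-4") || baseModelId.includes("opus-4")
}

// Hypothetical Bedrock-style id for a Sonnet 4.5 model (illustration only).
const hypotheticalId = "anthropic.claude-sonnet-4-5-20250929-v1:0"
```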

src/core/api/transform/openrouter-stream.ts

Lines changed: 16 additions & 5 deletions
```diff
@@ -1,5 +1,10 @@
 import { Anthropic } from "@anthropic-ai/sdk"
-import { CLAUDE_SONNET_4_1M_SUFFIX, ModelInfo, openRouterClaudeSonnet41mModelId } from "@shared/api"
+import {
+	CLAUDE_SONNET_1M_SUFFIX,
+	ModelInfo,
+	openRouterClaudeSonnet41mModelId,
+	openRouterClaudeSonnet451mModelId,
+} from "@shared/api"
 import OpenAI from "openai"
 import { isGPT5ModelFamily } from "../../prompts/system-prompt/utils"
 import { convertToOpenAiMessages } from "./openai-format"
@@ -20,16 +25,18 @@ export async function createOpenRouterStream(
 		...convertToOpenAiMessages(messages),
 	]
 
-	const isClaudeSonnet41m = model.id === openRouterClaudeSonnet41mModelId
-	if (isClaudeSonnet41m) {
+	const isClaudeSonnet1m = model.id === openRouterClaudeSonnet41mModelId || model.id === openRouterClaudeSonnet451mModelId
+	if (isClaudeSonnet1m) {
 		// remove the custom :1m suffix, to create the model id openrouter API expects
-		model.id = model.id.slice(0, -CLAUDE_SONNET_4_1M_SUFFIX.length)
+		model.id = model.id.slice(0, -CLAUDE_SONNET_1M_SUFFIX.length)
 	}
 
 	// prompt caching: https://openrouter.ai/docs/prompt-caching
 	// this was initially specifically for claude models (some models may 'support prompt caching' automatically without this)
 	// handles direct model.id match logic
 	switch (model.id) {
+		case "anthropic/claude-sonnet-4.5":
+		case "anthropic/claude-4.5-sonnet":
 		case "anthropic/claude-sonnet-4":
 		case "anthropic/claude-opus-4.1":
 		case "anthropic/claude-opus-4":
@@ -89,6 +96,8 @@ export async function createOpenRouterStream(
 	// (models usually default to max tokens allowed)
 	let maxTokens: number | undefined
 	switch (model.id) {
+		case "anthropic/claude-sonnet-4.5":
+		case "anthropic/claude-4.5-sonnet":
 		case "anthropic/claude-sonnet-4":
 		case "anthropic/claude-opus-4.1":
 		case "anthropic/claude-opus-4":
@@ -125,6 +134,8 @@ export async function createOpenRouterStream(
 
 	let reasoning: { max_tokens: number } | undefined
 	switch (model.id) {
+		case "anthropic/claude-sonnet-4.5":
+		case "anthropic/claude-4.5-sonnet":
 		case "anthropic/claude-sonnet-4":
 		case "anthropic/claude-opus-4.1":
 		case "anthropic/claude-opus-4":
@@ -186,7 +197,7 @@ export async function createOpenRouterStream(
 			? { provider: { order: ["groq", "together", "baseten", "parasail", "novita", "deepinfra"], allow_fallbacks: false } }
 			: {}),
 		// limit providers to only those that support the 1m context window
-		...(isClaudeSonnet41m ? { provider: { order: ["anthropic", "amazon-bedrock"], allow_fallbacks: false } } : {}),
+		...(isClaudeSonnet1m ? { provider: { order: ["anthropic", "google-vertex/global"], allow_fallbacks: false } } : {}),
 	})
 
 	return stream
```
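The final hunk swaps the pinned provider list (Bedrock out, Vertex global in) using a conditional object spread: when the flag is false, spreading `{}` adds no keys at all. A minimal sketch of that pattern in isolation (the request shape here is simplified, not the actual OpenRouter payload):

```typescript
// Sketch: pin the request to 1M-capable providers via a conditional spread.
// When isClaudeSonnet1m is false, ...{} contributes nothing to the object.
function buildProviderRouting(isClaudeSonnet1m: boolean): Record<string, unknown> {
	return {
		stream: true,
		...(isClaudeSonnet1m ? { provider: { order: ["anthropic", "google-vertex/global"], allow_fallbacks: false } } : {}),
	}
}
```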

src/core/controller/models/refreshOpenRouterModels.ts

Lines changed: 16 additions & 6 deletions
```diff
@@ -6,7 +6,12 @@ import axios from "axios"
 import cloneDeep from "clone-deep"
 import fs from "fs/promises"
 import path from "path"
-import { CLAUDE_SONNET_4_1M_TIERS, clineMicrowaveAlphaModelInfo, openRouterClaudeSonnet41mModelId } from "@/shared/api"
+import {
+	CLAUDE_SONNET_1M_TIERS,
+	clineMicrowaveAlphaModelInfo,
+	openRouterClaudeSonnet41mModelId,
+	openRouterClaudeSonnet451mModelId,
+} from "@/shared/api"
 import { Controller } from ".."
 
 type OpenRouterSupportedParams =
@@ -109,6 +114,8 @@ export async function refreshOpenRouterModels(
 	})
 
 	switch (rawModel.id) {
+		case "anthropic/claude-sonnet-4.5":
+		case "anthropic/claude-4.5-sonnet":
 		case "anthropic/claude-sonnet-4":
 		case "anthropic/claude-3-7-sonnet":
 		case "anthropic/claude-3-7-sonnet:beta":
@@ -204,11 +211,14 @@ export async function refreshOpenRouterModels(
 	models[rawModel.id] = modelInfo
 
 	// add custom :1m model variant
-	if (rawModel.id === "anthropic/claude-sonnet-4") {
-		const claudeSonnet41mModelInfo = cloneDeep(modelInfo)
-		claudeSonnet41mModelInfo.contextWindow = 1_000_000 // limiting providers to those that support 1m context window
-		claudeSonnet41mModelInfo.tiers = CLAUDE_SONNET_4_1M_TIERS
-		models[openRouterClaudeSonnet41mModelId] = claudeSonnet41mModelInfo
+	if (rawModel.id === "anthropic/claude-sonnet-4" || rawModel.id === "anthropic/claude-sonnet-4.5") {
+		const claudeSonnet1mModelInfo = cloneDeep(modelInfo)
+		claudeSonnet1mModelInfo.contextWindow = 1_000_000 // limiting providers to those that support 1m context window
+		claudeSonnet1mModelInfo.tiers = CLAUDE_SONNET_1M_TIERS
+		// sonnet 4
+		models[openRouterClaudeSonnet41mModelId] = claudeSonnet1mModelInfo
+		// sonnet 4.5
+		models[openRouterClaudeSonnet451mModelId] = claudeSonnet1mModelInfo
 	}
 }
```
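The registration step above deep-clones the base model info before overriding `contextWindow` and `tiers`, so mutations on the 1M variant cannot leak back into the base entry. A minimal sketch of the same pattern using Node's built-in `structuredClone` in place of the `clone-deep` package, with `ModelInfo` reduced to the fields the diff touches:

```typescript
// Reduced ModelInfo: only the fields this commit's diff touches.
interface ModelInfoSketch {
	contextWindow: number
	tiers?: Array<{ contextWindow: number; inputPrice: number; outputPrice: number }>
}

// Register a ":1m" variant without mutating the base entry.
function add1mVariant(models: Record<string, ModelInfoSketch>, baseId: string, variantId: string): void {
	const variant = structuredClone(models[baseId]) // independent deep copy, like cloneDeep
	variant.contextWindow = 1_000_000 // providers limited to those supporting 1M context
	models[variantId] = variant
}
```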

src/extension.ts

Lines changed: 1 addition & 1 deletion
```diff
@@ -426,7 +426,7 @@ export async function activate(context: vscode.ExtensionContext) {
 
 	// Register the command handlers
 	context.subscriptions.push(
-		vscode.commands.registerCommand("cline.addToChat", async (range?: vscode.Range, diagnostics?: vscode.Diagnostic[]) => {
+		vscode.commands.registerCommand("hai.addToChat", async (range?: vscode.Range, diagnostics?: vscode.Diagnostic[]) => {
 			const context = await getContextForCommand(range, diagnostics)
 			if (!context) {
 				return
```
