You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Enhanced support for Claude Sonnet 4.5, extending its maximum context window to 1 million tokens and enabling tiered pricing for more flexible usage models.
@@ -57,9 +57,9 @@ New models drop constantly, so this guide focuses on what's working well with HA
57
57
58
58
| If you want... | Use this |
59
59
|----------------|----------|
60
-
| Something that just works | Claude Sonnet 4 |
60
+
| Something that just works | Claude Sonnet 4.5|
61
61
| To save money | DeepSeek V3 or Qwen3 variants |
62
-
| Huge context windows | Gemini 2.5 Pro or Claude Sonnet 4 |
62
+
| Huge context windows | Gemini 2.5 Pro or Claude Sonnet 4.5|
63
63
| Open source | Qwen3 Coder, Z AI GLM 4.5, or Kimi K2 |
64
64
| Latest tech | GPT-5 |
65
65
| Speed | Qwen3 Coder on Cerebras (fastest available) |
@@ -71,6 +71,6 @@ HAI automatically handles context limits with [auto-compact](/features/auto-comp
71
71
72
72
## The Bottom Line
73
73
74
-
Start with **Claude Sonnet 4** if you want reliability. Experiment with **open source options** once you're comfortable to find the best fit for your workflow and budget.
74
+
Start with **Claude Sonnet 4.5** if you want reliability. Experiment with **open source options** once you're comfortable to find the best fit for your workflow and budget.
75
75
76
76
The landscape moves fast - these recommendations reflect what's working now, but keep an eye on new releases.
@@ -46,8 +43,8 @@ HAI users can leverage this by checking the `Enable Extended Thinking` box below
46
43
47
44
**Key Aspects of Extended Thinking:**
48
45
49
-
-**Supported Models:** This feature is available for select models, including variants of Claude Opus 4, Claude Sonnet 4, and Claude Sonnet 3.7. The specific models listed in the "Supported Models" section above with the `:thinking` suffix are pre-configured in HAI to utilize this.
50
-
-**Summarized Thinking (Claude 4):** For Claude 4 models, the API returns a summary of the full thinking process to balance insight with efficiency and prevent misuse. You are billed for the full thinking tokens, not just the summary.
46
+
-**Supported Models:** This feature is available for select models, including Claude Opus 4, Claude Sonnet 4.5, and Claude Sonnet 3.7.
47
+
-**Summarized Thinking (Claude 4):** For Claude 4 and 4.5 models, the API returns a summary of the full thinking process to balance insight with efficiency and prevent misuse. You are billed for the full thinking tokens, not just the summary.
51
48
-**Streaming:** Extended thinking responses, including the `thinking` blocks, can be streamed.
52
49
-**Tool Use & Prompt Caching:** Extended thinking interacts with tool use (requiring thinking blocks to be passed back) and prompt caching (with specific behaviors around cache invalidation and context).
0 commit comments