Commit 2dea7c8
Use gpt-4o's tokenizer (#258)
feat: switch to o200k_base, require tiktoken ≥ 0.7.0, drop Python 3.7
Context
-------
Token counting now uses **o200k_base** (native to GPT-4o / 4o-mini).
That encoding ships only with **tiktoken ≥ 0.7.0**, whose wheels need Python 3.8+.
CI already tests Python 3.8 through 3.13, so this commit aligns the documented minimum with what is actually tested.
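
For illustration, a minimal sketch of counting tokens with the new encoding. It is not the code in `output_formatters.py`: `ENCODING_NAME` and `count_tokens` are hypothetical names, and only `tiktoken.get_encoding` and `Encoding.encode` are real tiktoken calls. The first call may need network access, since tiktoken downloads and caches the encoding data.

```python
ENCODING_NAME = "o200k_base"  # was "cl100k_base" before this commit


def count_tokens(text, encoding_name=ENCODING_NAME):
    """Return the number of tokens in `text` under the given encoding.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so this module can be imported without tiktoken;
    # the "o200k_base" name requires tiktoken >= 0.7.0.
    import tiktoken

    encoding = tiktoken.get_encoding(encoding_name)
    # disallowed_special=() treats special-token strings such as
    # "<|endoftext|>" as ordinary text instead of raising an error.
    return len(encoding.encode(text, disallowed_special=()))


if __name__ == "__main__":
    print(count_tokens("Token counting now uses o200k_base."))
```

Counts will generally differ from those produced with `cl100k_base`, so any numbers cached or compared across versions should be recomputed.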
Changes
-------
* src/gitingest/output_formatters.py – `cl100k_base` → `o200k_base`
* README.md – “Python 3.7+” → “Python 3.8+”
* pyproject.toml
  * `tiktoken` → `tiktoken>=0.7.0` (o200k support)
  * remove classifier *Programming Language :: Python :: 3.7*
* requirements.txt – same `tiktoken` bump
Impact
------
* **Breaking** for users pinned to Python 3.7 → upgrade to 3.8+.
* Environments on `tiktoken==0.6.*` must run `pip install -U "tiktoken>=0.7.0"` (quoted so the shell does not treat `>` as a redirect).
* No other runtime deps added.
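
A rough sketch of how an environment could be checked against the new minimums before upgrading. All names here (`MIN_PYTHON`, `MIN_TIKTOKEN`, `parse_version`, `meets_minimum`, `check_environment`) are hypothetical and not part of this commit; the version parsing is deliberately simple and ignores pre-release suffixes.

```python
import re
import sys

MIN_PYTHON = (3, 8)
MIN_TIKTOKEN = (0, 7, 0)


def parse_version(version):
    """Return the leading numeric components of a version string as ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(version, minimum):
    """True if `version` (a string) is at least `minimum` (a tuple of ints)."""
    parsed = parse_version(version)
    # Pad with zeros so that "0.7" compares equal to (0, 7, 0).
    parsed += (0,) * (len(minimum) - len(parsed))
    return parsed >= minimum


def check_environment():
    """Return a list of human-readable problems; empty means compatible."""
    if sys.version_info[:2] < MIN_PYTHON:
        # importlib.metadata does not exist before 3.8, so stop here.
        return ["Python 3.8+ is required"]

    from importlib import metadata

    try:
        installed = metadata.version("tiktoken")
    except metadata.PackageNotFoundError:
        return ["tiktoken is not installed"]
    if not meets_minimum(installed, MIN_TIKTOKEN):
        return [f"tiktoken {installed} is too old; 0.7.0 or newer is required"]
    return []


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Comparing integer tuples rather than raw strings matters here: as strings, `"0.10.0"` would sort before `"0.7.0"`.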
Co-authored-by: Filip Christiansen <22807962+filipchristiansen@users.noreply.github.com>
1 parent 1dd133c commit 2dea7c8
4 files changed: +4 -5 lines changed