forked from ggml-org/llama.cpp
-
Notifications
You must be signed in to change notification settings - Fork 5
(staging PR) server: add model management and proxy #39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
ngxson
wants to merge
171
commits into
master
Choose a base branch
from
xsn/server_model_management_v1_2
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+11,511
−4,224
Open
Changes from 28 commits
Commits
Show all changes
171 commits
Select commit
Hold shift + click to select a range
fc5901a
server: add model management and proxy
ngxson 399f536
fix compile error
ngxson abc0ca4
does this fix windows?
ngxson 54b3545
fix windows build
ngxson 5423d42
use subprocess.h, better logging
ngxson 0ef3b61
add test
ngxson 7c6eb17
fix windows
ngxson 919d3f8
Merge branch 'master' into xsn/server_model_management_v1_2
ngxson 55d33a8
feat: Model/Router server architecture WIP
allozaur b9ebdf6
more stable
ngxson 6610724
fix unsafe pointer
ngxson d0ea9e0
also allow terminate loading model
ngxson 5805ca7
add is_active()
ngxson 8a88576
refactor: Architecture improvements
allozaur c35dee3
Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2…
allozaur 2161408
tmp apply upstream fix
ngxson 5369aaa
address most problems
ngxson 6929c9f
address thread safety issue
ngxson be25bcc
address review comment
ngxson cd5c699
add docs (first version)
ngxson a2e912c
address review comment
ngxson 4bf82a1
feat: Improved UX for model information, modality interactions etc
allozaur cc88f6a
chore: update webui build output
allozaur 45bf2a4
Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2…
allozaur 049f40d
refactor: Use only the message data `model` property for displaying m…
allozaur c26c340
chore: update webui build output
allozaur 032b9ff
add --models-dir param
ngxson 8b1d967
feat: New Model Selection UX WIP
allozaur 6b7c0a5
chore: update webui build output
allozaur 69503aa
feat: Add auto-mic setting
allozaur 92585c7
feat: Attachments UX improvements
allozaur 62ee883
implement LRU
ngxson 7cd9290
remove default model path
ngxson 7241558
better --models-dir
ngxson b0540e8
add env for args
ngxson 525e274
address review comments
ngxson 457fbda
fix compile
ngxson c274f13
refactor: Chat Form Submit component
allozaur f2ca54b
Merge branch 'master' into xsn/server_model_management_v1_2
ngxson d32bbfe
ad endpoint docs
ngxson 4af1b6c
Merge remote-tracking branch 'webui/allozaur/server_model_management_…
ngxson 076eec6
feat: Add copy to clipboard to model name in model info dialog
allozaur db8ed5d
feat: Model unavailable UI state for model selector
allozaur dc913ec
feat: Chat Form Actions UI logic improvements
allozaur a39ef24
feat: Auto-select model from last assistant response
allozaur 036cc93
chore: update webui build output
allozaur 6282537
Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2…
allozaur f25bfab
expose args and exit_code in API
ngxson 7ef6312
add note
ngxson f927e21
support extra_args on loading model
ngxson 74685f4
allow reusing args if auto_load
ngxson f95f9c5
typo docs
ngxson 2e355c7
oai-compat /models endpoint
ngxson 5ad594e
cleaner
ngxson d65be91
address review comments
ngxson 1f0cb3a
feat: Use `model` property for displaying the `repo/model-name` namin…
allozaur b7ba13b
refactor: Attachments data
allozaur 48dbef1
chore: update webui build output
allozaur 1c214e9
refactor: Enum imports
allozaur ef5f9d0
feat: Improve Model Selector responsiveness
allozaur 49c8062
chore: update webui build output
allozaur d5a6671
refactor: Cleanup
allozaur f8ff39c
refactor: Cleanup
allozaur 41764b8
refactor: Formatters
allozaur 219fd19
chore: update webui build output
allozaur e92ce07
refactor: Copy To Clipboard Icon component
allozaur fb5445e
chore: update webui build output
allozaur 39fb1c2
refactor: Cleanup
allozaur 188d323
chore: update webui build output
allozaur 16747de
refactor: UI badges
allozaur e808f2b
chore: update webui build output
allozaur 76557cd
Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2…
allozaur 13fe860
refactor: Cleanup
allozaur b2590a7
refactor: Cleanup
allozaur 5ef3f99
chore: update webui build output
allozaur 6ed192b
add --models-allow-extra-args for security
ngxson 2c6b58f
nits
ngxson 539cbf0
add stdin_file
ngxson 399b39f
Merge branch 'master' into xsn/server_model_management_v1_2
ngxson e514b86
fix merge
ngxson 11c26ec
Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2…
allozaur 7db3d87
fix: Retrieve lost setting after resolving merge conflict
allozaur ccd6c27
refactor: DatabaseStore -> DatabaseService
allozaur fed6c82
refactor: Database, Conversations & Chat services + stores architectu…
allozaur f9c911d
refactor: Remove redundant settings
allozaur 501badc
refactor: Multi-model business logic WIP
allozaur 4c24ead
chore: update webui build output
allozaur b9a3129
feat: Switching models logic for ChatForm or when regenerating messge…
allozaur 0132449
chore: update webui build output
allozaur 82975a1
fix: Add `untrack` inside chat processing info data logic to prevent …
allozaur 33356f3
fix: Regenerate
allozaur c680083
feat: Remove redundant settigns + rearrange
allozaur 5207527
fix: Audio attachments
allozaur 22507fe
refactor: Icons
allozaur 81b8e1a
chore: update webui build output
allozaur 2a280b6
feat: Model management and selection features WIP
allozaur 19e5385
chore: update webui build output
allozaur b1cf8bb
refactor: Improve server properties management
allozaur 23a91cd
refactor: Icons
allozaur d0d7a88
chore: update webui build output
allozaur 284557c
feat: Improve model loading/unloading status updates
allozaur 9431f35
chore: update webui build output
allozaur ddf98bd
refactor: Improve API header management via utility functions
allozaur e40f35f
remove support for extra args
ngxson e2731c3
set hf_repo/docker_repo as model alias when posible
ngxson becc602
Merge branch 'master' into xsn/server_model_management_v1_2
ngxson 42483f4
refactor: Remove ConversationsService
allozaur 456828b
refactor: Chat requests abort handling
allozaur d6ee3d1
refactor: Server store
allozaur 1493ee0
tmp webui build
ngxson 13e7988
refactor: Model modality handling
allozaur 2a5922b
chore: update webui build output
allozaur 6b95118
refactor: Processing state reactivity
allozaur 69065dd
fix: UI
allozaur 6a3d6e7
refactor: Services/Stores syntax + logic improvements
allozaur 78ead49
Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2…
allozaur d733537
refactor: Architecture cleanup
allozaur 9086bc3
feat: Improve statistic badges
allozaur db47952
feat: Condition available models based on modality + better model loa…
allozaur bc57726
docs: Architecture documentation
allozaur bdaf44a
Merge branch 'master' into xsn/server_model_management_v1_2
ngxson 491fe2d
feat: Update logic for PDF as Image
allozaur 7be833d
add TODO for http client
ngxson eed1bd9
refactor: Enhance model info and attachment handling
allozaur 3470b12
chore: update webui build output
allozaur 5fadd0f
refactor: Components naming
allozaur 04ef4a0
chore: update webui build output
allozaur 1cf5daa
refactor: Cleanup
allozaur 68b653e
refactor: DRY `getAttachmentDisplayItems` function + fix UI
allozaur 171a092
chore: update webui build output
allozaur dd30810
fix: Modality detection improvement for text-based PDF attachments
allozaur 1adf173
refactor: Cleanup
allozaur 2f97dbf
docs: Add info comment
allozaur c76de5e
refactor: Cleanup
allozaur 4d16459
re
allozaur f50ce7b
refactor: Cleanup
allozaur d49d97c
refactor: Cleanup
allozaur 648d2de
feat: Attachment logic & UI improvements
allozaur 27b1522
refactor: Constants
allozaur 2464e06
feat: Improve UI sidebar background color
allozaur ce9c9af
chore: update webui build output
allozaur 493ef08
refactor: Utils imports + move types to `app.d.ts`
allozaur 2d556bb
test: Fix Storybook mocks
allozaur a568e74
chore: update webui build output
allozaur 33b9cc4
Merge branch 'master' into allozaur/server_model_management_v1_2
allozaur 4f39da8
test: Update Chat Form UI tests
allozaur 949b5fd
refactor: Tooltip Provider from core layout
allozaur ae8a1e8
refactor: Tests to separate location
allozaur 6fd720e
Merge remote-tracking branch 'origin/allozaur/server_model_management…
allozaur c1dfccd
Merge branch 'master' into xsn/server_model_management_v1_2
ngxson a82dbbf
decouple server_models from server_routes
ngxson 360a5ed
test: Move demo test to tests/server
allozaur acd3c58
refactor: Remove redundant method
allozaur e8b9d74
chore: update webui build output
allozaur 23cb411
also route anthropic endpoints
ngxson 802e77e
Merge remote-tracking branch 'webui/allozaur/server_model_management_…
ngxson 7b28b5e
fix duplicated arg
ngxson 4a1c05c
fix invalid ptr to shutdown_handler
ngxson d182544
server : minor
ggerganov f2dbe9c
rm unused fn
ngxson c330407
add ?autoload=true|false query param
ngxson 05cc22f
Merge branch 'master' into xsn/server_model_management_v1_2
ngxson 689ca09
refactor: Remove redundant code
allozaur 7a95348
Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2…
allozaur 73056fb
docs: Update README documentations + architecture & data flow diagrams
allozaur c49467a
fix: Disable autoload on calling server props for the model
allozaur 9d3b718
chore: update webui build output
allozaur a6d3f83
fix ubuntu build
ngxson b926cfa
fix: Model status reactivity
allozaur 01ed8ce
fix: Modality detection for MODEL mode
allozaur b10d950
chore: update webui build output
allozaur File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Binary file not shown.
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ngxson maybe it's worth adding info about auto-loading that happens when we simply do a request to
/chat/completionswithmodelparam specified in the payload and how it's behaving if not handled manually with/load//unloadendpoints?