[WIP]ggml-hexagon: create generalized functions for cpu side op #17500
base: master
…ration for improved flexibility
…operation for improved flexibility
…t calls for clarity
            ggml_op_name(op->op));
    }
    return false;
};
Here we check if the buffer is supported by this backend.
See also:
https://github.com/ggml-org/llama.cpp/blob/master/ggml/src/ggml-cuda/ggml-cuda.cu#L3915
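A minimal sketch of the kind of check described here, written against stand-in types rather than the real ggml structures. `Buffer`, `Tensor`, `owner_id`, and `all_buffers_supported` are hypothetical names introduced for illustration only, and treating unallocated tensors as acceptable is an assumption of the sketch:

```cpp
#include <array>
#include <cstddef>

// Stand-in for a backend buffer. `owner_id` is a hypothetical tag that says
// which backend allocated the buffer.
struct Buffer {
    int owner_id;
};

constexpr std::size_t kMaxSrc = 4;  // illustrative only, not ggml's real limit

// Stand-in for a tensor node with an output buffer and optional sources.
struct Tensor {
    const Buffer * buffer = nullptr;            // null when not yet allocated
    std::array<const Tensor *, kMaxSrc> src{};  // unused slots stay null
};

// Returns true when the op's own buffer and the buffer of every non-null
// source are either unallocated or owned by `backend_id`.
inline bool all_buffers_supported(const Tensor & op, int backend_id) {
    auto ok = [backend_id](const Tensor * t) {
        return t == nullptr || t->buffer == nullptr ||
               t->buffer->owner_id == backend_id;
    };
    if (!ok(&op)) {
        return false;
    }
    for (const Tensor * s : op.src) {
        if (!ok(s)) {
            return false;
        }
    }
    return true;
}
```

A real backend would compare buffer types through ggml's backend interface instead of an integer tag.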
 }

-static void ggml_hexagon_add_id(const struct ggml_tensor * op, uint32_t flags) {
+template <bool _IsSrc0Constant> static void ggml_hexagon_binary_id(const struct ggml_tensor * op, uint32_t flags) {
We can probably refactor these op functions into a single ggml_hexagon_generic function that handles null sources. This would reduce code duplication significantly. WDYT? @max-krasnyansky
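One way the suggested merge could look, sketched with stand-in types. Everything here (`Tensor`, `BufferDesc`, `collect_buffers`) is hypothetical and not taken from the PR. The point is only that a single routine walks every source slot and skips the null ones, so ops with one, two, or three inputs share one code path:

```cpp
#include <array>
#include <cstddef>
#include <vector>

constexpr std::size_t kMaxSrc = 4;  // illustrative only, not ggml's real limit

// Stand-in for a tensor node: output storage plus optional sources.
struct Tensor {
    const void * data = nullptr;
    std::size_t  size = 0;
    std::array<const Tensor *, kMaxSrc> src{};  // unused slots stay null
};

// Hypothetical descriptor for one buffer handed to the DSP.
struct BufferDesc {
    const void * ptr;
    std::size_t  size;
    bool         is_output;
};

// Builds the buffer list for any op: one entry per non-null source,
// followed by a single entry for the destination.
inline std::vector<BufferDesc> collect_buffers(const Tensor & op) {
    std::vector<BufferDesc> bufs;
    for (const Tensor * s : op.src) {
        if (s == nullptr) {
            continue;  // null source: nothing to pass for this slot
        }
        bufs.push_back({s->data, s->size, false});
    }
    bufs.push_back({op.data, op.size, true});
    return bufs;
}
```

Skipping null slots compacts the list, so a design where the DSP side relies on fixed source positions would need to keep placeholder entries instead.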
…ent in hexagon implementation
…in hexagon operations
This PR refactors the Hexagon backend to generalize CPU-side operations and improve code maintainability. The primary goal is to reduce code duplication by merging similar operation handlers and standardizing buffer management.
Key Changes
Generalized Operation Handlers:
- Merged ggml_hexagon_mul_mat into ggml_hexagon_binary.
- Merged ggml_hexagon_mul_mat_id into ggml_hexagon_add_id (renamed to ggml_hexagon_binary_id).
- Added a template parameter <bool _IsSrc0Constant> to efficiently handle cases where the first operand (like weights) is constant, optimizing cache maintenance.
Refactored Buffer Management:
- Introduced enum dsp_buffer_type to explicitly define buffer roles:
  - DSP_BUFFER_TYPE_DSP_WRITE_CPU_READ
  - DSP_BUFFER_TYPE_CPU_WRITE_DSP_READ
  - DSP_BUFFER_TYPE_CONSTANT
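The buffer roles and the template flag could fit together roughly as follows. The enum name and its three enumerators come from the description above; their numeric values, the helpers `src0_buffer_type` and `needs_cpu_flush_each_call`, and the cache-maintenance rule they encode are assumptions made for illustration, not the PR's actual logic:

```cpp
// Buffer roles as listed in the PR description. The numeric values are
// whatever the compiler assigns here and may differ from the real header.
enum dsp_buffer_type {
    DSP_BUFFER_TYPE_DSP_WRITE_CPU_READ,  // DSP produces, CPU consumes (outputs)
    DSP_BUFFER_TYPE_CPU_WRITE_DSP_READ,  // CPU produces, DSP consumes (inputs)
    DSP_BUFFER_TYPE_CONSTANT,            // written once, e.g. weights
};

// Hypothetical helper: picks the role of src0 at compile time. The PR spells
// the parameter `_IsSrc0Constant`; the underscore is dropped in this sketch
// because names starting with an underscore and a capital letter are
// reserved in C++.
template <bool IsSrc0Constant>
constexpr dsp_buffer_type src0_buffer_type() {
    return IsSrc0Constant ? DSP_BUFFER_TYPE_CONSTANT
                          : DSP_BUFFER_TYPE_CPU_WRITE_DSP_READ;
}

// Hypothetical rule: only buffers the CPU rewrites between calls need their
// cache lines flushed before every DSP invocation.
constexpr bool needs_cpu_flush_each_call(dsp_buffer_type t) {
    return t == DSP_BUFFER_TYPE_CPU_WRITE_DSP_READ;
}
```

With a compile-time flag, the constant-weights path and the regular path can share one function body while the compiler removes the branch that does not apply.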