feat(Story-3): Add IR Operations for Sigmoid Family (Group 2) #499
Conversation
Added IR operation classes for 10 Sigmoid-family activations:
- SwishOp, SiLUOp (uses IEngine.Swish)
- MishOp (uses IEngine.Mish)
- HardSigmoidOp, HardTanhOp, ScaledTanhOp
- SoftplusOp, SoftSignOp, BentIdentityOp, IdentityOp

All operations include:
- Proper null checks for parameters
- IEngine integration where available
- Forward and backward pass implementations
- Comprehensive XML documentation
- Numerical stability considerations (softplus)

Enables JIT compilation for layers using these activation functions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
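For reference, the scalar math behind the Swish/SiLU pair (forward plus the analytical backward) can be sketched as below. This is a standalone illustration only, not the PR's code, which delegates to IEngine.Swish over tensors:

```csharp
using System;

// Standalone illustration of the math behind SwishOp/SiLUOp; this is not the
// PR's code, which delegates to IEngine.Swish over tensors.
static class SwishMath
{
    static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

    // Forward: swish(x) = x * sigmoid(x)
    public static double Forward(double x) => x * Sigmoid(x);

    // Backward: d/dx [x * sigmoid(x)] = sigmoid(x) + x * sigmoid(x) * (1 - sigmoid(x)),
    // combined with the upstream gradient by the chain rule.
    public static double Backward(double x, double gradOutput)
    {
        double s = Sigmoid(x);
        return gradOutput * (s + x * s * (1.0 - s));
    }
}
```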
Summary by CodeRabbit
Walkthrough: Adds a new file, src/JIT/ActivationOps.cs, containing an IROp interface, a NumOps<T> numeric helper, and Forward/Backward operation classes for the Sigmoid-family activations.
Sequence Diagram(s)
sequenceDiagram
participant Client
participant ActivationOp as Activation Op
participant IEngine
participant Tensor
Client->>ActivationOp: Forward(input)
ActivationOp->>ActivationOp: Validate input type
ActivationOp->>IEngine: Execute operation (e.g., Swish, Sigmoid)
IEngine->>Tensor: Perform computation
Tensor-->>IEngine: Result tensor
IEngine-->>ActivationOp: Output tensor
ActivationOp-->>Client: Return output
Client->>ActivationOp: Backward(input, gradOutput)
ActivationOp->>ActivationOp: Validate input & gradient types
ActivationOp->>IEngine: Compute derivative
IEngine->>Tensor: Derivative computation
Tensor-->>IEngine: Gradient tensor
IEngine-->>ActivationOp: Input gradient
ActivationOp-->>Client: Return gradient
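The call pattern in the diagram can be sketched as a minimal C# driver. IUnaryOp here is a hypothetical stand-in for the PR's IROp, whose exact member signatures are not shown in this thread:

```csharp
// Hypothetical stand-in for the PR's IROp interface; the real member
// signatures in src/JIT/ActivationOps.cs are not shown in this thread.
public interface IUnaryOp
{
    double[] Forward(double[] input);
    double[] Backward(double[] input, double[] gradOutput);
}

public static class OpDriver
{
    // Mirrors the diagram: forward pass first, then backward with the saved
    // input and the upstream gradient.
    public static (double[] Output, double[] InputGrad) Run(
        IUnaryOp op, double[] input, double[] upstreamGrad)
    {
        double[] output = op.Forward(input);                    // y = f(x)
        double[] inputGrad = op.Backward(input, upstreamGrad);  // dL/dx = dL/dy * f'(x)
        return (output, inputGrad);
    }
}
```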
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/JIT/ActivationOps.cs (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build All Frameworks
Pull request overview
This PR introduces IR (Intermediate Representation) operations for 10 Sigmoid-family activation functions to enable JIT compilation of neural network layers. The operations implement the IROp interface with Forward and Backward methods, leveraging GPU acceleration where available.
Key Changes:
- Defines IROp interface for JIT compilation operations
- Implements 10 activation function operation classes (Swish/SiLU, Mish, HardSigmoid, HardTanh, ScaledTanh, Softplus, SoftSign, BentIdentity, Identity)
- Adds NumOps helper class for generic numeric operations
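The review excerpts below show NumOps<T>.FromDouble being used for constants such as 3.0 and -3.0. A plausible shape for such a helper is sketched here; the implementation is an assumption (named NumOpsSketch<T> to avoid implying it matches the PR's NumOps<T>):

```csharp
using System;

// Sketch of a generic numeric helper in the spirit of the PR's NumOps<T>;
// this implementation is an assumption, not the PR's code. Only FromDouble
// is shown because it is the one member visible in the review excerpts.
public static class NumOpsSketch<T>
{
    // Convert a double literal (e.g. 3.0, -3.0) to the generic numeric type T.
    public static T FromDouble(double value)
    {
        if (typeof(T) == typeof(float))  return (T)(object)(float)value;
        if (typeof(T) == typeof(double)) return (T)(object)value;
        // Fall back to runtime conversion for other IConvertible types.
        return (T)Convert.ChangeType(value, typeof(T));
    }
}

// Usage as seen in the review, e.g. inside a HardSigmoid-style op:
//   var three    = NumOpsSketch<float>.FromDouble(3.0);   // 3.0f
//   var negThree = NumOpsSketch<float>.FromDouble(-3.0);  // -3.0f
```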
Comments suppressed due to low confidence (3)
src/JIT/ActivationOps.cs:1
- Using numerical gradient approximation (finite differences) instead of analytical gradient is significantly slower and less accurate. The analytical formula is documented in the comments. Numerical differentiation requires an additional forward pass through Mish and introduces numerical errors.
using System;
src/JIT/ActivationOps.cs:338
- This assignment to three is useless, since its value is never read.
var three = NumOps<T>.FromDouble(3.0);
src/JIT/ActivationOps.cs:339
- This assignment to negThree is useless, since its value is never read.
var negThree = NumOps<T>.FromDouble(-3.0);
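For the first suppressed comment, the analytical Mish gradient the reviewer asks for is standard: with sp(x) = softplus(x), mish(x) = x·tanh(sp(x)) and mish′(x) = tanh(sp(x)) + x·(1 − tanh²(sp(x)))·σ(x). A scalar sketch (not the PR's tensor code):

```csharp
using System;

// Scalar sketch of the analytical Mish gradient (not the PR's tensor code).
static class MishMath
{
    static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

    // Numerically stable softplus: ln(1 + e^x) = max(0, x) + ln(1 + e^(-|x|))
    static double Softplus(double x) =>
        Math.Max(0.0, x) + Math.Log(1.0 + Math.Exp(-Math.Abs(x)));

    // Forward: mish(x) = x * tanh(softplus(x))
    public static double Forward(double x) => x * Math.Tanh(Softplus(x));

    // Backward: mish'(x) = tanh(sp) + x * (1 - tanh^2(sp)) * sigmoid(x),
    // where sp = softplus(x); no extra forward pass or finite differences needed.
    public static double Backward(double x, double gradOutput)
    {
        double t = Math.Tanh(Softplus(x));
        return gradOutput * (t + x * (1.0 - t * t) * Sigmoid(x));
    }
}
```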
- Add GradHardSigmoid with proper masking for -3 < x < 3
- Add GradHardTanh with proper masking for minVal < x < maxVal
- Add GradSoftPlus with numerically stable implementation
- Fix Softplus forward pass: use max(0,x) + log(1+exp(-|x|)) formula
- Add comprehensive TensorMatMul/TensorTranspose tests (20 tests)

Addresses PR review comments for #499, #500, #503, #504, #508, #509

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
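The formulas named in this commit, sketched on scalars. The hard-sigmoid slope of 1/6 is an assumption inferred from the -3 < x < 3 mask (the clamp((x+3)/6, 0, 1) variant); the PR's tensor implementations via IEngine may differ:

```csharp
using System;

// Scalar sketches of the gradients named in this commit; the PR operates on
// Tensor<T> via IEngine, so these only illustrate the formulas.
static class GradSketch
{
    // Stable softplus forward, as in the commit: max(0, x) + log(1 + exp(-|x|)).
    public static double Softplus(double x) =>
        Math.Max(0.0, x) + Math.Log(1.0 + Math.Exp(-Math.Abs(x)));

    // GradSoftPlus: d/dx softplus(x) = sigmoid(x).
    public static double GradSoftplus(double x, double gradOutput) =>
        gradOutput / (1.0 + Math.Exp(-x));

    // GradHardSigmoid, assuming the clamp((x + 3) / 6, 0, 1) variant (inferred
    // from the -3 < x < 3 mask): slope 1/6 inside the window, 0 outside.
    public static double GradHardSigmoid(double x, double gradOutput) =>
        (x > -3.0 && x < 3.0) ? gradOutput / 6.0 : 0.0;

    // GradHardTanh: derivative 1 inside (minVal, maxVal), 0 outside.
    public static double GradHardTanh(double x, double gradOutput,
                                      double minVal = -1.0, double maxVal = 1.0) =>
        (x > minVal && x < maxVal) ? gradOutput : 0.0;
}
```

The max(0, x) + log(1 + exp(-|x|)) form keeps the argument of exp non-positive, so the exponential never overflows for large positive x while remaining exact for negative x.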
Story 3: IR Operations - Sigmoid Family
Added 10 IR operation classes for Sigmoid-family activations.
Activations Added:
- Swish / SiLU (uses IEngine.Swish)
- Mish (uses IEngine.Mish)
- HardSigmoid, HardTanh, ScaledTanh
- Softplus, SoftSign, BentIdentity, Identity

Implementation Details:
- Proper null checks for parameters
- IEngine integration where available
- Forward and backward pass implementations
- Comprehensive XML documentation
- Numerical stability considerations (softplus)
Pattern Followed:
Build Status: ✅ Passed on all target frameworks (net462, net471, netstandard2.0)
Testing:
🤖 Generated with Claude Code