
Conversation

@ooples ooples (Owner) commented Nov 23, 2025

Story 2: IR Operations - ReLU Family

Added 11 IR operation classes: 9 ReLU-family activations plus 2 additional common activations (Sigmoid and Tanh).

Activations Added:

  • ReLU - Rectified Linear Unit (max(0, x))
  • GELU - Gaussian Error Linear Unit (used in transformers)
  • ELU - Exponential Linear Unit (parameterized with alpha)
  • SELU - Scaled ELU with self-normalizing constants
  • CELU - Continuously Differentiable ELU
  • LeakyReLU - ReLU with small negative slope (default 0.01)
  • PReLU - Parametric ReLU with learnable parameters
  • RReLU - Randomized ReLU for regularization
  • ThresholdedReLU - Sparse activation above threshold
  • Sigmoid - Binary classification activation
  • Tanh - Hyperbolic tangent activation
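For reference, the standard textbook definitions of these activations are (the exact constants and approximations used in the code may differ):

    ReLU(x)            = max(0, x)
    LeakyReLU(x)       = max(0, x) + α·min(0, x)              (default α = 0.01)
    PReLU(x)           = max(0, x) + a·min(0, x)              (a is learnable)
    RReLU(x)           = max(0, x) + a·min(0, x),  a ~ U(lower, upper) during training
    ThresholdedReLU(x) = x if x > θ, else 0
    ELU(x)             = x for x > 0,  α·(e^x − 1) for x ≤ 0
    SELU(x)            = λ·ELU_α(x),  with λ ≈ 1.0507 and α ≈ 1.6733
    CELU(x)            = max(0, x) + min(0, α·(e^(x/α) − 1))
    GELU(x)            = x·Φ(x),  where Φ is the standard normal CDF
    Sigmoid(x)         = 1 / (1 + e^(−x))
    Tanh(x)            = (e^x − e^(−x)) / (e^x + e^(−x))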

Pattern:

  • Each class implements the IROp interface
  • Forward() uses IEngine methods where available (ReLU, GELU, ELU, Sigmoid, Tanh); SELU reuses the ELU call and scales the result
  • The remaining variants (CELU, LeakyReLU, PReLU, RReLU, ThresholdedReLU) are marked for future implementation
  • Backward() placeholder for gradient support
  • Proper null checks and XML documentation
  • Comprehensive parameter validation
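
A minimal sketch of this shared pattern, using ReLUOp as the example (method signatures and member names beyond those mentioned above are assumptions, not the actual source):

    // Illustrative sketch only; the real classes carry full XML documentation.
    public class ReLUOp<T> : IROp
    {
        private readonly IEngine _engine;

        public ReLUOp(IEngine engine)
        {
            if (engine == null)
                throw new ArgumentNullException(nameof(engine));
            _engine = engine;
        }

        public Tensor<T> Forward(Tensor<T> input)
        {
            if (input == null)
                throw new ArgumentNullException(nameof(input));
            return _engine.ReLU(input);   // delegate the element-wise max(0, x) to the engine
        }

        public Tensor<T> Backward(Tensor<T> outputGradient)
        {
            // Placeholder until gradient support lands.
            throw new NotImplementedException("Backward is not yet implemented for ReLUOp.");
        }
    }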

Files Modified:

  • src/Interfaces/IJitCompilable.cs - New IROp marker interface for JIT-compilable operations
  • src/JIT/ActivationOps.cs - New file with all activation IR operations

Build Status: ✅ Build succeeded with 0 warnings, 0 errors

🤖 Generated with Claude Code

Added IROp interface and IR operation classes for 9 ReLU-family activations:
- ReLUOp (uses IEngine.ReLU)
- GeluOp (uses IEngine.GELU)
- EluOp (uses IEngine.ELU with alpha parameter)
- SeluOp (with α and λ constants for self-normalizing networks)
- CeluOp (continuously differentiable ELU with alpha parameter)
- LeakyReLUOp (parameterized negative slope, default 0.01)
- PReLUOp (learnable per-channel parameters)
- RReLUOp (randomized slope for regularization)
- ThresholdedReLUOp (sparse activations above threshold)

Also included SigmoidOp and TanhOp for completeness.

These enable JIT compilation for layers using these activation functions.
Forward() methods use IEngine where available; the others are marked for future implementation.
Backward() methods are placeholders for gradient support.
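
For reference, the gradients those Backward() placeholders would eventually compute are standard results:

    dReLU/dx    = 1 for x > 0, else 0          (subgradient 0 at x = 0 by convention)
    dSigmoid/dx = σ(x)·(1 − σ(x))
    dTanh/dx    = 1 − tanh(x)²
    dELU/dx     = 1 for x > 0, else α·e^x      (equivalently ELU(x) + α for x ≤ 0)
    dGELU/dx    = Φ(x) + x·φ(x)                (φ is the standard normal PDF)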

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@coderabbitai coderabbitai bot (Contributor) commented Nov 23, 2025

Summary by CodeRabbit

  • New Features
    • Added 11 activation functions (ReLU, GELU, ELU, SELU, CELU, Leaky ReLU, PReLU, RReLU, Thresholded ReLU, Sigmoid, Tanh) with JIT compilation support.
    • Forward pass computation enabled with configurable hyperparameters and input validation.
    • Backward pass (gradient computation) implementation pending.


Walkthrough

Adds a public marker interface IROp and implements eleven generic JIT-able activation operator classes (ReLU, GELU, ELU, SELU, CELU, LeakyReLU, PReLU, RReLU, ThresholdedReLU, Sigmoid, Tanh) whose forward methods delegate to an IEngine; backward methods are currently unimplemented.

Changes

  • Marker Interface (src/Interfaces/IJitCompilable.cs): Added public marker interface IROp in AiDotNet.Interfaces, with XML docs and a using AiDotNet.LinearAlgebra; directive, to mark IR operations eligible for JIT compilation.
  • Activation Operations (src/JIT/ActivationOps.cs): Added eleven generic activation operator classes (ReLUOp<T>, GeluOp<T>, EluOp<T>, SeluOp<T>, CeluOp<T>, LeakyReLUOp<T>, PReLUOp<T>, RReLUOp<T>, ThresholdedReLUOp<T>, SigmoidOp<T>, TanhOp<T>) implementing IROp. Each accepts an IEngine and optional hyperparameters; forward methods call engine routines and validate inputs; backward methods throw NotImplementedException.
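
For orientation, the marker interface summarized above plausibly looks like the following sketch (XML wording is assumed; the unused using directive flagged later in the review is omitted here):

    namespace AiDotNet.Interfaces;

    /// <summary>
    /// Marker interface for intermediate representation (IR) operations that are
    /// eligible for JIT compilation. It declares no members; implementing it simply
    /// tags an operation class as JIT-compilable.
    /// </summary>
    public interface IROp
    {
    }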

Sequence Diagram

sequenceDiagram
    participant Client
    participant ActivationOp as Activation Operator
    participant Engine as IEngine
    participant Tensor

    Client->>ActivationOp: Forward(input Tensor)
    ActivationOp->>ActivationOp: validate input
    ActivationOp->>Engine: call activation (e.g., ReLU/GELU/ELU...)
    Engine->>Tensor: compute output Tensor
    Tensor-->>ActivationOp: return output
    ActivationOp-->>Client: return output Tensor

    Note right of ActivationOp: Backward currently throws NotImplementedException
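In code, the flow in the diagram would look roughly like this; CpuEngine is a hypothetical IEngine implementation and input an existing Tensor<float>, both used only for illustration:

    // Hypothetical usage matching the diagram; CpuEngine and input are assumed.
    IEngine engine = new CpuEngine();
    var relu = new ReLUOp<float>(engine);

    Tensor<float> output = relu.Forward(input);   // validates input, then calls engine.ReLU
    // relu.Backward(gradient);                   // would currently throw NotImplementedException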

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Pay attention to input validation consistency across ops.
  • Verify correct engine method names and expected tensor semantics.
  • Review hyperparameter handling (defaults, ranges) and any scalar post-processing (e.g., SELU scaling).
  • Confirm visibility (public) and marker interface placement/namespace.

"I hopped through code with eager paws,
New ops and marks without a pause,
Engines hum, forwards run,
Backwards wait for future fun 🐇"

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 0.00%, which is below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (2 passed)
  • Title check ✅ Passed: The PR title clearly and specifically describes the main change: adding IR operations for ReLU-family activations as part of Story 2.
  • Description check ✅ Passed: The PR description is directly related to the changeset, providing detailed information about the 11 activation functions added, the IROp interface pattern, implementation details, and the files modified.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (4)
src/Interfaces/IJitCompilable.cs (1)

1-1: Remove unused using directive.

The using AiDotNet.LinearAlgebra; directive is not used in this file since the interface is a marker with no members.

-using AiDotNet.LinearAlgebra;
-
 namespace AiDotNet.Interfaces;
src/JIT/ActivationOps.cs (3)

103-109: Consider validating the alpha parameter.

The alpha parameter accepts any double value without validation. While negative or zero values won't cause runtime errors, they may produce unexpected behavior:

  • Negative alpha inverts the ELU curve for negative inputs
  • Zero alpha makes ELU equivalent to ReLU

If these edge cases are intentional, consider documenting them. Otherwise, add validation.

 public EluOp(IEngine engine, double alpha = 1.0)
 {
     if (engine == null)
         throw new ArgumentNullException(nameof(engine));
+    if (alpha <= 0)
+        throw new ArgumentException("Alpha must be positive", nameof(alpha));
     _engine = engine;
     _alpha = alpha;
 }

277-277: Make the field readonly.

The _alpha field is only assigned in the constructor and could be marked readonly to indicate immutability.

-    private Tensor<T> _alpha;
+    private readonly Tensor<T> _alpha;

372-378: Consider documenting or validating the threshold parameter.

The threshold parameter accepts any double value without validation. While negative thresholds won't cause runtime errors, they may produce unexpected behavior (e.g., negative threshold would allow activation for almost all values). If this is intentional, consider documenting the expected range.
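
A possible shape for that documentation and validation, approximating the constructor style of the other ops (not the actual source):

    /// <param name="threshold">
    /// Activation threshold: values at or below it map to 0, values above it pass through.
    /// Non-negative thresholds are the typical use case; a negative threshold effectively
    /// disables the sparsifying behavior.
    /// </param>
    public ThresholdedReLUOp(IEngine engine, double threshold)
    {
        if (engine == null)
            throw new ArgumentNullException(nameof(engine));
        if (threshold < 0)
            throw new ArgumentException("Threshold must be non-negative", nameof(threshold));
        _engine = engine;
        _threshold = threshold;
    }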

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ef6df16 and b694abd.

📒 Files selected for processing (2)
  • src/Interfaces/IJitCompilable.cs (1 hunks)
  • src/JIT/ActivationOps.cs (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build All Frameworks
🔇 Additional comments (1)
src/JIT/ActivationOps.cs (1)

200-207: Forward method implementations deferred as expected.

Several activation operations (CELU, LeakyReLU, PReLU, RReLU, ThresholdedReLU) have their Forward methods throwing NotImplementedException with clear explanations about missing IEngine capabilities. This is consistent with the PR description indicating these are marked for future implementation.

The error messages clearly communicate what's needed:

  • Element-wise conditional operations
  • Broadcasting operations
  • Random number generation

This provides a clear roadmap for future work.

Also applies to: 245-251, 289-295, 338-344, 380-386

Replace unsafe (T)(object)Lambda cast with proper _numOps.FromDouble(Lambda)
pattern using INumericOperations<T> to safely convert double constants to
generic type T.
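
In other words, the change is roughly the following; only the conversion pattern is the point, and the surrounding SeluOp code is approximated:

    // Before: boxed cast, which only succeeds when T is double and throws for float, etc.
    //     T lambda = (T)(object)Lambda;         // Lambda is the SELU scale constant (≈ 1.0507)
    //
    // After: convert through the numeric abstraction
    private static readonly INumericOperations<T> _numOps = MathHelper.GetNumericOperations<T>();

    T lambda = _numOps.FromDouble(Lambda);       // safe for any supported numeric T
    // lambda then scales the ELU output (exact scaling call omitted here)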

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings December 1, 2025 02:09

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
src/JIT/ActivationOps.cs (2)

232-399: Advanced ReLU variants: parameter validation is good; stubs are acceptable but consider usage guards

LeakyReLU, PReLU, RReLU, and ThresholdedReLU all validate constructor inputs (e.g., non-negative slopes, valid bounds, non-null alpha) and then throw NotImplementedException in Forward/Backward. That’s fine as long as no current JIT path emits these ops yet, but it will surface as runtime failures if they accidentally enter a graph.

Two small improvements you might consider (low priority):

  • Make it explicit in higher-level IR/JIT builders that these ops are not yet supported, so they’re never constructed at runtime.
  • Optionally add a brief XML <remarks> note per class stating that only the IR “shape” is defined and execution will throw until engine support is added.

No code changes are strictly required for this PR, but clarifying their “stub” status would reduce surprises later.
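
For example, a remarks note along these lines would make the stub status explicit (wording is only a suggestion):

    /// <remarks>
    /// This operation currently defines only the IR "shape". Forward() and Backward()
    /// throw NotImplementedException until the required IEngine support (element-wise
    /// conditionals, broadcasting, or random number generation) is available.
    /// </remarks>
    public class RReLUOp<T> : IROp
    {
        // ...
    }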


411-479: Sigmoid/Tanh ops correctly delegate to engine and follow the shared pattern

SigmoidOp and TanhOp mirror the pattern of the other simple activations: ctor null-checks, Forward delegating to _engine.Sigmoid/_engine.Tanh, and explicit NotImplementedException for Backward. This is clean and consistent.

If the number of such ops grows further, you might later extract a small base class to share the IEngine field and argument null-checking, but that’s purely a convenience refactor.
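
If that refactor were done, the base class could be as small as the following sketch (name and shape are hypothetical):

    // Hypothetical shared base for the simple engine-delegating ops; not part of this PR.
    public abstract class ActivationOpBase<T> : IROp
    {
        protected readonly IEngine Engine;

        protected ActivationOpBase(IEngine engine)
        {
            Engine = engine ?? throw new ArgumentNullException(nameof(engine));
        }

        public abstract Tensor<T> Forward(Tensor<T> input);
    }

    // SigmoidOp<T> would then reduce to:
    //     public class SigmoidOp<T> : ActivationOpBase<T>
    //     {
    //         public SigmoidOp(IEngine engine) : base(engine) { }
    //         public override Tensor<T> Forward(Tensor<T> input) => Engine.Sigmoid(input);
    //     }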

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b694abd and 47ce0c5.

📒 Files selected for processing (1)
  • src/JIT/ActivationOps.cs (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: CodeQL analysis (csharp)
  • GitHub Check: Agent
  • GitHub Check: Build All Frameworks
🔇 Additional comments (2)
src/JIT/ActivationOps.cs (2)

18-87: Simple ReLU/GELU IR ops look correct and consistent

Constructor null-checks, generic constraints, and Forward delegations to _engine.ReLU/_engine.GELU are straightforward and consistent with the intended IROp pattern; Backward stubs clearly signal incomplete gradient support without hiding failures.


99-175: SELU implementation and numeric conversion now look sound

Using MathHelper.GetNumericOperations<T>() plus _numOps.FromDouble(Lambda) to scale the ELU result avoids the earlier boxed cast issue and should be safe for supported numeric T types, assuming MathHelper only returns non-null operations for valid tensor element types.


Copilot AI left a comment


Pull request overview

This pull request introduces an Intermediate Representation (IR) layer for activation functions, adding 11 new operation classes for ReLU-family and common activations. The IR operations serve as a bridge between high-level neural network layers and low-level execution engines, with support for JIT compilation.

Key Changes:

  • Added IROp marker interface for JIT-compilable operations
  • Implemented 11 activation operation classes (ReLU, GELU, ELU, SELU, CELU, LeakyReLU, PReLU, RReLU, ThresholdedReLU, Sigmoid, Tanh)
  • Six operations (ReLU, GELU, ELU, SELU, Sigmoid, Tanh) have fully implemented forward passes using existing IEngine methods; five (CELU, LeakyReLU, PReLU, RReLU, ThresholdedReLU) are marked for future implementation

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 10 comments.

  • src/Interfaces/IJitCompilable.cs: Introduces the IROp marker interface for operations that can be JIT-compiled
  • src/JIT/ActivationOps.cs: Implements 11 activation operation classes with forward/backward method stubs and comprehensive XML documentation
Comments suppressed due to low confidence (1)

src/JIT/ActivationOps.cs:279

  • Field '_alpha' can be 'readonly'.
    private Tensor<T> _alpha;


@ooples ooples closed this Dec 1, 2025
