refactor: improve Nx.Defn.Evaluator debugging usability #1644
Merged (+421 −116)
10 commits (this view shows changes from 4 commits):
0005bc8
changed evaluator to output exs files
Chapaman a6ebf4a
made args be pure documentation
Chapaman ece5b6c
changed evaluator.ex to produce .exs files that can be executed for c…
Chapaman de648bc
Now the 'Verifying Executability' section is within 'Examining the Ou…
Chapaman f5ecdc3
Update nx/guides/advanced/backend_comparison.livemd
polvalente 9898691
Update nx/guides/advanced/backend_comparison.livemd
polvalente d4215ee
Update nx/guides/advanced/backend_comparison.livemd
polvalente 8c7f119
Update nx/lib/nx/defn/evaluator.ex
polvalente ef98d76
fix: tests and ref serialization
polvalente 98381a8
fix: inspect argument ids
polvalente
@@ -0,0 +1,255 @@
# Backend Comparison with Evaluator

```elixir
Mix.install([
  # {:nx, "~> 0.7"}
  {:nx, path: Path.join(__DIR__, "../..")},
  {:mimic, "~> 1.7"}
])
```

## Introduction

This guide demonstrates how to use `Nx.Defn.Evaluator` to compare the outputs of different backends. This is particularly useful for:

* **Testing backend implementations** - Ensure different backends produce consistent results
* **Debugging numerical differences** - Identify where backends diverge
* **Validating optimizations** - Confirm that optimized backends match reference implementations

The evaluator's `debug_options` feature saves each node's computation as an executable `.exs` file, making it easy to reconstruct and compare tensors across backends.

## How It Works

When you enable `debug_options` with a `save_path`, the evaluator:

1. Saves each computation node as a separate `.exs` file
2. Serializes tensors as executable `Nx.from_binary()` calls
3. Preserves backend information, shape, type, and names
4. Creates files that can be directly executed to reconstruct tensors
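
As a rough illustration of step 2, the round trip below turns a tensor into a binary and rebuilds it with `Nx.from_binary/2`. It is a minimal sketch that assumes the setup cell above has already installed Nx. The variable names are hypothetical, and the exact layout of the generated `.exs` files may differ.

```elixir
# Assumes Nx is already installed by the notebook's setup cell.
original = Nx.tensor([[1.0, 2.0], [3.0, 4.0]])

# Flatten the tensor into raw bytes, remembering its type and shape.
binary = Nx.to_binary(original)
type = Nx.type(original)
shape = Nx.shape(original)

# Nx.from_binary/2 returns a one-dimensional tensor, so the shape is restored separately.
restored =
  binary
  |> Nx.from_binary(type)
  |> Nx.reshape(shape)

# Evaluates to 1 when every element survived the round trip unchanged.
round_trip_ok = Nx.equal(original, restored) |> Nx.all() |> Nx.to_number()
```

Because the binary alone carries no type or shape, both have to be stored next to it, which is why the list above mentions them explicitly.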

This allows you to:

* Run the same computation with different backends
* Compare corresponding node outputs (this guide uses `Nx.all_close/3`)
* Identify exactly where backends differ
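
The comparison step can be sketched with `Nx.all_close/3`, which returns a scalar `u8` tensor holding 1 when all elements agree within the given tolerance and 0 otherwise. The tensors below are made-up values, and the sketch assumes Nx is already installed by the setup cell.

```elixir
# Hypothetical outputs of the same node on different backends.
reference = Nx.tensor([1.0, 2.0, 3.0])
close = Nx.tensor([1.0, 2.0, 3.0000005])
different = Nx.tensor([1.0, 2.0, 3.5])

# Convert the scalar tensor to a plain number before comparing with 1 or 0.
within_tolerance? = Nx.all_close(reference, close, atol: 1.0e-6) |> Nx.to_number() == 1
diverged? = Nx.all_close(reference, different, atol: 1.0e-6) |> Nx.to_number() == 0
```

A tolerance-based check is used instead of strict equality because floating-point results from different backends can differ in the last bits without indicating a bug.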

## Simulating Backend Differences with Mimic

Instead of shipping a dedicated mock backend, we can use `Mimic.stub/3` to override individual callbacks on `Nx.BinaryBackend`. First we initialize `Mimic` and copy the binary backend so it can be stubbed safely. Then `add`, `multiply`, and `divide` are stubbed with different operations (`add` becomes `subtract`, while `multiply` and `divide` become `add`) so the divergence is easy to spot.

```elixir
Mimic.copy(Nx.BinaryBackend)

defmodule BackendSwaps do
  def enable! do
    Mimic.stub(Nx.BinaryBackend, :add, fn out, left, right ->
      Nx.BinaryBackend.subtract(out, left, right)
    end)

    Mimic.stub(Nx.BinaryBackend, :multiply, fn out, left, right ->
      Nx.BinaryBackend.add(out, left, right)
    end)

    Mimic.stub(Nx.BinaryBackend, :divide, fn out, left, right ->
      Nx.BinaryBackend.add(out, left, right)
    end)
  end

  def restore! do
    for fun <- [:add, :multiply, :divide] do
      Mimic.stub(Nx.BinaryBackend, fun, fn out, left, right ->
        Mimic.call_original(Nx.BinaryBackend, fun, [out, left, right])
      end)
    end
  end
end
```

## Example: Simple Computation

Let's define a simple computation to compare across backends:

```elixir
defmodule SimpleComputation do
  import Nx.Defn

  defn compute(x, y) do
    a = Nx.add(x, y)
    b = Nx.multiply(a, 2)
    Nx.divide(b, 3)
  end
end
```

### Prepare Test Data

```elixir
# Create some test input
x = Nx.tensor([1.0, 2.0, 3.0, 4.0])
y = Nx.tensor([0.5, 1.5, 2.5, 3.5])

IO.puts("Input tensors:")
IO.inspect(x, label: "x")
IO.inspect(y, label: "y")
```

## Preparing the Function for Comparison

To ensure that each node in the graph has the same `id` while our function traverses it on both backends, we use `Nx.Defn.debug_expr/1` to pre-compile `SimpleComputation.compute/2`.

This is a trick to make sure the same expression is passed to both `Nx.Defn.jit/2` calls; it should not be used liberally.

```elixir
expr = Nx.Defn.debug_expr(&SimpleComputation.compute/2).(x, y)

precompiled = fn _x, _y -> expr end
```

## Running with Backend A

Let's run our computation with the first backend (`Nx.BinaryBackend` in this example, but it could be any backend):

```elixir
# Clean up and create output directory
File.rm_rf!("/tmp/backend_a")
File.mkdir_p!("/tmp/backend_a")

# Run computation with debug output enabled
result_a =
  Nx.Defn.jit(
    precompiled,
    compiler: Nx.Defn.Evaluator,
    debug_options: [save_path: "/tmp/backend_a"]
  ).(x, y)

IO.puts("\n✅ Backend A completed")
IO.inspect(result_a, label: "Result A")
IO.puts("Backend: #{inspect(result_a.data.__struct__)}")

# Show what files were generated
files_a = File.ls!("/tmp/backend_a")
IO.puts("\nGenerated #{length(files_a)} node files:")
Enum.each(files_a, &IO.puts(" - #{&1}"))
```

## Examining the Output Files

Let's look at what the `.exs` files contain:

```elixir
# Read and display one of the generated files
example_file = File.ls!("/tmp/backend_a") |> List.last()
content = File.read!(Path.join("/tmp/backend_a", example_file))

IO.puts("=== Content of #{example_file} ===\n")
IO.puts(content)
```

Notice the format:

* **Node ID** - Unique identifier for this computation node
* **Operation** - The operation being performed (e.g., `:add`, `:multiply`, `:parameter`)
* **Arguments** - List containing parameters and tensors as strings
* **Result** - Executable code that reconstructs the output tensor from binary

### Verifying Executability

Each `.exs` file is a self-contained Elixir script, so you can execute it directly:

```elixir
example_path = Path.join("/tmp/backend_a", example_file)
Code.eval_file(example_path)
```
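
`Code.eval_file/1` returns a `{result, bindings}` tuple: the value of the last expression in the file and a keyword list of the variables it defined. The sketch below shows this with a small hand-written script. Its contents and the file name `example_node.exs` are hypothetical stand-ins, not the evaluator's actual output.

```elixir
# A hypothetical stand-in for a generated node file.
script = """
operation = :add
values = [1, 2, 3]
Enum.sum(values)
"""

path = Path.join(System.tmp_dir!(), "example_node.exs")
File.write!(path, script)

# result is the last expression's value; bindings holds the script's variables.
{result, bindings} = Code.eval_file(path)
operation = Keyword.get(bindings, :operation)

# Remove the temporary file again.
File.rm!(path)
```

The comparison code later in this guide relies on the same tuple, reading the tensor from the result and the operation name from the bindings.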

## Running with Backend B

Now let's run the same computation with the swapped operations. We leave `Nx` on its default backend, but temporarily enable the Mimic stubs so the evaluator will capture the modified behaviour.

```elixir
# Clean up and create output directory for backend B
File.rm_rf!("/tmp/backend_b")
File.mkdir_p!("/tmp/backend_b")

BackendSwaps.enable!()

result_b =
  Nx.Defn.jit(
    precompiled,
    compiler: Nx.Defn.Evaluator,
    debug_options: [save_path: "/tmp/backend_b"]
  ).(x, y)

IO.puts("✅ Backend B completed")
IO.inspect(result_b, label: "Result B")
IO.puts("Backend: #{inspect(result_b.data.__struct__)}")

files_b = File.ls!("/tmp/backend_b")
IO.puts("\nGenerated #{length(files_b)} node files")

BackendSwaps.restore!()
```

## Comparing the Outputs

Now we inspect the generated `.exs` files, compare every node, and then summarise matches and mismatches.

```elixir
IO.puts("Comparing outputs from .exs files")
IO.puts(String.duplicate("-", 60))

files_a = File.ls!("/tmp/backend_a") |> Enum.sort()
files_b = File.ls!("/tmp/backend_b") |> Enum.sort()

IO.puts("Backend A generated #{length(files_a)} files")
IO.puts("Backend B generated #{length(files_b)} files")

comparison =
  Enum.zip_with(files_a, files_b, fn file_a, file_b ->
    {tensor_a, bindings_a} = Code.eval_file(Path.join("/tmp/backend_a", file_a))
    {tensor_b, _bindings_b} = Code.eval_file(Path.join("/tmp/backend_b", file_b))

    op = Keyword.get(bindings_a, :operation)
    match? = Nx.all_close(tensor_a, tensor_b, atol: 1.0e-6) |> Nx.to_number() == 1

    %{
      operation: op,
      tensor_a: tensor_a,
      tensor_b: tensor_b,
      match?: match?,
      file_a: file_a,
      file_b: file_b
    }
  end)

{matches, mismatches} = Enum.split_with(comparison, & &1.match?)

IO.puts("\n Summary:")
IO.puts(String.duplicate("-", 60))

if Enum.any?(matches) do
  IO.puts("✅ Matched nodes (#{length(matches)}):")

  Enum.each(matches, fn match ->
    IO.puts(" - #{match.operation} (#{match.file_a})")
  end)
else
  IO.puts("\n❌ No nodes match!")
end

if Enum.any?(mismatches) do
  IO.puts("\n❌ Mismatched nodes (#{length(mismatches)}):")

  Enum.each(mismatches, fn mismatch ->
    IO.puts("- #{mismatch.operation} (#{mismatch.file_a})")
    IO.puts("Backend A")
    IO.inspect(mismatch.tensor_a)
    IO.puts("Backend B")
    IO.inspect(mismatch.tensor_b)
  end)
else
  IO.puts("\n✅ All nodes match!")
end
```

With the Mimic stubs in place, the evaluator's debug artifacts show where the divergence starts, which makes it straightforward to pinpoint inconsistent nodes between implementations. The summary lists both the matching and the mismatching nodes.