
Conversation

@mstange
Contributor

@mstange mstange commented Jun 4, 2025

Sharing the string table is happening in #5481.
This PR makes us share the rest, and completes issue #3918.

Only samples and markers (including allocation samples) are still per-thread. I don't expect this to change.

To do:

  • Fix flow and test failures
  • Upgrade the URL when upgrading the profile, to remap function indexes etc. in the URL
  • When compacting during upload, apply translation tables to redux state (e.g. update focus function indexes in URL)
  • Do something to address the perf impact of computing the derived thread: transforms and call node table computation happen per thread, and they take longer when the tables are bigger because the tables now contain information for all threads. We should be able to reuse more of the computed outputs in the thread derivation pipeline.
  • Add a paragraph to CHANGELOG-formats.md for the new version.
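To illustrate the translation-table idea from the todo list above, here is a hypothetical sketch (not the PR's actual code — the types and the `-1` sentinel are assumptions) of remapping a focus-function transform's func index after the funcTable has been compacted:

```typescript
// Hypothetical sketch: after compaction, a translation table maps old func
// indexes to new ones; -1 marks funcs that were removed entirely.
type IndexIntoFuncTable = number;
type TranslationTable = Int32Array; // oldFuncIndex -> newFuncIndex, or -1

// Remap a focus-function transform's func index. Returns null if the
// function was removed by compaction, i.e. the transform must be dropped.
function remapFocusFunction(
  funcIndex: IndexIntoFuncTable,
  translation: TranslationTable
): IndexIntoFuncTable | null {
  const newIndex = translation[funcIndex];
  return newIndex === -1 ? null : newIndex;
}

// Example: func 5 moved to slot 2 after compaction, func 7 was removed.
const table = new Int32Array([0, -1, 1, -1, -1, 2, -1, -1]);
remapFocusFunction(5, table); // → 2
remapFocusFunction(7, table); // → null
```

The same remapping would need to be applied to every `IndexIntoFuncTable` held in redux state, including indexes encoded in the URL.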

mstange added 5 commits June 4, 2025 19:13
…-func transforms working.

This catches a case which the first approach in firefox-devtools#5387
would have broken: it would have compacted the funcTable without adjusting any IndexIntoFuncTable
values in the redux state (specifically, it wouldn't have adjusted the func index in the focus-function
transform).
This is a trade-off.

There are some existing uploaded profiles that are very large.
Merging their frameTable and stackTable can take a long time.

Advantages of merging:
 - If reuploaded, the reuploaded profile would be smaller.
 - Creating the call node table would be faster.

Disadvantages of merging:
 - Long load time any time you open an existing large profile.
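The merge described above could look roughly like this — a hypothetical sketch, not the profiler's actual schema or code: per-thread tables are deduplicated into one shared table, and a translation map per thread records where each old index ended up.

```typescript
// Hypothetical sketch of merging per-thread func tables into one shared
// table. The column names are illustrative, not the profiler's real schema.
type FuncTable = { name: string[]; resource: number[] };

function mergeFuncTables(
  tables: FuncTable[]
): { merged: FuncTable; translations: number[][] } {
  const merged: FuncTable = { name: [], resource: [] };
  const keyToIndex = new Map<string, number>();
  const translations: number[][] = [];

  for (const table of tables) {
    const translation: number[] = [];
    for (let i = 0; i < table.name.length; i++) {
      // Dedup key: two entries with the same columns become one shared entry.
      const key = `${table.name[i]}#${table.resource[i]}`;
      let newIndex = keyToIndex.get(key);
      if (newIndex === undefined) {
        newIndex = merged.name.length;
        merged.name.push(table.name[i]);
        merged.resource.push(table.resource[i]);
        keyToIndex.set(key, newIndex);
      }
      translation.push(newIndex); // old index i in this thread -> newIndex
    }
    translations.push(translation);
  }
  return { merged, translations };
}
```

The trade-off in the commit message follows from this shape: the dedup pass touches every row of every thread's table once, which is where the long load time on large existing profiles comes from.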
@vlasakm

vlasakm commented Aug 10, 2025

This is really great! Sharing this data across threads works especially well for applications where there is a large number of threads, but just a few "thread pools" whose threads are all executing the same code.

Using this allowed me to employ better caching when creating a firefox profile, save on the profile size, etc.

It also made me greedy, and I attempted to visualize a rather large profile (1602 threads, 577108 stacks, 29760 frames). Previously my experience was that Firefox Profiler would choke on, say, 300+ threads, and I hoped the sharing in this change could improve the situation. Now it loads, and the UI is still quite responsive, but loading the profile takes a long 2 minutes 15 seconds (and Firefox prompts to kill the script, etc.).

90% of the load time is spent in computeCallNodeTable, which I believe may be called repeatedly for each thread (and for the merged thread which includes all of them), even though all these calls process the same stack table and frame table now that they are shared.

I know very little about React/Redux, but I see that there's some concept of memoization and thread selectors, which is still used for call nodes even after this change to share the stack and frame tables. Very naively, would it perhaps be possible to compute and memoize the call node information just once, by not using the thread selector mechanism here? But there's a great chance I am missing all the needed context.
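The naive idea above could be sketched like this — a toy illustration, not the profiler's actual selector code; the table shapes and the stand-in computation are assumptions. Since the stack table is now a single shared object, the derived call node table can be cached keyed on that object's identity, so every per-thread consumer hits the same cached result:

```typescript
// Toy sketch: cache the derived call node table keyed on the shared
// stackTable object, so it is computed once instead of once per thread.
type StackTable = { prefix: Array<number | null>; frame: number[] };
type CallNodeTable = { prefix: Array<number | null>; func: number[] };

const cache = new WeakMap<StackTable, CallNodeTable>();
let computeCount = 0; // for demonstration only

function getCallNodeTable(
  stackTable: StackTable,
  frameToFunc: number[]
): CallNodeTable {
  let result = cache.get(stackTable);
  if (result === undefined) {
    computeCount++;
    // Stand-in for the real computeCallNodeTable: map each stack's frame
    // to its func while keeping the prefix (caller) structure.
    result = {
      prefix: stackTable.prefix.slice(),
      func: stackTable.frame.map((f) => frameToFunc[f]),
    };
    cache.set(stackTable, result);
  }
  return result;
}
```

With this shape, calling `getCallNodeTable` for 1602 threads that share one stack table would do the expensive work once; the open question in the comment above is how per-thread transforms fit into such a shared cache.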

FWIW, I have been using this branch for quite some time now. I am not sure I am able to contribute much (I even failed to rebase it on top of more recent changes), but it would be great to see this land, perhaps after all the big work-in-progress changes (like the TypeScript migration).

@mstange
Contributor Author

mstange commented Aug 10, 2025

Yes indeed, sharing the computed call node table across threads while properly handling per-thread transforms is the next thing I want to work on when I get back to this task. Thanks for the encouragement, and it's great to hear that you're seeing real improvements from it!

