-
Notifications
You must be signed in to change notification settings - Fork 828
Update names field in source maps #8068
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This scans the code after optimizations and removes unused function names from 'names' field in the source map, reducing its size. Emscripten has not been generating 'names' field so far, but after #???, it will generate the field in case `llvm-dwarfdump` supports a new option `--filter-child-tag`.
This adds support for `names` field in source maps, which contains function names. Source map mappings are correspondingly updated and emsymbolizer now can provide function name information only with source maps. While source maps don't provide the full inlined hierarchies, this provides the name of the original (= pre-inlining) function, which may not exist in the final binary because they were inlined. This is because source maps are primarily intended for user debugging. This also demangles C++ function names using `llvm-cxxfilt`, so the printed names can be human-readable. I tested with `wasm-opt.wasm` from Binaryen by `if (EMSCRIPTEN)` setup here: https://github.com/WebAssembly/binaryen/blob/95b2cf0a4ab2386f099568c5c61a02163770af32/CMakeLists.txt#L311-L372 with `-g -gsource-map`. With this PR and WebAssembly/binaryen#8068, the source map file size increases by 3.5x (8632423 -> 30070042) primarily due to the function name strings. From `llvm-dwarfdump` output, this also requires additional parsing of `DW_TAG_subprogram` and `DW_TAG_inlined_subroutine` tags which can be at any depths (because functions can be within nested namespaces or classes), so we cannot use `--recurse-depth=0` (emscripten-core#9580) anymore. In case of `wasm-opt.wasm` built with DWARF info, without `--recurse-depth=0` in the command line, the size of its text output increased by 27.5x, but with the `--filter-child-tag` / `-t` option (llvm/llvm-project#165720), the text output increased only (?) by 3.2x, which I think is tolerable. This disables `names` field generation when `-t` option is not available in `llvm-dwarfdump` because it was added recently. To avoid this text size problem, we can consider using DWARF-parsing Python libraries like https://github.com/eliben/pyelftools, but this will make another third party dependency, so I'm not sure if it's worth it at this point. This also increased running time of `wasm-sourcemap.py`, in case of the `wasm-opt.wasm`, by 2.3x (6.6s -> 15.4s), but compared to the linking time this was not very noticeable. Fixes emscripten-core#20715 and closes emscripten-core#25116.
|
Ping 😄 |
kripken
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm % comments
src/wasm/wasm-binary.cpp
Outdated
|
|
||
| // Create the new list of names and the mapping from old to new indices. | ||
| uint32_t newIndex = 0; | ||
| for (auto& pair : oldToNewIndex) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is on an unordered map, so the indexing may end up nondeterministic, I worry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed oldToNewIndex to std::map.
| ;;@ src.cpp:2:1:used | ||
| (nop) | ||
| ) | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps add another used symbol, to get some coverage of the order of the symbols being deterministic? (not great coverage, but it might help)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done: a55a844
src/wasm/wasm-binary.cpp
Outdated
|
|
||
| // Collect all used symbol name indexes. | ||
| for (auto& func : wasm->functions) { | ||
| for (auto& pair : func->debugLocations) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you even do something like [_, location] here instead of having to use pair.second?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Co-authored-by: Alon Zakai <alonzakai@gmail.com>
Co-authored-by: Derek Schuff <dschuff@chromium.org>
|
CI error might be fixed eventually by #8094 |
This scans the code after optimizations and removes unused function names from 'names' field in the source map, reducing its size. Emscripten has not been generating 'names' field so far, but after emscripten-core/emscripten#25870, it will generate the field in case
llvm-dwarfdumpsupports a new option--filter-child-tag.