-
Notifications
You must be signed in to change notification settings - Fork 235
Fix progress bar stuck during archive import #7118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix progress bar stuck during archive import #7118
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #7118 +/- ##
==========================================
+ Coverage 79.58% 79.58% +0.01%
==========================================
Files 566 566
Lines 43520 43523 +3
==========================================
+ Hits 34631 34634 +3
Misses 8889 8889 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
a0a7665 to
583be24
Compare
| # collect the unique entities from the input backend to be added to the output backend | ||
| ufields = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered also adding a report statement here:
| # collect the unique entities from the input backend to be added to the output backend | |
| ufields = [] | |
| # Additional report statement, as this can take a while | |
| IMPORT_LOGGER.report(f'Finding unique new {etype.value}(s)') | |
As this can also take a while for a large number of nodes, however, it appears for all entities, so also Users and Computers, for which it completes quickly, so the report only adds noise. I decided to remove it for now.
edan-bainglass
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @GeigerJ2. LGTM! Have you tested locally?
The removal of the filter_size batching in #6998 caused the progress bar to only update once at the end instead of incrementally. Here, we re-introduce query batching (50k IDs per batch for UX) and added QueryBuilder streaming-based progress updates (every ~10k entities) by replacing the blocking list comprehension with an explicit for-loop. Progress now updates during both query batching (for UX) and result streaming. Cherry-pick: 4e54cd4
The removal of the filter_size batching in #6998 caused the progress bar to only update once at the end instead of incrementally. Here, we re-introduce query batching (50k IDs per batch for UX) and added QueryBuilder streaming-based progress updates (every ~10k entities) by replacing the blocking list comprehension with an explicit for-loop. Progress now updates during both query batching (for UX) and result streaming. Cherry-pick: 4e54cd4
The removal of the filter_size batching in #6998 caused the progress bar to only update once at the end instead of incrementally. Here, we re-introduce query batching (50k IDs per batch for UX) and added QueryBuilder streaming-based progress updates (every ~10k entities) by replacing the blocking list comprehension with an explicit for-loop. Progress now updates during both query batching (for UX) and result streaming. (cherry picked from commit 4e54cd4)
After removing filter_size batching in the QueryBuilder IN clause optimization in #6998, the
_add_new_entitiesfunction's progress bar was broken.Before PR #6998
Progress updates:
filter_sizebatching only (every ~1000 entities)After PR #6998 (Broken)
With this PR (restored AND improved)
Progress updates: