-
Notifications
You must be signed in to change notification settings - Fork 5.6k
[ACTION] Google Sheets - Provide polling options as an alternative to current webhook based ones #19302
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub. 2 Skipped Deployments
|
WalkthroughAdds polling-based Google Sheets sources ("New Comment" and "New Updates"), shared processing modules and test fixtures, introduces batched storage for large sheets, and bumps package version to 0.13.0. Changes
Sequence Diagram(s)sequenceDiagram
participant Timer as Polling Timer
participant Source as Polling Source
participant DB as Local DB
participant API as Google Sheets API
participant Emitter as Event Emitter
loop polling interval
Timer->>Source: run()
Source->>DB: read lastTs / read batched snapshot
DB-->>Source: lastTs / oldValues
Source->>API: listComments(...) or getSpreadsheet/getValues(...)
API-->>Source: comments or currentValues
alt New comments or diffs found
Source->>DB: write lastTs and/or write batched values
loop For each new item
Source->>Source: build meta / compute diff
Source->>Emitter: $emit(item, meta)
end
else No changes
Source->>Source: early exit
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches✅ Passed checks (5 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
lcaresia
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
For Integration QA: |
|
Hello everyone, I have tested this PR and there're some test cases failed or needed improvement. Please check test reports below for more information:
|
|
Hi @vunguyenhung can you please share the steps with Google Sheets so I can reproduce the OOM problem, thanks! |
|
Hi @jcortes, will draft the reproduce steps and share with you in a few hours |
d6b8d91 to
0923613
Compare
|
Hi @vunguyenhung I've made some changes to new-updates source, so maybe that can fix the OOM issue so test again with this change please, thanks! |
0923613 to
fd256c9
Compare
|
Hi everyone, all test cases are passed! Ready for release! Test reports
|
|
Hello everyone, I have tested this PR and there're some test cases failed or needed improvement. Please check test reports below for more information:
|
fd256c9 to
93e1cce
Compare
|
Hi @vunguyenhung I've made the changes you requested, please test again, thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
components/google_sheets/sources/common/new-updates.mjs (2)
188-193: Bug: Incorrect referenceoldValues.lengthshould beoldValues[i].length.On line 192,
oldValues.lengthreturns the total row count instead of the column count for rowi. This is a pre-existing bug but affects the accuracy of change detection when columns are added/removed.else colCount = newValues[i].length > oldValues[i].length ? newValues[i].length - : oldValues.length; + : oldValues[i].length; return colCount;
255-282: Consider emittingcurrentValuesin the event payload for consistency.The
processSpreadsheetmethod emitsworksheetandchangesbut notcurrentValues. Looking at the test fixture intest-event.mjs, the expected event structure includescurrentValues. This inconsistency could cause issues for downstream consumers.this.$emit( { worksheet, + currentValues, changes, }, this.getMeta(spreadsheet, worksheet), );
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (8)
components/google_sheets/package.json(1 hunks)components/google_sheets/sources/common/new-comment.mjs(1 hunks)components/google_sheets/sources/common/new-updates.mjs(7 hunks)components/google_sheets/sources/new-comment-polling/new-comment-polling.mjs(1 hunks)components/google_sheets/sources/new-comment-polling/test-event.mjs(1 hunks)components/google_sheets/sources/new-updates-polling/new-updates-polling.mjs(1 hunks)components/google_sheets/sources/new-updates-polling/test-event.mjs(1 hunks)components/google_sheets/sources/new-updates/new-updates.mjs(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-01-23T03:55:15.166Z
Learnt from: GTFalcao
Repo: PipedreamHQ/pipedream PR: 15376
File: components/monday/sources/name-updated/name-updated.mjs:6-6
Timestamp: 2025-01-23T03:55:15.166Z
Learning: Source names in Monday.com components don't need to start with "New" if they emit events for updated items (e.g., "Name Updated", "Column Value Updated") rather than new items. This follows the component guidelines exception where the "New" prefix is only required when emits are limited to new items.
Applied to files:
components/google_sheets/sources/new-updates/new-updates.mjs
🧬 Code graph analysis (2)
components/google_sheets/sources/new-comment-polling/new-comment-polling.mjs (1)
components/google_sheets/actions/create-spreadsheet/create-spreadsheet.mjs (1)
googleSheets(60-66)
components/google_sheets/sources/new-updates-polling/new-updates-polling.mjs (1)
components/google_sheets/sources/common/new-updates.mjs (8)
spreadsheet(248-254)sheetId(196-196)sheetId(210-210)baseId(197-197)baseId(234-234)baseId(281-281)oldValues(198-198)currentValues(199-202)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: pnpm publish
- GitHub Check: Lint Code Base
- GitHub Check: Publish TypeScript components
- GitHub Check: Verify TypeScript components
🔇 Additional comments (6)
components/google_sheets/package.json (1)
3-3: LGTM!The minor version bump from 0.12.0 to 0.13.0 is appropriate for introducing new polling source features.
components/google_sheets/sources/new-updates/new-updates.mjs (1)
12-12: LGTM!Patch version bump appropriately reflects the inherited batching improvements from the common module.
components/google_sheets/sources/new-comment-polling/test-event.mjs (1)
1-21: LGTM!The test fixture provides a realistic Google Drive comment structure with all necessary fields for testing the new polling source.
components/google_sheets/sources/new-updates-polling/test-event.mjs (1)
1-71: LGTM!The test fixture accurately represents the event structure emitted by
processSpreadsheet(), including worksheet metadata, current values, and change tracking.components/google_sheets/sources/common/new-comment.mjs (1)
19-34: The code is correct.listCommentsproperly handlesundefinedas the second parameter by omitting thestartModifiedTimefilter when it's not set, allowing the API to return recent comments on first run.components/google_sheets/sources/new-updates-polling/new-updates-polling.mjs (1)
14-62: Props, polling timer, and DB usage look consistent with existing patternsUsing
db: "$.service.db",$.interface.timerwithDEFAULT_POLLING_SOURCE_TIMER_INTERVAL, and an explicitmonitoringRangeprop plus the informationalalertaligns well with the new batched polling design and the performance concerns in large sheets. I don’t see issues here.
components/google_sheets/sources/new-comment-polling/new-comment-polling.mjs
Show resolved
Hide resolved
components/google_sheets/sources/new-comment-polling/new-comment-polling.mjs
Show resolved
Hide resolved
components/google_sheets/sources/new-updates-polling/new-updates-polling.mjs
Show resolved
Hide resolved
components/google_sheets/sources/new-updates-polling/new-updates-polling.mjs
Show resolved
Hide resolved
components/google_sheets/sources/new-updates-polling/new-updates-polling.mjs
Show resolved
Hide resolved
|
Hello everyone, I have tested this PR and there're some test cases failed or needed improvement. Please check test reports below for more information:
|
93e1cce to
251e796
Compare
|
Hi @vunguyenhung please test again with latest push, thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
components/google_sheets/sources/common/new-updates.mjs (2)
170-193: Bug ingetColCount: usesoldValues.lengthinstead ofoldValues[i].length.In the final
elsebranch (line 188), the function returnsoldValues.length(the row count) instead ofoldValues[i].length(the column count of row i). This causes theforloop ingetContentChangesto iterate an incorrect number of times, leading to missed or extra cells being processed.The fix is to change line 192:
- : oldValues.length; + : oldValues[i].length;
195-207: FixgetContentChangesto handlenulloldValues from missing snapshot.When
_getBatchedSheetValuesreturnsnull(no prior snapshot exists),getContentDiffpreserves that asnull. On first run or when a new worksheet is added after the last snapshot, downstream code crashes:
getContentChangesat line 289 directly accessesoldValues.lengthwithout checking fornull, causing a runtime error.- While
getRowCounthas a guard (if (oldValues && oldValues.length)), it only determines row count; the subsequent call togetContentChangesstill receivesnull.- The guard at line 273 (
if (rowCount === 0 && !oldValues)) only prevents the issue when both conditions hold; it does not protect the case wherecurrentValuesexists butoldValuesisnull.Two straightforward fixes:
- Simpler: treat missing data as an empty array (restores prior behavior):
- const oldValues = this._getBatchedSheetValues(baseId) || null; + const oldValues = this._getBatchedSheetValues(baseId) || [];
- Alternatively: guard before calling
getContentChangesinprocessSpreadsheet:const changes = await this.getContentChanges( currentValues, oldValues, ); + if (!oldValues) { + const baseId = `${spreadsheet.spreadsheetId}${worksheet.properties.sheetId}`; + const rawNewValues = currentValues.values || []; + const newValues = rawNewValues.slice(0, this.getMaxRows()); + this._setBatchedSheetValues(baseId, newValues); + continue; + }Option (1) is simpler and avoids the crash. Either way, this must be addressed.
♻️ Duplicate comments (6)
components/google_sheets/sources/common/new-updates.mjs (2)
91-112: Simplify_getBatchedSheetValuesloop.
hasMoreis redundant since you set it tofalseimmediately beforebreak. You can simplify the loop without changing behavior:- _getBatchedSheetValues(baseId) { - const allValues = []; - let batchIndex = 0; - let hasMore = true; - - while (hasMore) { + _getBatchedSheetValues(baseId) { + const allValues = []; + let batchIndex = 0; + + while (true) { const batchKey = this._getBatchKey(baseId, batchIndex); const batchValues = this._getSheetValues(batchKey); if (!batchValues) { - hasMore = false; break; } allValues.push(...batchValues); batchIndex++; }This improves readability without affecting functionality.
113-135: Avoiddb.set(key, undefined)for clearing batches.The current batch-clear loop relies on:
while (this.db.get(this._getBatchKey(baseId, batchIndex))) { this.db.set(this._getBatchKey(baseId, batchIndex), undefined); batchIndex++; }Using
undefinedas a sentinel is implementation‑dependent and makes it harder to reason about termination semantics. Prefer either a dedicated delete/clear API if available, or an explicit “empty” value (e.g. an empty string) that_getSheetValuescan interpret as “no data”.For example:
- while (this.db.get(this._getBatchKey(baseId, batchIndex))) { - this.db.set(this._getBatchKey(baseId, batchIndex), undefined); + while (this.db.get(this._getBatchKey(baseId, batchIndex))) { + this.db.set(this._getBatchKey(baseId, batchIndex), ""); batchIndex++; }and adjust
_getSheetValuesto treat""as empty if necessary. Please confirm the expected semantics of$.service.db.set(key, undefined)in this codebase.#!/bin/bash # Inspect how $.service.db is typically cleared in this repo rg -n '\.db\.set\([^,]+,\s*undefined\)' -S rg -n '\.db\.(delete|remove|clear)' -Scomponents/google_sheets/sources/new-updates-polling/new-updates-polling.mjs (3)
8-13: Clarify that this is a polling-based source in the metadata.Since this is an alternative to the existing “New Updates (Instant)” webhook trigger, naming it simply
"New Updates"and reusing the same description can be confusing in the UI.Consider making polling explicit, for example:
- name: "New Updates", - description: "Emit new event each time a row or cell is updated in a spreadsheet.", + name: "New Updates (Polling)", + description: "Emit new event each time a row or cell is updated in a spreadsheet using polling.",
115-138: AlignbaseIdconstruction and monitoringRange behavior across snapshot and diff paths.In
takeSheetSnapshotyou build:const sheetId = this.getSheetId(); const baseId = `${sheetId}${worksheetId}`;while in
getContentDiffyou use:const baseId = `${spreadsheet.spreadsheetId}${worksheet.properties.sheetId}`;These should be identical, but that currently relies on
this.getSheetId()andspreadsheet.spreadsheetIdalways matching. To avoid subtle desyncs, standardizebaseIdconstruction everywhere, for example:- const sheetId = this.getSheetId(); - const baseId = `${spreadsheet.spreadsheetId}${worksheet.properties.sheetId}`; + const sheetId = this.getSheetId(); + const baseId = `${sheetId}${worksheet.properties.sheetId}`;and mirror that in
takeSheetSnapshot.Also, when
monitoringRangeis set,getContentDiffiterates over every worksheet but always uses the samerange:const range = this.monitoringRange ? this.monitoringRange : worksheet.properties.title;and
takeSheetSnapshotpicksspreadsheet.sheets[0]as the worksheet for the baseline. With multiple worksheets, this can:
- Associate a baseline from the first sheet with diffs computed against another sheet, or
- Re-fetch and diff the same range for every worksheet, emitting duplicate “changes”.
For monitoringRange‑based usage, consider either:
- Restricting processing to the single worksheet that corresponds to the range (e.g., parse the sheet name from
SheetName!A1:Z1000), or- Documenting and enforcing that monitoringRange is only supported when a single worksheet is selected.
#!/bin/bash # Show all call sites of getContentDiff and takeSheetSnapshot for this source rg -n "google_sheets-new-updates-polling" -S rg -n "takeSheetSnapshot" components/google_sheets/sources -n -SAlso applies to: 140-153
165-167: UsegetSheetId()inrun()for consistency.Elsewhere you already rely on
getSheetId()(from the common/base mixins) to resolve the spreadsheet ID, butrun()calls:const spreadsheet = await this.googleSheets.getSpreadsheet(this.sheetID);For consistency and future‑proofing, consider:
async run() { - const spreadsheet = await this.googleSheets.getSpreadsheet(this.sheetID); + const spreadsheetId = this.getSheetId(); + const spreadsheet = await this.googleSheets.getSpreadsheet(spreadsheetId); return this.processSpreadsheet(spreadsheet); },Confirm that `getSheetId()` in the shared Google Sheets http-based base/common always reflects the same ID expected by `googleSheets.getSpreadsheet(...)`.components/google_sheets/sources/new-comment-polling/new-comment-polling.mjs (1)
1-40: Wire insampleEmitand drop unused...common.props.There’s already a
test-event.mjsalongside this file, but it isn’t used, andcommon/new-comment.mjsdoesn’t currently define anyprops. For better UI preview and cleaner metadata:
- Import and expose a sample event for the UI:
-import { DEFAULT_POLLING_SOURCE_TIMER_INTERVAL } from "@pipedream/platform"; +import { DEFAULT_POLLING_SOURCE_TIMER_INTERVAL } from "@pipedream/platform"; +import sampleEmit from "./test-event.mjs"; @@ export default { ...common, @@ props: { @@ - sheetID: { + sheetID: { propDefinition: [ googleSheets, "sheetID", (c) => ({ driveId: googleSheets.methods.getDriveId(c.watchedDrive), }), ], }, - ...common.props, }, methods: { ...base.methods, ...common.methods, }, + sampleEmit,
- Since
commononly exportsmethodstoday, spreading...common.propsis a no‑op and can be removed as above.#!/bin/bash # Verify that common/new-comment.mjs exports only methods (no props) and that other sources use sampleEmit similarly rg -n "new-comment-polling" components/google_sheets/sources -n -S rg -n "sampleEmit" components/google_sheets/sources -n -S
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (8)
components/google_sheets/package.json(1 hunks)components/google_sheets/sources/common/new-comment.mjs(1 hunks)components/google_sheets/sources/common/new-updates.mjs(7 hunks)components/google_sheets/sources/new-comment-polling/new-comment-polling.mjs(1 hunks)components/google_sheets/sources/new-comment-polling/test-event.mjs(1 hunks)components/google_sheets/sources/new-updates-polling/new-updates-polling.mjs(1 hunks)components/google_sheets/sources/new-updates-polling/test-event.mjs(1 hunks)components/google_sheets/sources/new-updates/new-updates.mjs(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2024-10-10T19:18:27.998Z
Learnt from: GTFalcao
Repo: PipedreamHQ/pipedream PR: 14265
File: components/the_magic_drip/sources/common.mjs:35-43
Timestamp: 2024-10-10T19:18:27.998Z
Learning: In `components/the_magic_drip/sources/common.mjs`, when processing items in `getAndProcessData`, `savedIds` is intentionally updated with IDs of both emitted and non-emitted items to avoid emitting retroactive events upon first deployment and ensure only new events are emitted as they occur.
Applied to files:
components/google_sheets/sources/new-comment-polling/new-comment-polling.mjs
📚 Learning: 2025-01-23T03:55:15.166Z
Learnt from: GTFalcao
Repo: PipedreamHQ/pipedream PR: 15376
File: components/monday/sources/name-updated/name-updated.mjs:6-6
Timestamp: 2025-01-23T03:55:15.166Z
Learning: Source names in Monday.com components don't need to start with "New" if they emit events for updated items (e.g., "Name Updated", "Column Value Updated") rather than new items. This follows the component guidelines exception where the "New" prefix is only required when emits are limited to new items.
Applied to files:
components/google_sheets/sources/new-updates/new-updates.mjscomponents/google_sheets/sources/new-updates-polling/new-updates-polling.mjs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: pnpm publish
- GitHub Check: Lint Code Base
- GitHub Check: Publish TypeScript components
- GitHub Check: Verify TypeScript components
🔇 Additional comments (4)
components/google_sheets/package.json (1)
3-3: Version bump looks appropriate.The 0.13.0 bump aligns with the addition of new polling sources and batching behavior; no issues here.
components/google_sheets/sources/new-updates/new-updates.mjs (1)
10-13: Source version bump is consistent with underlying common changes.Updating to
0.3.4for the instant source matches the new batching logic introduced in the sharedcommon/new-updates.mjswithout altering the public surface.components/google_sheets/sources/new-updates-polling/test-event.mjs (1)
1-70: Test fixture shape matches expected New Updates payload.The
worksheet,currentValues, andchangesfields align with the structure produced byprocessSpreadsheet/getContentChanges(e.g.,cellformatB:4), so this looks good for UI preview and tests.components/google_sheets/sources/new-comment-polling/test-event.mjs (1)
1-21: Comment test fixture is coherent and self-contained.The payload mirrors a realistic
drive#commentobject with the fields used bygenerateMeta(e.g.,id,content,createdTime), so it’s suitable as a sample event.
WHY
Resolves #19119
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.