Skip to content

Conversation

@ParidelPooya
Copy link
Contributor

@ParidelPooya ParidelPooya commented Dec 3, 2025

Fix bug where tolerance checks occurred after ALL_COMPLETED check,
causing incorrect completion reasons when failure thresholds were exceeded.

Changes:
- Fix logic ordering in getCompletionReason to check tolerance before completion
- Add failure-threshold-exceeded examples for map and parallel operations
- Create tests validating FAILURE_TOLERANCE_EXCEEDED behavior

…perations

Fix critical bug where tolerance checks occurred after ALL_COMPLETED check,
causing incorrect completion reasons when failure thresholds were exceeded.

Changes:
- Fix logic ordering in getCompletionReason to check tolerance before completion
- Optimize performance by passing failureCount parameter to eliminate redundant calculations
- Add failure-threshold-exceeded examples for map and parallel operations
- Create comprehensive tests validating FAILURE_TOLERANCE_EXCEEDED behavior
- Add context.wait({seconds:1}) after all map/parallel operations in examples

The fix ensures operations correctly return FAILURE_TOLERANCE_EXCEEDED when
thresholds are exceeded, even if all items have completed execution.
@ParidelPooya ParidelPooya force-pushed the feat/completion-config-examples branch from e04787b to ed7e29d Compare December 3, 2025 19:16
@ParidelPooya ParidelPooya changed the title feat(examples): add completion config examples for map and parallel fix(sdk): correct FAILURE_TOLERANCE_EXCEEDED behavior in concurrent operations Dec 3, 2025
…rcentage test

Add debug logging to capture execution status, history events, and result
payload to help diagnose integration test failure in cloud environment.
- Add maxAttempts: 1 to failing steps in parallel/map tolerated failure examples
- Prevents infinite retries that cause execution timeouts in cloud environment
- Allows proper completion evaluation when failure tolerance is met
- Use retryPresets.noRetry instead of invalid retry config object
- Import retryPresets from SDK to disable step retries properly
- Resolves TypeScript compilation errors
- Change '(count)' to 'count' and '(percentage)' to 'percentage'
- Fixes AWS Lambda function name validation error
- Parentheses are not allowed in Lambda function names
- Add retryPresets.noRetry to long-running step (20s sleep)
- Prevents test timeout when step fails and retries with default strategy
- Test was timing out at 30s due to retry delays on top of 20s sleep
- Change from 20s to 15s sleep to prevent cloud timeout
- Test was still timing out at 30s due to cloud environment overhead
- 15s + overhead should complete within 30s timeout
- Add retryPresets.noRetry to all 3 invoke operations in branch 2
- Prevents timeout when invokes fail and retry with default strategy
- Keep 20s sleep in branch 1 since invoke retries were the real issue
…pointing-invoke

- Increase test timeout from 30s to 60s to accommodate cloud environment overhead
- Update duration expectation from 15s to 25s to match 20s sleep + invoke time
- Local test now passes in 23s, should work in cloud with 60s timeout
- Added 4 comprehensive test cases for getCompletionReason function in replayItems method
- Tests cover toleratedFailureCount, toleratedFailurePercentage, and fail-fast scenarios
- Ensures complete coverage of FAILURE_TOLERANCE_EXCEEDED logic for both execution modes
Copy link
Contributor

@anthonyting anthonyting left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have some more assertions in the integ tests on what is returned from the result? Currently it just asserts the counts, but it's possible that the results aren't returned correctly.

…thod

- Eliminated code duplication by creating a single private getCompletionReason method
- Both replayItems and executeItemsConcurrently now use the same logic
- Reduced code size by ~80 lines while maintaining identical functionality
- All tests pass with improved coverage
…sertions

- Added detailed assertions for each branch result in parallel tolerated failure tests
- Added detailed assertions for each item result in map tolerated failure tests
- Verify specific success/failure patterns match expected behavior
- Ensure proper status, result values, and error handling for each iteration
- All tests pass with comprehensive validation of individual results
- Added detailed individual branch/item assertions for tolerated failure tests
- Enhanced min-successful tests with comprehensive validation of started vs completed items
- Verified all items are tracked (including STARTED state for incomplete items)
- All tests now pass with thorough validation of individual results
- Failure threshold tests kept simple due to complex error payload structure
@ParidelPooya
Copy link
Contributor Author

Can we have some more assertions in the integ tests on what is returned from the result? Currently it just asserts the counts, but it's possible that the results aren't returned correctly.

Added more assertions to tests

…rtions

- Added comprehensive individual result validation for map failure threshold exceeded count test
- Added comprehensive individual result validation for map failure threshold exceeded percentage test
- Verified proper status (SUCCEEDED/FAILED), index mapping, error types, and result values
- Fixed TypeScript type assertions for accessing nested result payload structure
- Tests now validate both overall completion reason and detailed individual item outcomes
…ld exceeded tests

- Replaced incorrect 'retry' property with proper 'retryStrategy' function
- Limited retry attempts to 2 to prevent test timeouts (was previously using default 5 attempts)
- Added 1-second delay between retry attempts for consistent timing
- Tests now complete within 60-second timeout while maintaining correct behavior validation
- Both map failure threshold exceeded count and percentage tests now pass reliably
Comment on lines +20 to +21
const historyEvents = execution.getHistoryEvents();
const mapContext = historyEvents.find(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better if we used the testing library getOperation().getContextDetails() instead of parsing the history results (same for other tests)

…reshold exceeded tests

- Reduced retry attempts from 2 to 1 (total attempts from 3 to 2) for faster test execution
- Tests still demonstrate failure threshold exceeded behavior but run significantly faster
- Note: Tests currently fail because map operation stops early when threshold exceeded,
  preventing successful items from completing - this is expected behavior that needs
  test expectation adjustment
- Replace custom retry strategies with retryPresets.noRetry in map failure threshold exceeded tests
- Eliminates unnecessary retry delays that were causing cloud integration test timeouts
- Maintains test behavior while significantly reducing execution time
- Replace retryPresets.noRetry with createRetryStrategy({ maxAttempts: 1 })
- Ensures proper retry strategy configuration for immediate failure
- Maintains fast execution while using the correct retry helper function
- Add wait operation after map in count test to match percentage test structure
- Increase test timeout to 15 seconds to accommodate retry delays with skipTime: false
- Both tests now pass successfully with proper FAILURE_TOLERANCE_EXCEEDED behavior
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants