Description
Environment
- Dynamoid: 3.9.0
- Ruby: 3.3.3
- Rails: 6.1.7.3
- RSpec: 3.13.1 / rspec-rails 6.1.5
- parallel_tests: 3.11.1
- Testing: RSpec + parallel_test gem
- DynamoDB: Local instance on port 6000 (amazon/dynamodb-local:2.0.0)
The Problem
We're seeing intermittent `ResourceNotFoundException: Cannot do operations on a non-existent table` errors when running tests in parallel. Interestingly, the error disappears when we reduce parallel_test concurrency to 1, so it seems related to parallel execution.
We've implemented table namespacing per worker and reset logic, but still hit this randomly. Currently, we're just skipping the flaky tests, but we would love to actually fix this.
Our Setup
DynamoDB CI Container Configuration:
```yaml
# .github/workflows/rspec.yml
dynamodb:
  image: amazon/dynamodb-local:2.0.0
  options: >-
    --health-cmd "curl http://localhost:8000"
    --health-interval 10s
    --health-timeout 5s
    --health-retries 5
  ports:
    # Expose 6000 for Dynamo. Questionnaires conflicts on 8000.
    - 127.0.0.1:6000:8000
```

Namespacing per parallel worker:
```ruby
# config/initializers/dynamoid.rb
Dynamoid.configure do |config|
  # ...
  config.endpoint = Rails.configuration.dynamodb_url
  config.namespace = if ENV["TEST_ENV_NUMBER"]
    "#{Rails.configuration.dynamodb_namespace}_#{ENV["TEST_ENV_NUMBER"]}"
  else
    Rails.configuration.dynamodb_namespace
  end
end
```

Reset before each test:
```ruby
# spec/support/dynamoid_reset.rb
RSpec.configure do |config|
  config.before(:each, dynamodb: true) do
    SecureframeDev::DynamoidTasks.reset if VCR.turned_on?
  end
end
```

Reset implementation:
```ruby
# lib/secureframe_dev/dynamoid_tasks.rb
def self.drop_tables
  unless Rails.env.production?
    Dynamoid.adapter.list_tables.each do |table|
      # Only delete tables in our namespace
      if /^#{Dynamoid::Config.namespace}/.match?(table)
        begin
          Dynamoid.adapter.delete_table(table)
        rescue Aws::DynamoDB::Errors::ResourceNotFoundException
          # rubocop:disable Rails/EnvironmentVariableAccess
          puts <<~DEBUG
            Failed deleting '#{table}': table not found...
            table: '#{table}'
            namespace: '#{Dynamoid::Config.namespace}'
            test num: '#{ENV["TEST_ENV_NUMBER"]}'
            endpoint: '#{Dynamoid::Config.endpoint}'
            caller:
            #{caller.map { |l| " #{l}" }.join("\n")}
            skipping dynamodb table deletion...
          DEBUG
          # rubocop:enable Rails/EnvironmentVariableAccess
        end
      end
    end
    Dynamoid.adapter.tables.clear
  end
end

def self.init
  # ...
  load_models
  results = create_tables
  results[:created].each do |table_name|
    wait_for_table_readiness(table_name) # Waits up to 5 seconds
  end
end

def self.reset
  # ...
  drop_tables
  init
end
```

Mitigations in Place
- Namespacing tables per worker (`TEST_ENV_NUMBER`)
- Waiting for table readiness after creation
- Graceful error handling during drops
- Clearing adapter cache with `Dynamoid.adapter.tables.clear`
- Ignoring DynamoDB requests in VCR
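For context on the second mitigation, the readiness wait behaves roughly like this. This is a hedged sketch, not our exact implementation: `wait_for_table_readiness` is our helper name, and the `client` parameter here is anything responding to `#describe_table` (the AWS SDK client in real code, a stub in tests):

```ruby
# Sketch: poll DescribeTable until TableStatus reports "ACTIVE",
# giving up after a small timeout (~5s in our real helper).
def wait_for_table_readiness(client, table_name, timeout: 5, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    status = client.describe_table(table_name: table_name).table.table_status
    return true if status == "ACTIVE"
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
      raise "#{table_name} not ACTIVE within #{timeout}s (last status: #{status})"
    end
    sleep interval
  end
end
```

Note this only checks for `ACTIVE`; it does not handle the window where a just-created table isn't visible to `DescribeTable` yet, which may be part of the problem.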
Key observation
Works perfectly with `parallel_test -n 1` but fails randomly at higher concurrency. This suggests one (or more) of:
- Race conditions in our reset logic
- DynamoDB Local not handling concurrent operations consistently
- Something in Dynamoid's adapter or our configuration not being namespaced
- Timing issues between delete/create cycles
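On the last hypothesis: on real DynamoDB, `DeleteTable` is asynchronous and the table lingers in `DELETING` state for a while; if Local behaves similarly under concurrency, `reset` could recreate or query a table mid-deletion. A guard we could add between `drop_tables` and `init` might look like this sketch (`wait_for_table_deletion` is our hypothetical name, not a Dynamoid API; the `not_found:` parameter defaults to the AWS SDK error class but is injectable):

```ruby
# Sketch: after DeleteTable, block until DescribeTable raises
# ResourceNotFoundException, i.e. until deletion has actually completed.
def wait_for_table_deletion(client, table_name,
                            not_found: Aws::DynamoDB::Errors::ResourceNotFoundException,
                            timeout: 5, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    begin
      client.describe_table(table_name: table_name)
    rescue not_found
      return true # DescribeTable now 404s: deletion has completed
    end
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
      raise "#{table_name} still visible #{timeout}s after delete"
    end
    sleep interval
  end
end
```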
Questions
- Are there known issues with parallel table operations against local DynamoDB?
- Should we be doing additional checks beyond table existence in `wait_for_table_readiness`?
- Any gotchas with `Dynamoid.adapter.tables.clear` in parallel environments?
- Better patterns for parallel test isolation?
The fact that it works fine single-threaded makes me think this is solvable - just not sure where the race condition is happening.
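One pattern we're considering for the last question (not yet implemented): stop dropping tables at all. Create each worker's tables once at boot, then clear rows between tests, which removes the delete/create window entirely. A minimal sketch, where `truncate_models` is a hypothetical helper and `models` is the list of Dynamoid document classes (each class responds to `.all`, and records respond to `#delete`):

```ruby
# Sketch of a drop-free reset: keep tables for the worker's lifetime
# and delete every item instead. Fine for tiny test tables; a full
# scan per table would be too slow for anything large.
def truncate_models(models)
  models.sum do |model|
    records = model.all.to_a # full scan of the worker's namespaced table
    records.each(&:delete)   # per-item delete
    records.size             # total rows removed, handy for logging
  end
end
```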