
Commit 9f8c943

[BACKPORT 2024.1][yugabyte#23367] CDCSDK: Cleanup expired and not of interest tables from CDC stream
Summary:

#### Backport Description
Major merge conflicts were encountered in cdc_service.cc in the PopulateTabletCheckPointInfo() method, because the refactoring present in master is not present in 2024.1.

#### Original Description
Currently, whenever a stream expires or a table becomes not of interest due to lack of polling on it, the entries corresponding to such tables are removed neither from the stream metadata nor from the state table. Even though the resources retained on tablets of such tables are released, the presence of a state table entry can potentially stop the physical deletion of a split tablet.

This diff adds a cleanup mechanism that removes the entries corresponding to expired / not of interest tables from the stream metadata as well as from the cdc_state table.

Update Peers and Metrics reads the state table periodically and checks for entries that have either reached stream expiry or have become not of interest. Currently this check is done for releasing the retained resources. With the changes introduced in this diff, it also finds the `{table_id, stream_id}` pairs which have expired or have become not of interest.

A new rpc `RemoveTablesFromCDCSDKStream` has been introduced. Its request takes a list of tables to be removed and the stream_id from which these tables need to be removed. For each pair it then calls `RemoveUserTablesFromCDCSDKStream`, which does the cleanup from the stream metadata and the cdc_state table. Generally, `RemoveTablesFromCDCSDKStream` will be called for a single `{table_id, stream_id}` pair; however, in the case of colocated tables, all the colocated tables which have become not of interest / expired for a stream will be processed in a single call. The determination of a table being not of interest is done on the basis of active_time.

In a scenario where a lot of tables are present in a database but the user is interested in capturing only a small subset of them using CDC, a large number of tables can become not of interest at the same time. In such a case, Update Peers and Metrics (which runs on each node) could storm the master with cleanup requests. To prevent this, the following throttling measures have been put in place:
- For each expired table, the node hosting the leader of the tablet with the lexicographically smallest tablet_id will send the cleanup request to master. This ensures that only one request is sent to the master per expired table.
- If colocated tables become not of interest (or get expired for a stream), then all the colocated tables on the tablet will be cleaned up in a single call.
- The flag `cdcsdk_max_expired_tables_to_clean_per_run` determines the maximum number of cleanup requests sent from each node per iteration of Update Peers and Metrics. The default value of this flag is 1.
- With default settings, the maximum number of cleanup requests sent to the master = **min(num of nodes, num of expired non-colocated tables + num of expired colocated tablets).**

A new kLocalPersisted tserver auto flag `cdcsdk_enable_cleanup_of_expired_table_entries` has been introduced with default value false. To enable this cleanup logic, this flag must be set to true.

**Upgrade/Rollback safety:**
This diff introduces a new rpc:
- RemoveUserTablesFromCDCSDKStream : RemoveUserTablesFromCDCSDKStreamRequestPB, RemoveUserTablesFromCDCSDKStreamResponsePB

All the fields required to populate the request of this rpc are already present at the caller. The flags `cdcsdk_enable_cleanup_of_expired_table_entries` and `cdcsdk_enable_dynamic_table_addition_with_table_cleanup` protect this new rpc.

Jira: DB-12291

Original commit: 0ea4f54 / D37450

Test Plan:
Jenkins: urgent
./yb_build.sh --cxx-test integration-tests_cdcsdk_ysql-test --gtest_filter CDCSDKYsqlTest.TestCleanupOfTableNotOfInterest
./yb_build.sh --cxx-test integration-tests_cdcsdk_ysql-test --gtest_filter CDCSDKYsqlTest.TestCleanupOfExpiredTable
./yb_build.sh --cxx-test integration-tests_cdcsdk_ysql-test --gtest_filter CDCSDKYsqlTest.TestCleanupOfUnpolledTableWithTabletSplit
./yb_build.sh --cxx-test integration-tests_cdcsdk_ysql-test --gtest_filter CDCSDKYsqlTest.TestSplitOfTabletNotOfInterestDuringCleanup
./yb_build.sh --cxx-test integration-tests_cdcsdk_ysql-test --gtest_filter CDCSDKYsqlTest.TestCleanupOfNotOfInterestColocatedTabletWithMultipleStreams

Reviewers: skumar, siddharth.shah, asrinivasan, xCluster, hsunder

Reviewed By: siddharth.shah

Subscribers: ycdcxcluster, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D37745
1 parent 4bcdb6c commit 9f8c943
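
The request bound quoted in the summary can be checked with a little arithmetic. The following is a minimal sketch, assuming the per-node cap `cdcsdk_max_expired_tables_to_clean_per_run` is left at its default of 1; the helper name `MaxCleanupRequestsPerRound` and its parameters are hypothetical and do not appear in the commit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the commit): upper bound on the number of
// cleanup requests that reach the master in one round of Update Peers and
// Metrics, assuming every node sends at most one request (the default cap).
int64_t MaxCleanupRequestsPerRound(
    int64_t num_nodes,
    int64_t expired_non_colocated_tables,
    int64_t expired_colocated_tablets) {
  assert(num_nodes >= 0);
  assert(expired_non_colocated_tables >= 0);
  assert(expired_colocated_tablets >= 0);
  // Each non-colocated table and each colocated tablet is one unit of work,
  // and each unit is sent by a single node.
  const int64_t units = expired_non_colocated_tables + expired_colocated_tablets;
  return std::min(num_nodes, units);
}
```

For example, a 3-node cluster with 10 expired non-colocated tables and 2 expired colocated tablets would send at most 3 requests per round under these assumptions, so the backlog drains over several rounds rather than all at once.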

14 files changed, +525 -13 lines changed

src/yb/cdc/cdc_service.cc

Lines changed: 129 additions & 3 deletions
@@ -189,6 +189,20 @@ DEFINE_test_flag(bool, cdcsdk_skip_stream_active_check, false,
     "When enabled, GetChanges will skip checking if stream is active as well as skip "
     "updating the active time.");

+DEFINE_RUNTIME_int32(
+    cdcsdk_max_expired_tables_to_clean_per_run, 1,
+    "This flag determines the maximum number of tables to be cleaned up per run of "
+    "UpdatePeersAndMetrics. Since a lot of tables can become not of interest at the same time, "
+    "this flag is used to prevent storming of cleanup requests to master. When the flag value is "
+    "1, the number of cleanup requests sent will be min(num_tables_to_cleanup, num_of_nodes)");
+
+DEFINE_RUNTIME_AUTO_bool(
+    cdcsdk_enable_cleanup_of_expired_table_entries, kLocalPersisted, false, true,
+    "When enabled, Update Peers and Metrics will look for entries in the state table that have "
+    "either become not of interest or have expired for a stream. The cleanup logic will then "
+    "update these entries in cdc_state table and also move the corresponing table's entry to "
+    "unqualified tables list in stream metadata.");
+
 DECLARE_bool(enable_log_retention_by_op_idx);

 DECLARE_int32(cdc_checkpoint_opid_interval_ms);
@@ -207,6 +221,8 @@ DEFINE_RUNTIME_bool(enable_cdcsdk_lag_collection, false,
     "When enabled, vlog containing the lag for the getchanges call as well as last commit record "
     "in response will be printed.");

+DECLARE_bool(cdcsdk_enable_dynamic_table_addition_with_table_cleanup);
+
 METRIC_DEFINE_entity(xcluster);

 METRIC_DEFINE_entity(cdcsdk);
@@ -2564,9 +2580,41 @@ CDCServiceImpl::GetNamespaceMinRecordIdCommitTimeMap(
   return namespace_to_min_record_id_commit_time;
 }

+void CDCServiceImpl::AddTableToExpiredTablesMap(
+    const TabletId& tablet_id, const xrepl::StreamId& stream_id,
+    TableIdToStreamIdMap* expired_tables_map) {
+  auto tablet_peer = context_->LookupTablet(tablet_id);
+  if (!tablet_peer) {
+    LOG(WARNING) << "Could not find tablet peer for tablet_id: " << tablet_id
+                 << ". Will not remove its expired entry in this round";
+    return;
+  }
+
+  auto table_ids = tablet_peer->tablet_metadata()->colocated()
+                       ? tablet_peer->tablet_metadata()->GetAllColocatedTables()
+                       : std::vector<TableId>{tablet_peer->tablet_metadata()->table_id()};
+
+  for (const auto& table_id : table_ids) {
+    auto it = expired_tables_map->find(table_id);
+    if (it != expired_tables_map->end()) {
+      // The cleanup request to master should be sent exactly once per {table, stream} pair. To
+      // ensure this, the node hosting the leader of the tablet with lexicographically smallest
+      // tablet_id will send the request.
+      if ((tablet_peer->tablet_id() <= it->second.first)) {
+        it->second.first = tablet_peer->tablet_id();
+        it->second.second.insert(stream_id);
+      }
+    } else {
+      expired_tables_map->emplace(
+          table_id,
+          std::make_pair(tablet_peer->tablet_id(), std::unordered_set<xrepl::StreamId>{stream_id}));
+    }
+  }
+}
+
 Result<TabletIdCDCCheckpointMap> CDCServiceImpl::PopulateTabletCheckPointInfo(
     const TabletId& input_tablet_id, TabletIdStreamIdSet* tablet_stream_to_be_deleted,
-    StreamIdSet* slot_entries_to_be_deleted) {
+    StreamIdSet* slot_entries_to_be_deleted, TableIdToStreamIdMap* expired_tables_map) {
   TabletIdCDCCheckpointMap tablet_min_checkpoint_map;
   std::unordered_set<xrepl::StreamId> refreshed_metadata_set;

@@ -2738,6 +2786,12 @@ Result<TabletIdCDCCheckpointMap> CDCServiceImpl::PopulateTabletCheckPointInfo(
       auto status = CheckTabletNotOfInterest(
           producer_tablet, last_active_time_cdc_state_table, true);
       if (!status.ok()) {
+        // If checkpoint is max, it indicates that cleanup is already in progress. No need to add
+        // such entries to the expired_tables_map.
+        if (expired_tables_map && checkpoint != OpId::Max()) {
+          AddTableToExpiredTablesMap(tablet_id, stream_id, expired_tables_map);
+        }
+
         if (!tablet_min_checkpoint_map.contains(tablet_id)) {
           VLOG(2) << "Stream: " << stream_id << ", is not of interest for tablet: " << tablet_id
                   << ", hence we are adding default entries to tablet_min_checkpoint_map";
@@ -2749,6 +2803,12 @@ Result<TabletIdCDCCheckpointMap> CDCServiceImpl::PopulateTabletCheckPointInfo(
       }
       status = CheckStreamActive(producer_tablet, last_active_time_cdc_state_table);
       if (!status.ok()) {
+        // If checkpoint is max, it indicates that cleanup is already in progress. No need to add
+        // such entries to the expired_tables_map.
+        if (expired_tables_map && checkpoint != OpId::Max()) {
+          AddTableToExpiredTablesMap(tablet_id, stream_id, expired_tables_map);
+        }
+
         // It is possible that all streams associated with a tablet have expired, in which case we
         // have to create a default entry in 'tablet_min_checkpoint_map' corresponding to the
         // tablet. This way the fact that all the streams have expired will be communicated to the
@@ -3084,8 +3144,9 @@ void CDCServiceImpl::UpdatePeersAndMetrics() {
       // if we fail to read cdc_state table, lets wait for the next retry after 60 secs.
       TabletIdStreamIdSet cdc_state_entries_to_delete;
       StreamIdSet slot_entries_to_be_deleted;
-      auto result =
-          PopulateTabletCheckPointInfo("", &cdc_state_entries_to_delete, &slot_entries_to_be_deleted);
+      TableIdToStreamIdMap expired_tables_map;
+      auto result = PopulateTabletCheckPointInfo(
+          "", &cdc_state_entries_to_delete, &slot_entries_to_be_deleted, &expired_tables_map);
       if (!result.ok()) {
         LOG(WARNING) << "Failed to populate tablets checkpoint info: " << result.status();
         continue;
@@ -3116,6 +3177,13 @@ void CDCServiceImpl::UpdatePeersAndMetrics() {
               cdc_state_entries_to_delete, failed_tablet_ids, slot_entries_to_be_deleted),
           "Unable to cleanup CDC State table metadata");

+      if (GetAtomicFlag(&FLAGS_cdcsdk_enable_cleanup_of_expired_table_entries) &&
+          GetAtomicFlag(&FLAGS_cdcsdk_enable_dynamic_table_addition_with_table_cleanup)) {
+        WARN_NOT_OK(
+            CleanupExpiredTables(expired_tables_map),
+            "Failed to remove an expired table entry from stream");
+      }
+
       rate_limiter_->SetBytesPerSecond(
           GetAtomicFlag(&FLAGS_xcluster_get_changes_max_send_rate_mbps) * 1_MB);

@@ -3165,6 +3233,64 @@ Status CDCServiceImpl::DeleteCDCStateTableMetadata(
   return Status::OK();
 }

+Status CDCServiceImpl::CleanupExpiredTables(const TableIdToStreamIdMap& expired_tables_map) {
+  if (expired_tables_map.empty()) {
+    return Status::OK();
+  }
+
+  int num_cleanup_requests = 0;
+  for (const auto& entry : expired_tables_map) {
+    const auto& table_id = entry.first;
+    const auto& tablet_id = entry.second.first;
+    const auto& streams = entry.second.second;
+
+    auto tablet_peer = context_->LookupTablet(tablet_id);
+    if (!tablet_peer) {
+      LOG(WARNING) << "Could not find tablet peer for tablet_id: " << tablet_id
+                   << ", for table: " << table_id
+                   << ". Will not remove its entries from state table in this round.";
+      continue;
+    }
+
+    if (tablet_peer->IsNotLeader()) {
+      continue;
+    }
+
+    for (const auto& stream_id : streams) {
+      if (num_cleanup_requests >=
+          GetAtomicFlag(&FLAGS_cdcsdk_max_expired_tables_to_clean_per_run)) {
+        return Status::OK();
+      }
+
+      auto colocated = tablet_peer->tablet_metadata()->colocated();
+
+      auto table_ids = colocated
+                           ? tablet_peer->tablet_metadata()->GetAllColocatedTables()
+                           : std::vector<TableId>{table_id};
+
+      if (colocated) {
+        for (auto it = table_ids.begin(); it != table_ids.end();) {
+          if (boost::ends_with(*it, kColocatedDbParentTableIdSuffix) ||
+              boost::ends_with(*it, kTablegroupParentTableIdSuffix) ||
+              boost::ends_with(*it, kColocationParentTableIdSuffix)) {
+            it = table_ids.erase(it);
+          } else {
+            ++it;
+          }
+        }
+      }
+
+      auto status = client()->RemoveTablesFromCDCSDKStream(table_ids, stream_id);
+      if (!status.ok()) {
+        LOG(WARNING) << "Failed to remove table: " << table_id << " from stream: " << stream_id
+                     << " : " << status;
+      }
+      num_cleanup_requests++;
+    }
+  }
+  return Status::OK();
+}
+
 Result<client::internal::RemoteTabletPtr> CDCServiceImpl::GetRemoteTablet(
     const TabletId& tablet_id, const bool use_cache) {
   std::promise<Result<client::internal::RemoteTabletPtr>> tablet_lookup_promise;
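
The parent-table filtering inside `CleanupExpiredTables` can be tried in isolation. The sketch below is a standalone approximation, not the commit's code: it uses plain `std::string` ids, a hand-written `EndsWith` in place of `boost::ends_with`, and suffix strings that are placeholders assumed for illustration (the real values live behind `kColocatedDbParentTableIdSuffix`, `kTablegroupParentTableIdSuffix` and `kColocationParentTableIdSuffix`). It uses the erase-remove idiom, which should behave the same as the iterator loop in the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Placeholder suffixes, assumed for illustration only.
const std::vector<std::string> kAssumedParentSuffixes = {
    ".colocated.parent.uuid", ".tablegroup.parent.uuid", ".colocation.parent.uuid"};

// Returns true if `s` ends with `suffix`.
bool EndsWith(const std::string& s, const std::string& suffix) {
  return s.size() >= suffix.size() &&
         s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical helper: returns the ids that do not name a parent table, so that
// only user tables would be sent in the cleanup request.
std::vector<std::string> DropParentTables(std::vector<std::string> table_ids) {
  table_ids.erase(
      std::remove_if(
          table_ids.begin(), table_ids.end(),
          [](const std::string& id) {
            return std::any_of(
                kAssumedParentSuffixes.begin(), kAssumedParentSuffixes.end(),
                [&id](const std::string& suffix) { return EndsWith(id, suffix); });
          }),
      table_ids.end());
  return table_ids;
}
```

Order of the remaining ids is preserved, which matches the in-place erase loop in the hunk above.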

src/yb/cdc/cdc_service.h

Lines changed: 13 additions & 1 deletion
@@ -107,6 +107,8 @@ struct TabletCDCCheckpointInfo {
 using TabletIdCDCCheckpointMap = std::unordered_map<TabletId, TabletCDCCheckpointInfo>;
 using TabletIdStreamIdSet = std::set<std::pair<TabletId, xrepl::StreamId>>;
 using StreamIdSet = std::set<xrepl::StreamId>;
+using TableIdToStreamIdMap =
+    std::unordered_map<TableId, std::pair<TabletId, std::unordered_set<xrepl::StreamId>>>;
 using RollBackTabletIdCheckpointMap =
     std::unordered_map<const std::string*, std::pair<int64_t, OpId>>;
 class CDCServiceImpl : public CDCServiceIf {
@@ -422,6 +424,10 @@ class CDCServiceImpl : public CDCServiceIf {
       const std::unordered_set<TabletId>& failed_tablet_ids,
       const StreamIdSet& slot_entries_to_be_deleted);

+  // This method sends an rpc to the master to remove the expired / not of interest tables from the
+  // stream metadata and update the checkpoint of cdc_state entries to max.
+  Status CleanupExpiredTables(const TableIdToStreamIdMap& expired_tables_map);
+
   MicrosTime GetLastReplicatedTime(const std::shared_ptr<tablet::TabletPeer>& tablet_peer);

   bool ShouldUpdateMetrics(MonoTime time_since_update_metrics);
@@ -454,7 +460,13 @@ class CDCServiceImpl : public CDCServiceIf {
   Result<TabletIdCDCCheckpointMap> PopulateTabletCheckPointInfo(
       const TabletId& input_tablet_id = "",
       TabletIdStreamIdSet* tablet_stream_to_be_deleted = nullptr,
-      StreamIdSet* slot_entries_to_be_deleted = nullptr);
+      StreamIdSet* slot_entries_to_be_deleted = nullptr,
+      TableIdToStreamIdMap* expired_tables_map = nullptr);
+
+  void AddTableToExpiredTablesMap(
+      const TabletId& tablet_id,
+      const xrepl::StreamId& stream_id,
+      TableIdToStreamIdMap* expired_tables_map);

   Status SetInitialCheckPoint(
       const OpId& checkpoint, const std::string& tablet_id,
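
To see how `TableIdToStreamIdMap` is meant to be filled, the update rule from `AddTableToExpiredTablesMap` can be reproduced with standard types only. This is a minimal sketch under stated assumptions: `TableId`, `TabletId` and `StreamId` are plain strings here (the real `xrepl::StreamId` is a distinct type), the tablet peer lookup and colocation handling are left out, and the name `AddToExpiredTablesMap` is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>

using TableId = std::string;
using TabletId = std::string;
using StreamId = std::string;  // Stand-in for xrepl::StreamId.

// Same shape as TableIdToStreamIdMap in the diff:
// table -> (smallest tablet id seen so far, streams to clean up).
using ExpiredTablesMap =
    std::unordered_map<TableId, std::pair<TabletId, std::unordered_set<StreamId>>>;

// Mirrors the update rule shown in the diff for a single table id.
void AddToExpiredTablesMap(
    const TableId& table_id, const TabletId& tablet_id, const StreamId& stream_id,
    ExpiredTablesMap* expired_tables_map) {
  assert(expired_tables_map != nullptr);
  auto it = expired_tables_map->find(table_id);
  if (it == expired_tables_map->end()) {
    // First sighting of this table: record the tablet and the stream.
    expired_tables_map->emplace(
        table_id, std::make_pair(tablet_id, std::unordered_set<StreamId>{stream_id}));
    return;
  }
  // Only a tablet id that is lexicographically <= the recorded one updates the entry.
  if (tablet_id <= it->second.first) {
    it->second.first = tablet_id;
    it->second.second.insert(stream_id);
  }
}
```

As written in the diff, and mirrored here, a stream that is first seen on a tablet with a larger id than the recorded one is not added to the set in that pass.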

src/yb/client/client-internal.cc

Lines changed: 1 addition & 0 deletions
@@ -304,6 +304,7 @@ YB_CLIENT_SPECIALIZE_SIMPLE_EX(Replication, GetCDCDBStreamInfo);
 YB_CLIENT_SPECIALIZE_SIMPLE_EX(Replication, GetCDCStream);
 YB_CLIENT_SPECIALIZE_SIMPLE_EX(Replication, ListCDCStreams);
 YB_CLIENT_SPECIALIZE_SIMPLE_EX(Replication, UpdateCDCStream);
+YB_CLIENT_SPECIALIZE_SIMPLE_EX(Replication, RemoveTablesFromCDCSDKStream);
 YB_CLIENT_SPECIALIZE_SIMPLE_EX(Replication, IsObjectPartOfXRepl);
 YB_CLIENT_SPECIALIZE_SIMPLE_EX(Replication, IsBootstrapRequired);
 YB_CLIENT_SPECIALIZE_SIMPLE_EX(Replication, GetUDTypeMetadata);

src/yb/client/client.cc

Lines changed: 22 additions & 0 deletions
@@ -1783,6 +1783,28 @@ Status YBClient::UpdateCDCStream(
   return Status::OK();
 }

+Status YBClient::RemoveTablesFromCDCSDKStream(
+    const std::vector<TableId>& table_ids,
+    const xrepl::StreamId stream_id) {
+  if (table_ids.empty()) {
+    return STATUS(InvalidArgument, "Table ID should not be empty");
+  }
+  if (!stream_id) {
+    return STATUS(InvalidArgument, "Stream ID should not be empty");
+  }
+
+  master::RemoveTablesFromCDCSDKStreamRequestPB req;
+  master::RemoveTablesFromCDCSDKStreamResponsePB resp;
+  req.set_stream_id(stream_id.ToString());
+  req.mutable_table_ids()->Reserve(narrow_cast<int>(table_ids.size()));
+  for (const auto& table_id : table_ids) {
+    req.add_table_ids(table_id);
+  }
+
+  CALL_SYNC_LEADER_MASTER_RPC_EX(Replication, req, resp, RemoveTablesFromCDCSDKStream);
+  return Status::OK();
+}
+
 Result<bool> YBClient::IsObjectPartOfXRepl(const TableId& table_id) {
   IsObjectPartOfXReplRequestPB req;
   IsObjectPartOfXReplResponsePB resp;
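
The argument checks and request assembly in `YBClient::RemoveTablesFromCDCSDKStream` follow a validate-then-populate pattern that can be sketched without protobuf or the RPC layer. In the sketch below, `RemoveTablesRequest` is a hypothetical plain struct standing in for `master::RemoveTablesFromCDCSDKStreamRequestPB`, the stream id is a plain string, and `std::nullopt` stands in for the `InvalidArgument` status; none of these names come from the commit.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the request protobuf.
struct RemoveTablesRequest {
  std::string stream_id;
  std::vector<std::string> table_ids;
};

// Builds the request, or returns std::nullopt where the real client would
// return an InvalidArgument status (empty table list or empty stream id).
std::optional<RemoveTablesRequest> BuildRemoveTablesRequest(
    const std::vector<std::string>& table_ids, const std::string& stream_id) {
  if (table_ids.empty() || stream_id.empty()) {
    return std::nullopt;
  }
  RemoveTablesRequest req;
  req.stream_id = stream_id;
  // Reserve up front, as the diff does with mutable_table_ids()->Reserve().
  req.table_ids.reserve(table_ids.size());
  for (const auto& table_id : table_ids) {
    req.table_ids.push_back(table_id);
  }
  return req;
}
```

Rejecting bad input before any request object is built keeps the master from ever seeing a malformed cleanup request, which appears to be the intent of the two early returns in the diff.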

src/yb/client/client.h

Lines changed: 4 additions & 0 deletions
@@ -677,6 +677,10 @@ class YBClient {
       const std::vector<xrepl::StreamId>& stream_ids,
       const std::vector<master::SysCDCStreamEntryPB>& new_entries);

+  Status RemoveTablesFromCDCSDKStream(
+      const std::vector<TableId>& table_id,
+      const xrepl::StreamId stream_id);
+
   Result<bool> IsObjectPartOfXRepl(const TableId& table_id);

   Result<bool> IsBootstrapRequired(

src/yb/common/common_flags.cc

Lines changed: 9 additions & 0 deletions
@@ -178,6 +178,15 @@ DEFINE_RUNTIME_AUTO_bool(enable_xcluster_auto_flag_validation, kLocalPersisted,
 DEFINE_RUNTIME_AUTO_PG_FLAG(bool, yb_enable_ddl_atomicity_infra, kLocalPersisted, false, true,
     "Enables YSQL DDL atomicity");

+DEFINE_RUNTIME_AUTO_bool(cdcsdk_enable_dynamic_table_addition_with_table_cleanup,
+                        kLocalPersisted,
+                        false,
+                        true,
+                        "This flag needs to be true in order to support addition of dynamic tables "
+                        "along with removal of not of interest/expired tables from a CDCSDK "
+                        "stream.");
+TAG_FLAG(cdcsdk_enable_dynamic_table_addition_with_table_cleanup, advanced);
+
 namespace yb {

 void InitCommonFlags() {

src/yb/integration-tests/cdcsdk_consistent_snapshot-test.cc

Lines changed: 3 additions & 0 deletions
@@ -1383,6 +1383,9 @@ TEST_F(CDCSDKConsistentSnapshotTest, TestConsistentSnapshotAcrossMultipleTables)
 TEST_F(CDCSDKConsistentSnapshotTest, TestReleaseResourcesOnUnpolledTablets) {
   ANNOTATE_UNPROTECTED_WRITE(FLAGS_update_min_cdc_indices_interval_secs) = 1;
   ANNOTATE_UNPROTECTED_WRITE(FLAGS_cdcsdk_tablet_not_of_interest_timeout_secs) = 3;
+  // Since the test requires the state table entries to verify the release of resources, we disable
+  // the cleanup of not of interest tables for this test.
+  ANNOTATE_UNPROTECTED_WRITE(FLAGS_cdcsdk_enable_cleanup_of_expired_table_entries) = false;
   ASSERT_OK(SetUpWithParams(1, 1, false));

   auto conn = ASSERT_RESULT(test_cluster_.ConnectToDB(kNamespaceName));
