restore script logic issue #20

@116davinder

The function below returns only a count of objects; later we loop over a range of this count to fetch the backup files.

    def s3_count_partitions(s3_client,bucket,topic):
        """It will return number of objects in a given s3 bucket and s3 bucket path."""

        try:
            return s3_client.list_objects_v2(
                Bucket=bucket,
                Prefix=topic + "/",
                Delimiter='/'
            )['KeyCount']
        except NoCredentialsError as e:
            logging.error(e)
            exit(1)

Edge case:
Initial partition count: 10
The backup script copies only partitions 0, 1, 4, 6, 7, 8
The restore script (i.e. the function above) returns a count of 6

The loop for p in range(_pc): in the code below then considers only partitions 0, 1, 2, 3, 4, 5. Some of these (2, 3, 5) don't exist in S3, so it keeps failing for them, while partitions 6, 7, 8 are never restored.

    def s3_download(bucket,topic,tmp_dir,retry_download_seconds=60):
        s3_client = boto3.client('s3')
        while True:
            _pc = Download.s3_count_partitions(s3_client,bucket,topic)
            # create temp. topic directory
            for p in range(_pc):
                os.makedirs(os.path.join(tmp_dir,topic,str(p)),exist_ok=True)

            for _pt in range(_pc):

                _ck = checkpoint.read_checkpoint_partition(tmp_dir,topic,str(_pt))
                _partition_path = os.path.join(topic,str(_pt))
                _s3_partition_files = Download.s3_list_files(s3_client,bucket,_partition_path)
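
The mismatch in the edge case can be reproduced without S3. This is a minimal sketch that assumes the count returned for the topic equals the number of partition prefixes present (6 here); the variable names are illustrative and not taken from the restore script.

```python
# Partitions that the backup script actually copied to S3 (from the edge case).
backed_up = [0, 1, 4, 6, 7, 8]

# Assumption: the count-based function reports one entry per partition prefix.
count = len(backed_up)

# The restore loop iterates over range(count), not over the real partition numbers.
attempted = list(range(count))

# Partitions the loop tries to restore although nothing exists for them in S3.
missing_in_s3 = sorted(set(attempted) - set(backed_up))

# Partitions that exist in S3 but that the loop never reaches.
never_restored = sorted(set(backed_up) - set(attempted))

print("attempted:", attempted)
print("missing in S3:", missing_in_s3)
print("never restored:", never_restored)
```

Under that assumption the loop wastes work on partitions 2, 3 and 5 and silently skips 6, 7 and 8.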

Error

davinderpal@DESKTOP-07TAJVL:~/projects/apache-kafka-backup-and-restore$ python3 restore.py example-jsons/restore-s3.json
{ "@timestamp": "2022-10-17 23:30:34,750","level": "INFO","thread": "Kafka Restore Thread","name": "root","message": "retry for more files in /tmp/davinder.test after 100" }
{ "@timestamp": "2022-10-17 23:30:34,853","level": "INFO","thread": "MainThread","name": "root","message": "Test messeage" }
{ "@timestamp": "2022-10-17 23:30:34,861","level": "INFO","thread": "MainThread","name": "botocore.credentials","message": "Found credentials in environment variables." }
{ "@timestamp": "2022-10-17 23:30:34,909","level": "WARNING","thread": "MainThread","name": "root","message": "[Errno 2] No such file or directory: '/tmp/davinder.test/0/checkpoint'" }
{ "@timestamp": "2022-10-17 23:30:34,931","level": "WARNING","thread": "MainThread","name": "root","message": "[Errno 2] No such file or directory: '/tmp/davinder.test/2/checkpoint'" }
{ "@timestamp": "2022-10-17 23:30:34,938","level": "WARNING","thread": "MainThread","name": "root","message": "[Errno 2] No such file or directory: '/tmp/davinder.test/3/checkpoint'" }
{ "@timestamp": "2022-10-17 23:30:34,946","level": "WARNING","thread": "MainThread","name": "root","message": "[Errno 2] No such file or directory: '/tmp/davinder.test/4/checkpoint'" }
{ "@timestamp": "2022-10-17 23:30:34,985","level": "INFO","thread": "MainThread","name": "root","message": "download success for /tmp/davinder.test/4/20221017-230228.tar.gz and its sha256 file " }
{ "@timestamp": "2022-10-17 23:30:34,985","level": "WARNING","thread": "MainThread","name": "root","message": "[Errno 2] No such file or directory: '/tmp/davinder.test/5/checkpoint'" }
{ "@timestamp": "2022-10-17 23:30:34,993","level": "INFO","thread": "MainThread","name": "root","message": "retry for new file after 100s in s3://kafka-backup/davinder.test" }

Potential solution:
Instead of returning the count of partitions, the function can return the actual partition numbers, extracted from the list_objects_v2 response with a regex or the split method.
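
One way this could look is sketched below. It assumes backups are stored under topic/<partition>/ (as the os.path.join(topic, str(_pt)) call in the restore loop suggests) and reads the partition numbers from the CommonPrefixes entries that list_objects_v2 returns when a delimiter is given. The function name s3_list_partitions is hypothetical, and a paginator is used so that topics with many prefixes are not truncated to a single response page.

```python
def s3_list_partitions(s3_client, bucket, topic):
    """Return the sorted partition numbers that exist under topic/ in the bucket.

    Hypothetical replacement for the count-based function; s3_client is
    expected to be a boto3 S3 client (or anything with the same interface).
    """
    prefix = topic + "/"
    paginator = s3_client.get_paginator("list_objects_v2")
    partitions = set()
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter="/"):
        # With a delimiter, each "directory" below the prefix is reported in
        # CommonPrefixes, e.g. {"Prefix": "davinder.test/4/"}.
        for entry in page.get("CommonPrefixes", []):
            name = entry["Prefix"][len(prefix):].rstrip("/")
            # Ignore anything that is not a plain partition number.
            if name.isdecimal():
                partitions.add(int(name))
    return sorted(partitions)
```

The restore loop could then iterate over the returned list (for _pt in s3_list_partitions(s3_client, bucket, topic)) instead of range(_pc), so only partitions that were actually backed up are created locally and downloaded. Error handling such as the NoCredentialsError branch of the original function is left out of the sketch.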

Labels: bug, enhancement