Skip to content
This repository was archived by the owner on Jul 18, 2024. It is now read-only.

Conversation

@xcnix
Copy link

@xcnix xcnix commented Aug 1, 2019

Add parameters (part_of_multisite) and some error codes(completion code during Creation of HyperSwap and multi-site volumes)


This change is Reviewable

@alonm
Copy link
Contributor

alonm commented Aug 1, 2019


pyxcli/errors.py, line 718 at r1 (raw file):

@CommandExecutionError.register(
    "TRNS_MGMT_STATUS_OBJECTS_IN_DIFFERENT_DOMAINS")
class StorageObjectIsNotInDomain(CommandFailedRuntimeError):

Why did you remove the StorageObjectIsNotInDomain exception? exception?

@alonm
Copy link
Contributor

alonm commented Aug 1, 2019


pyxcli/mirroring/recovery_manager.py, line 283 at r1 (raw file):

            mirror type must be 'sync' or 'async',
            slave_resource_name would be the slave_vol or slave_cg name'''
        if part_of_multisite == 'yes':

Instead of writing the same 10 lines twice, I suggest setting the existing kwargs, and after it - conditionally - add the new argument

@alonm
Copy link
Contributor

alonm commented Aug 1, 2019


pyxcli/errors.py, line 715 at r1 (raw file):

@CommandExecutionError.register("MULTISITE_STANDBY_RELATION_ALREADY_DEFINED")
class StandbyAlreadyDefined(CommandFailedRuntimeError):

This is a VERY long list of exceptions. I bet many of them do not even exist on the storage, and probably most of them are not needed. You can ask core/architecture guys for clarifications on what is needed (e.g. Daniel Aizer) .

Copy link
Contributor

@alonm alonm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r1.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @xcnix)

@ranhrl
Copy link
Collaborator

ranhrl commented Aug 4, 2019


pyxcli/errors.py, line 718 at r1 (raw file):

Previously, alonm wrote…

Why did you remove the StorageObjectIsNotInDomain exception? exception?

Good catch, please don't delete existing errors.

@ranhrl
Copy link
Collaborator

ranhrl commented Aug 4, 2019


pyxcli/mirroring/recovery_manager.py, line 283 at r1 (raw file):

Previously, alonm wrote…

Instead of writing the same 10 lines twice, I suggest setting the existing kwargs, and after it - conditionally - add the new argument

+1

@xcnix
Copy link
Author

xcnix commented Aug 9, 2019

pyxcli/errors.py, line 718 at r1 (raw file):

@CommandExecutionError.register(
    "TRNS_MGMT_STATUS_OBJECTS_IN_DIFFERENT_DOMAINS")
class StorageObjectIsNotInDomain(CommandFailedRuntimeError):

Why did you remove the StorageObjectIsNotInDomain exception? exception?
Delete it by mistake, already added it backed now.

@xcnix
Copy link
Author

xcnix commented Aug 9, 2019

pyxcli/mirroring/recovery_manager.py, line 283 at r1 (raw file):

            mirror type must be 'sync' or 'async',
            slave_resource_name would be the slave_vol or slave_cg name'''
        if part_of_multisite == 'yes':

Instead of writing the same 10 lines twice, I suggest setting the existing kwargs, and after it - conditionally - add the new argument

Thanks for the suggestion, I updated the code, please help to review again.

@xcnix
Copy link
Author

xcnix commented Aug 9, 2019

pyxcli/errors.py, line 715 at r1 (raw file):

@CommandExecutionError.register("MULTISITE_STANDBY_RELATION_ALREADY_DEFINED")
class StandbyAlreadyDefined(CommandFailedRuntimeError):

This is a VERY long list of exceptions. I bet many of them do not even exist on the storage, and probably most of them are not needed. You can ask core/architecture guys for clarifications on what is needed (e.g. Daniel Aizer) .

Hi Alon, I deleted some events which might not be used now and future, please help to review the change, Thanks.

@alonm
Copy link
Contributor

alonm commented Aug 9, 2019


pyxcli/errors.py, line 795 at r2 (raw file):

    "TRNS_MGMT_STATUS_OBJECTS_IN_DIFFERENT_DOMAINS")
class StorageObjectIsNotInDomain(CommandFailedRuntimeError):
    pass

StorageObjectIsNotInDomain appears twice.

@ranhrl
Copy link
Collaborator

ranhrl commented Aug 11, 2019


pyxcli/mirroring/cg_recovery_manager.py, line 108 at r2 (raw file):

                      slave_resource_name, rpo=None, remote_rpo=None,
                      schedule=None, remote_schedule=None,
                      activate_mirror='no', part_of_multisite='no'):

It looks like you are breaking backwards compatibility with this change. I would take a different approach by setting a default value of None. Then below, I would check, if the value is not None, I would add it to kwargs, instead of adding it and then removing it if the value is "no". Please fix accordingly.

@ranhrl
Copy link
Collaborator

ranhrl commented Aug 11, 2019


pyxcli/mirroring/recovery_manager.py, line 307 at r2 (raw file):

        if part_of_multisite == 'no':
            kwargs['part_of_multisite'] = None

Please see my comment above, only add to kwargs if the value is not None.

Copy link
Collaborator

@ranhrl ranhrl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r2.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @alonm and @xcnix)


pyxcli/mirroring/recovery_manager.py, line 295 at r2 (raw file):

            'schedule': schedule,
            'remote_schedule': remote_schedule,
            'part_of_multisite': part_of_multisite

Instead of adding it and then removing it,

@alonm
Copy link
Contributor

alonm commented Aug 11, 2019


pyxcli/mirroring/recovery_manager.py, line 295 at r2 (raw file):

Previously, ranhrl (Ran Harel) wrote…

Instead of adding it and then removing it,

+1. Nice catch

@xcnix
Copy link
Author

xcnix commented Aug 12, 2019

pyxcli/mirroring/cg_recovery_manager.py, line 108 at r2 (raw file):

                      slave_resource_name, rpo=None, remote_rpo=None,
                      schedule=None, remote_schedule=None,
                      activate_mirror='no', part_of_multisite='no'):

It looks like you are breaking backwards compatibility with this change. I would take a different approach by setting a default value of None. Then below, I would check, if the value is not None, I would add it to kwargs, instead of adding it and then removing it if the value is "no". Please fix accordingly.

Thanks, This is a much better method. I think original code will take out it if this value if None.

@xcnix
Copy link
Author

xcnix commented Aug 12, 2019

pyxcli/errors.py, line 795 at r2 (raw file):

    "TRNS_MGMT_STATUS_OBJECTS_IN_DIFFERENT_DOMAINS")
class StorageObjectIsNotInDomain(CommandFailedRuntimeError):
    pass

StorageObjectIsNotInDomain appears twice.

Thanks, already deleted dup class

Copy link
Collaborator

@ranhrl ranhrl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r3.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @alonm, @ranhrl, and @xcnix)

Copy link
Collaborator

@ranhrl ranhrl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @alonm and @xcnix)

@alonm
Copy link
Contributor

alonm commented Aug 13, 2019


pyxcli/mirroring/recovery_manager.py, line 295 at r3 (raw file):

            'schedule': schedule,
            'remote_schedule': remote_schedule,
            'part_of_multisite': part_of_multisite

did you check that this works with older storage systems that did not have this parameter? I think it will break...

h, p, ssl)

@classmethod
def connect(cls, hostname, port, timeout=5.0):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did the network get slower these days? You need a very good reason to change this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alonm, for HyperSwap/multisite creation and deletion scenario, commands take more than 5 seconds to respond. sometimes it takes 30s.

Can we change timeout to a configurable value, so third party module can change this value on demands ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an optional parameter. The third-party module can specify this value on demand - so there is no reason to change this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alonm, XIV cinder driver use this method to initialize xcli connection
try: LOG.debug('connect_multiendpoint_ssl with: %s', address) xcli = client.XCLIClient.connect_multiendpoint_ssl( user, clear_pass, address, ca_certs=path)

but this method does not support timeout parameter
def connect_multiendpoint_ssl(cls, user, password, endpoints, auto_discover=True, ca_certs=None, validate=None):

@alonm
Copy link
Contributor

alonm commented Aug 22, 2019


pyxcli/mirroring/recovery_manager.py, line 295 at r3 (raw file):

Previously, alonm wrote…

did you check that this works with older storage systems that did not have this parameter? I think it will break...

Did you check this with old storage systems? It is better to set the rpo value only if it's needed, instead of setting it to null.

@alonm
Copy link
Contributor

alonm commented Aug 22, 2019


pyxcli/transports.py, line 157 at r5 (raw file):

    @classmethod
    def connect(cls, hostname, port, timeout=15.0):

I'm afraid I wasn't clear on my previous comment. You should not increase the timeout - unless you have a VERY good reason to do so. If you have such a reason, please provide it.

Copy link
Contributor

@alonm alonm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r2.
Reviewable status: 3 of 5 files reviewed, 3 unresolved discussions (waiting on @ranhrl and @xcnix)

Copy link
Collaborator

@ranhrl ranhrl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r6.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @alonm and @xcnix)


pyxcli/transports.py, line 157 at r5 (raw file):

Previously, alonm wrote…

I'm afraid I wasn't clear on my previous comment. You should not increase the timeout - unless you have a VERY good reason to do so. If you have such a reason, please provide it.

+1

@xcnix
Copy link
Author

xcnix commented Jan 14, 2020

pyxcli/mirroring/recovery_manager.py, line 295 at r3 (raw file):

            'schedule': schedule,
            'remote_schedule': remote_schedule,
            'part_of_multisite': part_of_multisite

did you check that this works with older storage systems that did not have this parameter? I think it will break...
If value of this key is None, this parameter will be popped out.

@wangpww wangpww self-requested a review January 16, 2020 10:20
@igorgonibm
Copy link


pyxcli/errors.py, line 1433 at r6 (raw file):

class RanOutOfEndpointError(IOError, TransportError):

please add comment

@igorgonibm
Copy link


pyxcli/transports.py, line 157 at r5 (raw file):

Previously, ranhrl (Ran Harel) wrote…

+1

+1
Is it really justified to increase timeout?

@igorgonibm
Copy link

No need to add multisite support?
Or update logic for operations on volume/cg with 2 mirrorings/hyperswaps? For some there was added additional argument "target" (e.g. mirror_activate)

@xcnix
Copy link
Author

xcnix commented Feb 11, 2020

No need to add multisite support?
Or update logic for operations on volume/cg with 2 mirrorings/hyperswaps? For some there was added additional argument "target" (e.g. mirror_activate)

Multisite support is required.

@xcnix
Copy link
Author

xcnix commented Feb 11, 2020

justified

PowerVC team and China team have test new timeout value for several months.
with multisite topology, the network lantency and system internal latency is much higher then before. The tertiary site could be in a very long distance away.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants