Commit 36d222f

Improve HPO API with attach()
1 parent e9d4e56 commit 36d222f

2 files changed: +5 -13 lines changed

FAQ.md

Lines changed: 3 additions & 12 deletions
@@ -396,15 +396,14 @@ See [#7](https://github.com/aws-samples/sagemaker-ssh-helper/issues/7) for this
 
 In this case, `wrapper.get_instance_ids()` won't really work because you don't call `fit()` directly on the estimator and SSH Helper does not understand what training job you are trying to connect to.
 
-You should use extra lower-level APIs to fetch the training job name of your interest first, and then either use `SSMManager` (recommended) or `SSHLog` (slower) to fetch their instance ids from the code:
+You should use the SageMaker Python SDK API to fetch the training job name of your interest first, and then use `SSHEstimatorWrapper.attach()` method to create a wrapper for fetching instance ids:
 
 ```python
 import time
 
 from sagemaker.mxnet import MXNet
 from sagemaker.tuner import HyperparameterTuner
 
-from sagemaker_ssh_helper.manager import SSMManager
 from sagemaker_ssh_helper.wrapper import SSHEstimatorWrapper
 
 estimator = MXNet(...)
@@ -430,7 +429,8 @@ analytics = tuner.analytics()
 training_jobs = analytics.training_job_summaries()
 training_job_name = training_jobs[0]['TrainingJobName']
 
-instance_ids = SSMManager().get_training_instance_ids(training_job_name, 300)
+ssh_wrapper = SSHEstimatorWrapper.attach(training_job_name)
+instance_ids = ssh_wrapper.get_instance_ids()
 
 print(f'To connect over SSM run: aws ssm start-session --target {instance_ids[0]}')
 ```
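
A note on the snippet above: it takes `training_jobs[0]`, which is not necessarily a job that is still running. As a minimal sketch, assuming each summary carries the `TrainingJobStatus` field that the SageMaker API returns for tuning job summaries, a hypothetical helper could pick an in-progress job before attaching. `pick_running_job_name` is not part of SSH Helper, and the job names below are made-up stub data:

```python
from typing import Any, Mapping, Optional, Sequence


def pick_running_job_name(summaries: Sequence[Mapping[str, Any]]) -> Optional[str]:
    """Return the name of the first job whose status is 'InProgress', or None."""
    for summary in summaries:
        # 'TrainingJobStatus' is assumed to be present in each summary dict.
        if summary.get('TrainingJobStatus') == 'InProgress':
            return summary['TrainingJobName']
    return None


# Stub data shaped like the output of `analytics.training_job_summaries()`.
summaries = [
    {'TrainingJobName': 'hpo-001', 'TrainingJobStatus': 'Completed'},
    {'TrainingJobName': 'hpo-002', 'TrainingJobStatus': 'InProgress'},
]
running_job_name = pick_running_job_name(summaries)
```

The returned name, when not `None`, could then be passed to `SSHEstimatorWrapper.attach()` in place of `training_jobs[0]['TrainingJobName']`.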
@@ -589,15 +589,6 @@ aws ssm start-session --target mi-01234567890abcdef \
 tail /var/log/amazon/ssm/*.log && date
 ```
 
-Note that the error messages related to `EC2Identity` are not relevant, because SageMaker is a managed service and users have no access to underlying EC2 infrastructure:
-
-```text
-2023-03-27 20:07:23 ERROR [EC2Identity] failed to get identity instance id. Error: RequestError: send request failed
-caused by: Get "http://169.254.169.254/latest/meta-data/instance-id": dial tcp 169.254.169.254:80: connect: invalid argument
-```
-
-These messages are kind of expected and can be safely ignored.
-
 * Check that `sshd` process is started in SageMaker Studio notebook by running a command in the image terminal:
 
 ```shell

tests/test_hpo.py

Lines changed: 2 additions & 1 deletion
@@ -95,7 +95,8 @@ def test_hpo_ssh():
 training_jobs = analytics.training_job_summaries()
 training_job_name = training_jobs[0]['TrainingJobName']
 
-instance_ids = SSMManager().get_training_instance_ids(training_job_name, 300)
+ssh_wrapper = SSHEstimatorWrapper.attach(training_job_name)
+instance_ids = ssh_wrapper.get_instance_ids()
 assert len(instance_ids) == 1
 
 tuner.wait()
