Description
PySDK Version
- PySDK V2 (2.x)
- PySDK V3 (3.x)
Describe the bug
A hyperparameter tuning job does not pass environment variables to the training jobs it creates. The variables are defined correctly on the ModelTrainer, but none of them are set on the resulting training jobs. The HyperparameterTuner does not appear to propagate them.
To reproduce
For a TuningStep pipeline step, create a ModelTrainer and set environment variables through its environment argument. Then create a HyperparameterTuner and pass it the model_trainer object. Do all of this inside a pipeline session, so that .tune() returns the arguments for the TuningStep. In the resulting pipeline definition JSON, the "TrainingJobDefinition" key within the tuning step's "Arguments" contains no "Environment" key, so no environment variables are defined.
model = ModelTrainer(
    source_code=source_code,
    compute=compute,
    networking=self.networking,
    base_job_name=base_job_name,
    training_image=self.image_uris["train"],
    output_data_config=OutputDataConfig(
        s3_output_path=Join(
            on="/",
            values=[
                self.s3_uri_runtime,
                ExecutionVariables.PIPELINE_EXECUTION_ID,
                "02_mt_output",
            ],
        ),
        kms_key_id=self.aws_params["kms_key_hub"],
    ),
    stopping_condition=StoppingCondition(max_runtime_in_seconds=28800),
    role=self.aws_params["exec_role"],
    sagemaker_session=self.pipeline_session,
    environment={
        "RANDOM_STATE": "42",
        **self.default_env_vars,
    },
    # tags=self.tags_special,
)
hyperparameter_tuner = HyperparameterTuner(
    model_trainer=model,
    base_tuning_job_name=base_job_name,
    metric_definitions=metric_definitions,
    objective_metric_name=self.hpt_params["objective_metric_name"],
    objective_type=self.hpt_params["objective_type"],
    hyperparameter_ranges=self.hpt_params["hyperparameter_ranges"],
    max_jobs=self.hpt_params["max_jobs"],
    strategy="Bayesian",
    max_parallel_jobs=4,
    # random_seed=self.pipeline_params["RandomState"],  # this is another bug, for another issue.
    random_seed=42,
    tags=self.tags,
)
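The missing key can be checked programmatically once the pipeline definition is available as a JSON string. The following is a minimal sketch, not part of the SDK: the helper name tuning_steps_missing_environment and the sample definition are hypothetical, and it assumes that tuning steps are serialized with "Type": "Tuning" and that their arguments sit under "Arguments" as described above.

```python
import json


def tuning_steps_missing_environment(definition_json):
    """Return the names of tuning steps whose TrainingJobDefinition
    has no (or an empty) "Environment" key.

    Hypothetical helper for illustration; not part of the SageMaker SDK.
    """
    definition = json.loads(definition_json)
    missing = []
    for step in definition.get("Steps", []):
        # Assumption: tuning steps are serialized with "Type": "Tuning".
        if step.get("Type") != "Tuning":
            continue
        training_def = step.get("Arguments", {}).get("TrainingJobDefinition", {})
        # A missing key and an empty mapping are both treated as "not set".
        if not training_def.get("Environment"):
            missing.append(step.get("Name"))
    return missing


# Hand-written sample that mimics the reported output; not a real definition.
sample_definition = json.dumps(
    {
        "Steps": [
            {
                "Name": "TuneModel",
                "Type": "Tuning",
                "Arguments": {
                    "TrainingJobDefinition": {
                        "StoppingCondition": {"MaxRuntimeInSeconds": 28800}
                    }
                },
            },
            {
                "Name": "TuneWithEnv",
                "Type": "Tuning",
                "Arguments": {
                    "TrainingJobDefinition": {
                        "Environment": {"RANDOM_STATE": "42"}
                    }
                },
            },
        ]
    }
)

print(tuning_steps_missing_environment(sample_definition))
```

With a real pipeline, the string passed to the helper would be the pipeline definition JSON, for example the return value of the pipeline object's definition() method, assuming it behaves as in v2. With the bug present, the tuning step's name would appear in the returned list.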
Expected behavior
The environment variables passed as an argument to the ModelTrainer should be set in the training jobs created by a tuning job that uses this model trainer.
System information
A description of your system. Please provide:
- SageMaker Python SDK version: 3.5.0
Additional context
This is a roadblock for us regarding a migration from v2 to v3.