Description
PySDK Version
- PySDK V2 (2.x)
- PySDK V3 (3.x)
Describe the bug
In v2, pipeline variables could be passed as environment variable values to an Estimator. For example, the environment variable "RANDOM_STATE" could be set from a pipeline variable. In v3 this no longer works with a ModelTrainer instance and fails with a validation error: ValidationError: 1 validation error for ModelTrainer. A similar issue exists in HyperparameterTuner, which does not accept a pipeline variable for random_seed and fails with ValidationError: 1 validation error for HyperParameterTuningJobConfig.
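The failure mode can be illustrated without the SDK. The sketch below is not SageMaker code: `PipelineParam` and `validate_environment` are hypothetical stand-ins, and it assumes the v3 field is validated as a mapping of plain strings, which would explain the error but is not confirmed here.

```python
from typing import Any, Dict


class PipelineParam:
    """Hypothetical stand-in for a pipeline variable resolved at execution time."""

    def __init__(self, name: str) -> None:
        self.name = name


def validate_environment(env: Dict[str, Any]) -> Dict[str, str]:
    """Strict check (assumed behaviour): every value must already be a str."""
    for key, value in env.items():
        if not isinstance(value, str):
            raise TypeError(
                f"environment[{key!r}] must be str, got {type(value).__name__}"
            )
    return env


# A literal value passes the strict check.
ok = validate_environment({"RANDOM_STATE": "42"})

# A placeholder has no concrete value at definition time, so it is rejected.
try:
    validate_environment({"RANDOM_STATE": PipelineParam("RandomState")})
    rejected = False
    message = ""
except TypeError as exc:
    rejected = True
    message = str(exc)
```

Under this assumption, any value that is only known at pipeline execution time fails validation when the object is constructed, which matches the symptom reported here.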
To reproduce
Create a ModelTrainer instance and pass a pipeline parameter into the environment argument dictionary.
Create a HyperparameterTuner instance and pass a pipeline parameter into the random_seed argument.
model = ModelTrainer(
    source_code=source_code,
    compute=compute,
    networking=self.networking,
    base_job_name=base_job_name,
    training_image=self.image_uris["train"],
    output_data_config=OutputDataConfig(
        s3_output_path=Join(
            on="/",
            values=[
                self.s3_uri_runtime,
                ExecutionVariables.PIPELINE_EXECUTION_ID,
                "02_mt_output",
            ],
        ),
        kms_key_id=self.aws_params["kms_key_hub"],
    ),
    stopping_condition=StoppingCondition(max_runtime_in_seconds=28800),
    role=self.aws_params["exec_role"],
    sagemaker_session=self.pipeline_session,
    environment={
        "RANDOM_STATE": self.pipeline_params["RandomState"].to_string(),  # <-- This line causes issues
        **self.default_env_vars,
    },
)

hyperparameter_tuner = HyperparameterTuner(
    model_trainer=model,
    base_tuning_job_name=base_job_name,
    metric_definitions=metric_definitions,
    objective_metric_name=self.hpt_params["objective_metric_name"],
    objective_type=self.hpt_params["objective_type"],
    hyperparameter_ranges=self.hpt_params["hyperparameter_ranges"],
    max_jobs=self.hpt_params["max_jobs"],
    strategy="Bayesian",
    max_parallel_jobs=4,
    random_seed=self.pipeline_params["RandomState"],  # <-- This line causes issues
    tags=self.tags,
)
Expected behavior
Pipeline variables should be able to affect environment variables, as well as the random_seed argument of the HyperparameterTuner.
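As a rough illustration of the requested behavior, a validator could accept either a concrete value or a deferred placeholder. This is a minimal sketch under stated assumptions, not the SDK's implementation: `PipelineParam`, `SeedLike`, and `validate_random_seed` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class PipelineParam:
    """Hypothetical stand-in for a pipeline variable; not the SDK class."""

    name: str


# Either a concrete seed or a placeholder resolved at execution time.
SeedLike = Union[int, PipelineParam]


def validate_random_seed(seed: SeedLike) -> SeedLike:
    """Accept a concrete int or a placeholder; reject anything else."""
    if isinstance(seed, PipelineParam):
        # Deferred value: pass it through untouched for later resolution.
        return seed
    if isinstance(seed, int) and not isinstance(seed, bool):
        return seed
    raise TypeError(
        f"random_seed must be int or PipelineParam, got {type(seed).__name__}"
    )


concrete = validate_random_seed(7)
deferred = validate_random_seed(PipelineParam("RandomState"))
```

The same pattern would apply to the values of the environment dictionary, with str in place of int.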
System information
- SageMaker Python SDK version: 3.5.0
Additional context
This blocks our migration from v2 to v3.