Description
Describe the feature you'd like
The sagemaker-mlflow plugin (PyPI) handles authentication to SageMaker MLflow Apps via AWS Signature Version 4 (SigV4) and allows connecting to a tracking server using its ARN.
However, it currently operates as a separate, independently installed package that must be explicitly configured alongside the SageMaker Python SDK v3.
This creates friction in the most common SageMaker ML workflows (model training, experiment tracking, model registry, and model deployment), where users must:
- Separately install sagemaker-mlflow in addition to the SageMaker Python SDK v3.
- Manage SigV4 authentication independently from the SageMaker SDK session and execution role.
- Manually call mlflow.set_tracking_uri(arn) with the tracking server ARN before any MLflow logging.
- Repeat this boilerplate setup across training scripts, Processing Jobs, Pipelines steps, and deployment notebooks.
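For reference, the manual step described in the list above looks roughly like the sketch below. The helper names and the ARN pattern are illustrative assumptions (the pattern is inferred from the example ARN later in this issue, not from an official specification); only mlflow.set_tracking_uri is an existing MLflow call, and it needs both mlflow and sagemaker-mlflow installed.

```python
import re

# Assumed ARN shape, inferred from the example ARN in this issue.
ARN_PATTERN = re.compile(
    r"^arn:aws[a-z-]*:sagemaker:(?P<region>[a-z0-9-]+):(?P<account>\d+):"
    r"mlflow-tracking-server/(?P<name>.+)$"
)


def parse_tracking_server_arn(arn: str) -> dict:
    """Split a tracking server ARN into region, account, and name (hypothetical helper)."""
    match = ARN_PATTERN.match(arn)
    if match is None:
        raise ValueError(f"Not a tracking server ARN: {arn!r}")
    return match.groupdict()


def connect_manually(arn: str) -> None:
    """The boilerplate every script currently repeats before logging to MLflow."""
    parse_tracking_server_arn(arn)  # fail early on an obviously malformed ARN
    import mlflow  # imported lazily: requires mlflow plus the sagemaker-mlflow plugin

    mlflow.set_tracking_uri(arn)
```

Every training script, Processing Job, and notebook currently needs its own copy of something like connect_manually.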
This feature request proposes first-class sagemaker-mlflow support within the SageMaker Python SDK v3, so that installation, authentication, the SageMaker tracking server connection, and the MLflow context are resolved automatically from the existing SageMaker session and execution role. Users would no longer need to manage two separate packages and authentication flows.
How would this feature be used? Please describe.
1. sagemaker-mlflow as an Optional Dependency of SageMaker Python SDK v3
Add sagemaker-mlflow as an optional dependency installable via:
pip install sagemaker[mlflow]
This ensures that version compatibility between the SageMaker Python SDK v3 and the sagemaker-mlflow plugin (v0.2.0) is managed automatically, eliminating the current manual version-matching requirement.
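An optional extra is usually paired with an import guard, so that users who skipped the extra get an actionable error instead of a bare ModuleNotFoundError. A minimal sketch of such a guard follows; the function names are hypothetical, and the import name sagemaker_mlflow for the plugin is an assumption.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def require_mlflow_extra() -> None:
    """Raise a helpful error when the optional MLflow dependencies are absent.

    "sagemaker_mlflow" is assumed to be the plugin's import name.
    """
    missing = missing_modules(["mlflow", "sagemaker_mlflow"])
    if missing:
        raise ImportError(
            f"Missing optional dependencies {missing}. "
            "Install them with: pip install sagemaker[mlflow]"
        )
```

The SDK could call require_mlflow_extra() at the start of any MLflow-specific code path, keeping the base install free of the MLflow dependency.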
2. sagemaker.Session MLflow Context — Auto-resolve Tracking Server ARN
Extend sagemaker.Session to accept an optional mlflow_tracking_server_arn parameter, or an mlflow_app_name from which the ARN is resolved automatically using the SageMaker domain configuration:
import sagemaker

# Auto-resolves the SageMaker MLflow app ARN from the SageMaker domain
session = sagemaker.Session(mlflow_app_name="my-app")

# Or pass an explicit ARN
session = sagemaker.Session(
    mlflow_tracking_server_arn="arn:aws:sagemaker:us-east-1:123456789:mlflow-tracking-server/my-app"
)

# Automatically configures mlflow.set_tracking_uri() with SigV4 auth
session.configure_mlflow("my-app-exp-name")
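One way the resolution order behind the proposed API could work is sketched below, using only the standard library. All names here are hypothetical: lookup_arn stands in for whatever SageMaker API call would map an app name to an ARN, and mlflow_module is any object exposing set_tracking_uri and set_experiment (the real mlflow module provides both).

```python
from typing import Callable, Optional


def resolve_tracking_arn(
    explicit_arn: Optional[str],
    app_name: Optional[str],
    lookup_arn: Callable[[str], str],
) -> str:
    """Prefer an explicit ARN; otherwise resolve the app name through lookup_arn."""
    if explicit_arn:
        return explicit_arn
    if app_name:
        return lookup_arn(app_name)
    raise ValueError("Provide either an explicit ARN or an MLflow app name.")


def configure_mlflow(arn: str, experiment_name: str, mlflow_module) -> None:
    """Point MLflow at the tracking server, then select the experiment."""
    mlflow_module.set_tracking_uri(arn)
    mlflow_module.set_experiment(experiment_name)


class RecordingMlflow:
    """Stand-in for the mlflow module that records calls instead of making them."""

    def __init__(self) -> None:
        self.calls = []

    def set_tracking_uri(self, uri: str) -> None:
        self.calls.append(("set_tracking_uri", uri))

    def set_experiment(self, name: str) -> None:
        self.calls.append(("set_experiment", name))


def fake_lookup(app_name: str) -> str:
    """Hypothetical resolver; a real one would query SageMaker."""
    return f"arn:aws:sagemaker:us-east-1:123456789:mlflow-tracking-server/{app_name}"
```

Passing the lookup and the MLflow module in as arguments keeps the resolution logic testable without AWS credentials or a running tracking server.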
3. Automatic MLflow Tracking URI Injection into Training Jobs (stretch goal)
Extend Estimator, PyTorch, HuggingFace, and other framework estimators to automatically inject the MLflow tracking URI and SigV4 credentials into the training container environment when a tracking server is configured on the session:
model_trainer = ModelTrainer(
    training_image=xgboost_image,
    source_code=source_code,
    compute=compute,
    hyperparameters=hyperparameters,
    sagemaker_session=session,  # MLflow tracking URI auto-injected
)
No manual environment={"MLFLOW_TRACKING_URI": arn} required.
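The injection step itself could be as small as the sketch below. The helper name with_mlflow_env is hypothetical; MLFLOW_TRACKING_URI is the environment variable MLflow already reads. A value the user sets explicitly is assumed to take precedence over the session default.

```python
from typing import Dict, Optional


def with_mlflow_env(
    environment: Optional[Dict[str, str]],
    tracking_arn: Optional[str],
) -> Dict[str, str]:
    """Return a copy of the container environment with the tracking URI added.

    An MLFLOW_TRACKING_URI already set by the user is left untouched.
    """
    env = dict(environment or {})
    if tracking_arn:
        env.setdefault("MLFLOW_TRACKING_URI", tracking_arn)
    return env
```

Returning a copy avoids mutating the dictionary the caller passed in, so the same environment can be reused across jobs.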
Describe alternatives you've considered
| Alternatives considered | Reasons Not Preferred |
|---|---|
| Continue requiring a separate sagemaker-mlflow install | Users must install sagemaker-mlflow from PyPI every time |
| Use environment variables manually | Error-prone; not portable across local, Studio, and CI/CD environments |
| Use SageMaker Experiments (non-MLflow) | Separate API; doesn't leverage MLflow ecosystem and existing customer investment |
Additional context
https://pypi.org/project/sagemaker-mlflow/