@toddtomashek-c2l

The input value "gaudi3" is not preserved in the "gaudi_platform" variable because "cpu_or_gpu" is reset to "g" after initial processing. Subsequent calls to read_config_file() see cpu_or_gpu == "g" and reset gaudi_platform to "gaudi2".

This PR prevents that by setting "gaudi_platform" only the first time read_config_file() is called.
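
As a rough illustration of the behavior described above, here is a minimal, self-contained sketch. It is not the project's actual code: the function bodies and the names read_config_file_buggy and read_config_file_guarded are hypothetical, and the real read_config_file() parses the inference-config file rather than working on preset variables. Only the variable names "cpu_or_gpu" and "gaudi_platform" and the values "gaudi3", "gaudi2", and "g" come from this PR.

```shell
#!/usr/bin/env bash
# Hypothetical reconstruction of the bug and of a guard-style fix.
# The real read_config_file() is assumed to do much more than this.

# Buggy variant: derives gaudi_platform from cpu_or_gpu on every call,
# then collapses cpu_or_gpu to "g".
read_config_file_buggy() {
    case "$cpu_or_gpu" in
        gaudi3)   gaudi_platform="gaudi3"; cpu_or_gpu="g" ;;
        gaudi2|g) gaudi_platform="gaudi2"; cpu_or_gpu="g" ;;
    esac
}

# Guarded variant: sets gaudi_platform only while it is still empty,
# so later calls cannot overwrite the value derived from the original input.
read_config_file_guarded() {
    if [ -z "${gaudi_platform:-}" ]; then
        case "$cpu_or_gpu" in
            gaudi3)   gaudi_platform="gaudi3" ;;
            gaudi2|g) gaudi_platform="gaudi2" ;;
        esac
    fi
    case "$cpu_or_gpu" in
        gaudi2|gaudi3) cpu_or_gpu="g" ;;
    esac
}

# Simulate a deployment that calls the function more than once.
cpu_or_gpu="gaudi3"; gaudi_platform=""
read_config_file_buggy
read_config_file_buggy
buggy_result="$gaudi_platform"

cpu_or_gpu="gaudi3"; gaudi_platform=""
read_config_file_guarded
read_config_file_guarded
guarded_result="$gaudi_platform"

echo "buggy:   $buggy_result"
echo "guarded: $guarded_result"
```

With two calls, the buggy variant ends with gaudi_platform set to "gaudi2", while the guarded variant keeps "gaudi3".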

@mdfaheem-intel
Collaborator

We are currently reviewing this fix. Could you please share the detailed steps to reproduce the issue on my end?

@toddtomashek-c2l
Author

You can reproduce it by using the menu to deploy models (we saw it with deepseek-r1-qwen) when the inference-config file enables more than just "deploy_llm_models=on". For example, run the script for an initial deployment that includes "deploy_kubernetes_fresh=on", "deploy_ingress_controller=on" and "deploy_llm_models=on" with a "cpu-or-gpu" value of "gaudi3". The "read_config_file()" function is then called multiple times, and "gaudi3" is overwritten with "gaudi2" because the first execution of "read_config_file()" changes the cpu_or_gpu variable from "gaudi3" (the original input) to "g".

@mdfaheem-intel
Collaborator

Thanks for sharing the detailed steps. Let me validate this fix and get back to you with an update.
