Ivecia commented Nov 4, 2025

A potential deadlock can occur in trace_shader_core_ctx::get_next_inst. When the last instruction in the trace is EXIT and no instructions remain, m_trace_warp->trace_done() evaluates to true. Because the number of instructions in the pipeline is only incremented later, in shader_core_ctx::decode() in gpgpu-sim, !m_warp[warp_id]->inst_in_pipeline() also evaluates to true. As a result, all threads are marked completed, which deadlocks the issue stage.

inst_in_pipeline is incremented in the decode() function of gpgpu-sim:

https://github.com/accel-sim/gpgpu-sim_distribution/blob/b18ee3977962921c49fabe06d26ae19694497c26/src/gpgpu-sim/shader.cc#L896

The functional_done check tests whether all threads have completed:

https://github.com/accel-sim/gpgpu-sim_distribution/blob/b18ee3977962921c49fabe06d26ae19694497c26/src/gpgpu-sim/shader.cc#L4085C1-L4087C2

The issue stage checks whether the warp is waiting, and one of the conditions is whether it is functional_done:

https://github.com/accel-sim/gpgpu-sim_distribution/blob/b18ee3977962921c49fabe06d26ae19694497c26/src/gpgpu-sim/shader.cc#L4093C1-L4114C2
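
To illustrate why a warp that is marked functionally done can no longer issue, here is a minimal toy model. It does not reproduce the linked gpgpu-sim code: `ToyWarpState`, `try_issue`, and all field names are hypothetical, and only the functional_done condition of the waiting check is modelled.

```cpp
#include <cassert>

// Toy model only; every name and field here is a hypothetical simplification.
struct ToyWarpState {
  unsigned warp_size = 32;
  unsigned n_completed = 0;      // threads already marked completed
  unsigned ibuffer_entries = 0;  // decoded instructions still waiting to issue

  // True once every thread in the warp has been marked completed.
  bool functional_done() const { return n_completed == warp_size; }

  // Assumption: only the functional_done condition is modelled here;
  // the real waiting check has further conditions.
  bool waiting() const { return functional_done(); }
};

// Scheduler-like check: a warp issues only if it is not waiting
// and has a decoded instruction available.
bool try_issue(ToyWarpState &w) {
  if (w.waiting() || w.ibuffer_entries == 0) return false;
  --w.ibuffer_entries;
  return true;
}
```

In this sketch, a warp whose threads were all marked completed while an instruction is still buffered never issues that instruction, which matches the stuck state described above.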

This PR provides a workaround for this situation. Before checking trace_done(), we first check whether get_next_trace_inst returns a valid instruction.
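
The ordering change can be sketched with a small stand-alone model. This is not the merged diff: `ToyWarp`, `get_next_inst_unguarded`, `get_next_inst_guarded`, and the toy `decode` are hypothetical names, and the model assumes that completion is decided in the same call that fetches the instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a trace-driven warp; every name here is hypothetical.
struct ToyWarp {
  std::vector<std::string> trace;  // trace instructions in program order
  std::size_t next = 0;            // index of the next instruction to fetch
  int in_pipeline = 0;             // incremented only by decode(), after the fetch
  bool completed = false;          // stands in for "all threads completed"

  bool trace_done() const { return next == trace.size(); }
  bool inst_in_pipeline() const { return in_pipeline > 0; }

  // Returns the next trace instruction, or nullptr when the trace is exhausted.
  const std::string *get_next_trace_inst() {
    return trace_done() ? nullptr : &trace[next++];
  }
};

// Problematic ordering: completion is decided right after the fetch, so the
// final EXIT is returned while the warp is already marked completed.
const std::string *get_next_inst_unguarded(ToyWarp &w) {
  const std::string *inst = w.get_next_trace_inst();
  if (w.trace_done() && !w.inst_in_pipeline()) w.completed = true;
  return inst;
}

// Guarded ordering: a valid instruction is returned before trace_done()
// is consulted at all.
const std::string *get_next_inst_guarded(ToyWarp &w) {
  const std::string *inst = w.get_next_trace_inst();
  if (inst != nullptr) return inst;
  if (w.trace_done() && !w.inst_in_pipeline()) w.completed = true;
  return nullptr;
}

// Stand-in for decode(): the fetched instruction now counts as in flight.
void decode(ToyWarp &w) { ++w.in_pipeline; }
```

With a trace holding only EXIT, the unguarded version marks the warp completed in the same call that returns EXIT, whereas the guarded version defers that decision until a later call finds the trace exhausted and nothing in flight.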

JRPan requested a review from LAhmos November 25, 2025 18:50

JRPan commented Nov 25, 2025

@LAhmos seems valid to me.

Seems to be caused by the fix you did to allow instructions after exit. Any comment? I'm checking your changes


JRPan commented Nov 26, 2025

Good catch. This actually fixes weekly.

JRPan self-requested a review November 26, 2025 01:53
JRPan merged commit d7f397a into accel-sim:dev Nov 26, 2025
7 checks passed

JRPan commented Nov 26, 2025

Thanks! @Ivecia

Ivecia deleted the fix-potential-deadlock-by-trace-done branch November 26, 2025 06:49