Fix deadlock condition by disabling prefetches on invalid hits #36
Open
colluca wants to merge 2 commits into
Consider the following scenario, encountered in the Snitch cluster.
All cores reach a barrier except core 0. One would expect all instruction frontends to be idle except core 0's, but this is not the case. The L0 cache produces prefetch requests for any address on its request interface, even if valid is low. Under a condition I did not have time to fully investigate, but which is related to a multi-hit scenario, the prefetched cache lines are flushed immediately after being written into the cache. Thus, even if the address remains stable on the request interface, the associated prefetch request is never filtered out by the presence of that cache line in the cache, and the same cache line is prefetched again and again, indefinitely. This causes the L1 cache to be continuously bombarded with prefetch requests from the L0 caches.
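The repeated-prefetch behaviour described above can be illustrated with a small behavioural model. This is only a sketch, not the actual RTL: the function name, the `flush_after_refill` flag and the one-cycle refill are assumptions made for illustration, and the flush stands in for the multi-hit condition that was not fully investigated.

```python
def count_prefetches(cycles, flush_after_refill):
    """Count prefetch requests a toy L0 emits for one stable address.

    Hypothetical model: the request address never changes, a refill
    completes in the same cycle it is requested, and the request's
    valid signal is never consulted (the pre-fix behaviour).
    """
    line_cached = False  # is the prefetch target currently in the L0?
    issued = 0
    for _ in range(cycles):
        if not line_cached:
            issued += 1              # prefetch request sent to the L1
            line_cached = True       # refilled line is written into the L0 ...
            if flush_after_refill:
                line_cached = False  # ... and flushed again immediately
    return issued


# Without the flush, the presence check filters every request after the first.
print(count_prefetches(100, flush_after_refill=False))
# With the flush, the filter never sees the line, so every cycle prefetches.
print(count_prefetches(100, flush_after_refill=True))
```

In this model the presence check only suppresses requests if the line survives in the cache, which is why a stable address does not help once the line is flushed.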
Simultaneously, core 0 is doing some useful work, but misses in the L0 cache, causing a refill to be requested from the L1 cache. As prefetch requests have fixed priority over miss requests at the L1 cache, and the other 8 cores are bombarding the L1 cache with prefetching requests, the miss from core 0 never gets served, stalling core 0. As all other cores are waiting on a barrier for core 0, the cycle is closed, and the system deadlocks.
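A minimal sketch of the starvation, assuming a one-grant-per-cycle arbiter and prefetching cores that re-request every cycle. The names `l1_arbiter` and `cycles_until_miss_served` are hypothetical and do not correspond to signals or modules in the Snitch cluster.

```python
def l1_arbiter(prefetch_reqs, miss_reqs):
    """Grant one request per cycle; prefetches have fixed priority over misses."""
    if prefetch_reqs:
        return ("prefetch", min(prefetch_reqs))
    if miss_reqs:
        return ("miss", min(miss_reqs))
    return None


def cycles_until_miss_served(num_prefetching_cores, max_cycles):
    """Return the cycle in which core 0's miss is granted, or None if it never is.

    Assumption: cores 1..num_prefetching_cores raise a prefetch request
    every single cycle, as in the scenario described above.
    """
    prefetchers = set(range(1, num_prefetching_cores + 1))
    for cycle in range(max_cycles):
        if l1_arbiter(prefetchers, {0}) == ("miss", 0):
            return cycle
    return None


print(cycles_until_miss_served(8, 1000))  # miss is never granted
print(cycles_until_miss_served(0, 1000))  # miss is granted right away
```

As long as at least one prefetch request is pending in every cycle, the miss is never granted under fixed priority, regardless of how long the model runs.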
There are several things that one could consider faulty in this whole operation, any of which one could fix to solve this specific deadlock condition. For simplicity, I chose to correct the prefetch request filtering logic. I don't believe it's good design to process invalid requests, and the expected behaviour is that the L0 is idle when the core frontend is idle. This, perhaps accidentally, solves the deadlock: once the other cores reach the barrier and their frontends become idle, core 0 can resume its execution. A more solid fix would probably also tackle another issue, the arbitration priority at the L1 cache. I don't personally have time to look into this properly, but I opened an issue to track it: #37.
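The effect of the fix can be sketched as follows. This is a behavioural illustration under stated assumptions, not the change made in the RTL: `prefetch_request`, `miss_can_be_served` and the `gate_on_valid` switch are hypothetical names, and the eight idle cores are modelled with valid low and a line that never stays cached.

```python
def prefetch_request(valid, line_cached, gate_on_valid):
    """Decide whether a toy L0 emits a prefetch request this cycle."""
    if gate_on_valid and not valid:
        return False        # fixed behaviour: invalid requests are ignored
    return not line_cached  # otherwise filter only on line presence


def miss_can_be_served(gate_on_valid, idle_cores=8, cycles=100):
    """Check whether core 0's miss ever finds the L1 free of prefetches.

    Assumption: the idle cores sit at the barrier with valid low, and
    their prefetch target is never present in the L0 (it is flushed
    right after each refill).
    """
    for _ in range(cycles):
        l1_busy = any(
            prefetch_request(valid=False, line_cached=False,
                             gate_on_valid=gate_on_valid)
            for _ in range(idle_cores)
        )
        if not l1_busy:
            return True  # no prefetch pending, so the miss wins arbitration
    return False


print(miss_can_be_served(gate_on_valid=False))  # original behaviour
print(miss_can_be_served(gate_on_valid=True))   # with the valid check
```

Note that the model leaves the fixed arbitration priority untouched, so it only shows that idle frontends stop generating traffic. Cores with valid requests could still starve a miss, which is the concern tracked separately.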
This PR also cleans up an outdated comment after removing non-resettable FFs.