Skip to content
This repository was archived by the owner on Jul 19, 2018. It is now read-only.

Conversation

@tobine
Copy link
Contributor

@tobine tobine commented Aug 24, 2017

This is building on and replacing #2005
This includes further commits to handle inter-CB memory access conflicts.

tobine added 26 commits August 23, 2017 16:37
This is the initial version of a single-case update to handle
synchronization tracking for the cmd sequence from GH892. As the code
stands it's only primarily useful for this single cmd sequence and
isn't catching any variety of memory conflicts.

The intention is that if the general idea is sound then I will build on
this initial code to expand/generalize the algorithms used here for all
memory conflict cases.
This is a simple test positive case based on GH892. For this initial
test the pipeline barrier is correctly in place so no errors should
be signalled. This is the basic case that will be used to generate
some more test cases, both positive and negative.
This is a negative test featuring a RaW dependency with no synch object
in-between to regulate the read.
Added a look-up-table for fixed pipe stage and access masks per cmd.
Added some functions to generalize adding/validating memory accesses
per command and migrated the checks into buffer_validation files.

Added CmdCopyImage tracking. Initially treating all image accesses
conservatively and assuming they touch full size of image memory.
To deal with this and further-such issues that will arise I updated
MemoryAccess to have a "precise" bool that indicates when Access
details are exact. Only when both Accesses are precise can we issue an
error, otherwise just issue a warning.
Integrate CmdCopyBuffer into MemoryAccess recording and validation
framework.

Refactored recording of read/write memory accesses to use the binding
which saves some repeated lines of code.
Update synchronization modeling to handle barrier dependency chains.
Initially only handling CmdPipelineBarriers.

Create separate synch_commands vector in cmd buffer for quick parsing
of just synch commands.
For any previous synch commands whose destination synch scope overlaps
the current synch command's source synch scope, merge the 1st cmds
dest mask into current commands src mask and pull previous barriers
into the current synch command.

Updated existing PipelineBarrier algorithm for global and buffer
barriers to make sure that pipeline masks overlap and only check
access masks in that case
Adding positive test UpdateBufferWaRDependencyWithTwoBarrierChain. This
features a WaR dependency that's cleared by a chain of two pipeline
barriers. This confirms that validation correctly supports the merging
of overlapping barriers to correctly account for synchronization chains.
Add tracking for the memory accesses in CmdBlitImage.
Initially just adding a read & write access for the src & dest image
memory that conservatively references whole binding and is recorded
as imprecise so that it only will cause warnings.
Add tracking for the memory access in CmdFillBuffer.
This is a single write to a buffer for which we have precise info.
Add tracking for the memory accesses in CmdClear[DS|Color]Image.
Initially just adding imprecise write access covering the whole image
binding area.
Update CmdClearAttachments to record a write memory access per image
attachment that's cleared. This is an imprecise access across the
entire image binding for now.
Update global mem access state when CmdClearAttachments is called.
Add tracking for the memory accesses in CmdResolveImage.
Initially just adding a read & write access for the src & dest image
memory that conservatively references whole binding and is recorded
as imprecise so that it only will cause warnings.
Just moving code to prep for adding mem access support to
CmdCopyQueryPoolResults. Refactor it to use the common PreValidate* &
PostRecord* pattern.
Add the buffer write access for CmdCopyQueryPoolResults. Updated query
pool data struct to store number of elements in the query based on
query type and then use that number when calculating total data size.
Update the Draw/Dispatch validate/update framework to account for all
memory accesses through descriptors. At validate time grab the complete
set of active read and write memory bindings from descriptors.
Then verify the memory accesses to flag any conflicts.
During state update, record all of the accesses into the cmd buffer
state.

All of these accesses are imprecise and will only result in warnings if
there is a potential synch issue.
A PipelineBarrier command that occurs within a renderPass is only for
handling a subpass self-dependency. That case is not currently included
in the synch model so updating code to only record barriers that are
outside of a renderPass for now.
Moved existing code into RecordBarrierMemoryAccess() function. This
generalizes to code so it can more easily be multi-purposed in order to
record barriers for CmdWaitEvents().
This is a race condition test to check both a RaW and WaR conflict with
buffers used in a Draw. A src buffer is copied to a dst buffer, then,
without a barrier, the src is used as a storage buffer in a draw and
the dst is used as a uniform buffer in the same draw. This results in a
RaW conflict for the Dst/Uniform buffer and WaR conflict for the
Src/Storage buffer.
Make any memory access race condition conflict messages warning-level
for initial release. There are still some holes in modeling the synch
primitives in validation so setting callback to warning allows for a
soft release where people can ignore or post bugs on incorrect warnings
while the synch model is updated in validation.

The current warning message is meant to deter developers from sinking
too much time into debugging these warnings, and promotes feedback for
known-bad warning cases.
Record barriers in CmdWaitEvents using same method used for barriers
from CmdPipelineBarriers. This involves merging any previous barriers
to handle dependency chains and then recording the barriers into any
previous memory accesses that fall into the source synch scope.
Modify ValidateMemoryAccesses() so that it takes map of prev accesses
directly. This is preparation to share the function with command buffer
submission validation where we'll deal with sets of initial accesses
that may cross many command buffers.
At QueueSubmit time, check for any memory conflicts between command
buffers in a single submit.

Here's the high-level algorithm:
1. While submitting CBs, store vector of CB state for each CB
2. For each CB, create vector of mem accesses that are "live" meaning
that no barrier within the CB made them visible
3. Check live vector against all other live mem accesses up to this
point in submit.
3a. If there are no potential conflicts, save live access vector
into submit access map and continue with next submit
3b. If there are potential conflicts, we need to replay commands up to
this point so continue to step 4.
4. Gather all cmds submitted up to this point in a single vector
5. Note any synch commands that occur between the early and late mem
access conflict commands
6. Replay mem access commands, including synch commands between the
points of interest
7. Following replay, re-check for conflicts and if they still occur,
flag an error

Also update the race condition warning message between two separate
command buffers with additional information about second cmd buffer.
Added TwoCommandBufferUpdateBufferRaWDependencyMissingBarrier test that
is an alternate version of UpdateBufferRaWDependencyMissingBarrier test
that separates the CmdUpdateBuffer and CmdDrawIndirect commands across
two separate command buffers and verifies that the RaW race condition
is flagged.
Merging synch barriers may be a bit naive. Note potential bug for
future fix.
Add MultiCBDrawWithBufferWaRandRaWSynchChain test. This features three
command buffers with mem access dependencies and synch commands that
cross CB boundaries. This is a positive test to verify that no errors
occur.
tobine added 2 commits August 25, 2017 16:30
Mis-match between size_t and uint32_t was causing some Windows compiler
warnings. Use size_t uniformly where vector size() function involved.
Anytime we need to replay cmds during queue submit, we should start
with an empty replay cmd vector, since replay always starts from the
beginning of a submission.
@chrisforbes
Copy link
Collaborator

As mentioned offline, this has brutal perf impact. Frame times in smoketest --validate -s are more than 3x higher with this series compared to master.

@tobine
Copy link
Contributor Author

tobine commented Sep 4, 2017

Replaced by #2049

@tobine tobine closed this Sep 4, 2017
@tobine tobine deleted the tobin_synch_event_model branch November 30, 2017 17:10
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants