[UR][L0v2] Add graph support for batched queue #21324
KFilipek wants to merge 4 commits into intel:sycl from
Conversation
```cpp
      .borrow(hDevice->Id.value(), eventFlags);
}

ur_event_handle_t ur_queue_batched_t::createEventIfRequestedRegular(
```
It should no longer be called `createEventIfRequestedRegular`, since there's an `event_pool` parameter that decides the origin of the created event.
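To illustrate why a pool-agnostic name reads better, here is a minimal stand-alone sketch of the pattern; `event`, `event_pool`, and the exact `createEventIfRequested` shape are mock stand-ins for this comment, not the real UR definitions:

```cpp
#include <cassert>
#include <memory>
#include <string>

// Simplified stand-ins for the real UR types.
struct event {
  std::string origin; // records which pool produced this event
};

struct event_pool {
  std::string name;
  std::unique_ptr<event> allocate() {
    return std::unique_ptr<event>(new event{name});
  }
};

// The pool argument, not the function name, decides where the event
// comes from, so a generic name fits better than a ...Regular suffix.
inline std::unique_ptr<event> createEventIfRequested(event_pool *pool,
                                                     event **phEvent) {
  if (!phEvent)
    return nullptr; // caller did not ask for an event
  auto e = pool->allocate();
  *phEvent = e.get();
  return e;
}
```

With this shape, passing `eventPoolRegular` or `eventPoolImmediate` is the only thing that changes between call sites.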
```cpp
ur_event_handle_t ur_queue_batched_t::createEventAndRetainRegular(
    ur_event_handle_t *phEvent, ur_event_generation_t batch_generation) {
  auto hEvent = eventPoolRegular->allocate();
    event_pool *pool, ur_event_handle_t *phEvent,
```
```cpp
ur_result_t queueBeginGraphCaptureExp() override {
  return UR_RESULT_ERROR_UNSUPPORTED_FEATURE;
ur_result_t enqueueGraphExp(ur_exp_executable_graph_handle_t hGraph,
```
I think only helper methods should be defined here. Implementations of the enqueue operations for the batched queue live in queue_batched.cpp.
```cpp
markIssuedCommandInBatch(currentRegular);

UR_CALL(currentRegular->getActiveBatch().appendKernelLaunch(
```
`getActiveBatch` is no longer used anywhere.
```diff
@@ -189,14 +195,19 @@
   TRACK_SCOPE_LATENCY("ur_queue_batched_t::enqueueKernelLaunch");
   auto currentRegular = currentCmdLists.lock();
```
`lockedBatch` is a more fitting name than `currentRegular`.
```cpp
TRACK_SCOPE_LATENCY("ur_queue_batched_t::enqueueKernelLaunch");
auto currentRegular = currentCmdLists.lock();
auto &eventPool =
    currentRegular->isGraphCapture() ? eventPoolImmediate : eventPoolRegular;
```
You might also move the event-pool selection logic into a helper method such as `getEventPool`, like you did for `getListManager`. A `getCurrentGeneration` helper on the batched queue might also be useful.
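For illustration, such a `getEventPool` could simply centralize the ternary above. A minimal sketch with mock types (`ur_queue_batched_mock` and the pool fields are stand-ins for this comment, not the real UR classes):

```cpp
#include <cassert>
#include <string>

// Mock pool type; the real code holds UR event pools here.
struct event_pool {
  std::string name;
};

struct ur_queue_batched_mock {
  event_pool eventPoolImmediate{"immediate"};
  event_pool eventPoolRegular{"regular"};
  bool graphCaptureActive = false;

  bool isGraphCapture() const { return graphCaptureActive; }

  // Centralizes the selection currently done inline in
  // enqueueKernelLaunch: graph capture uses the immediate pool,
  // otherwise the regular one (mirroring the quoted diff above).
  event_pool &getEventPool() {
    return isGraphCapture() ? eventPoolImmediate : eventPoolRegular;
  }
};
```

Each enqueue path could then call `getEventPool()` instead of repeating the ternary.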
I would suggest a helper method for generating events, since the event pool is needed only to create an event to pass to the command list manager functions. Such a helper could look like this:

```cpp
ur_event_handle_t ur_queue_batched_t::getEvent(
    ur_event_handle_t *phEvent, ur_event_generation_t batch_generation) {
  if (isGraphCapture()) {
    return createEventIfRequestedRegular(phEvent, batch_generation);
  } else {
    return createEventIfRequested(eventPoolImmediate.get(), phEvent, this);
  }
}
```

where `createEventIfRequested(...)` is the default implementation from event_pool.hpp. There should also be a version that retains events. This way, enqueueing an operation simplifies to:

```cpp
UR_CALL(currentRegular->getListManager().appendKernelLaunch(
    hKernel, workDim, pGlobalWorkOffset, pGlobalWorkSize, pLocalWorkSize,
    launchPropList, waitListView,
    getEvent(phEvent, currentRegular->getCurrentGeneration())));
```
```cpp
// capture. If the graph capture is active, the immediate command list manager
// handle is returned, otherwise the regular command list manager handle is
// returned.
ur_command_list_manager &getListManager() {
```
I would add a comment explaining why we need an immediate list when graph capture is active.
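As a sketch of what the requested comment plus the selection could look like; the types are mocks, and the stated rationale in the NOTE is an assumption for the author to confirm, not something taken from the patch:

```cpp
#include <cassert>
#include <string>

// Mock list manager; the real type is ur_command_list_manager.
struct command_list_manager_mock {
  std::string kind;
};

struct ur_queue_batched_mock {
  command_list_manager_mock immediateCmdList{"immediate"};
  command_list_manager_mock regularCmdList{"regular"};
  bool graphCaptureActive = false;

  // Returns the command list manager matching the current queue mode.
  // NOTE (assumed rationale, to be confirmed by the author): while a
  // graph is being captured, the regular lists are reserved for
  // recording the graph, so other work is routed to the immediate list.
  command_list_manager_mock &getListManager() {
    return graphCaptureActive ? immediateCmdList : regularCmdList;
  }
};
```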
```cpp
ur_event_handle_t
createEventAndRetainRegular(ur_event_handle_t *phEvent,
                            ur_event_generation_t batch_generation);
ur_event_handle_t createEventAndRetainRegular(
```
After this patch, this function is not used anywhere, so it could be removed.
```cpp
  return maxNumberOfEnqueuedOperations <= enqueuedOperationsCounter;
}

bool isGraphCapture() const { return isGraphCaptureActive; }
```
Maybe `isGraphCaptureActive()`?
```cpp
    batchLocked->isGraphCapture() ? eventPoolImmediate : eventPoolRegular;
return batchLocked->getListManager().appendGraph(
    hGraph, waitListView,
    createEventIfRequestedRegular(
```
Does it work when graph capturing is not active? Is it possible to enqueue a graph on a regular command list?
> Is it possible to enqueue a graph on a regular command list?

Not sure, let's test it. What definitely doesn't work is enqueuing a command buffer (i.e., another command list) on a regular command list.
This PR adds support for graph capture and execution in the Level Zero v2 batched queue implementation.
Changes:
- capture state