-
Notifications
You must be signed in to change notification settings - Fork 17
Fix crash while cancelling all task queues. #54
base: master
Are you sure you want to change the base?
Conversation
It is possible for a GUPnP service proxy action callback to run into a
dangling pointer to an already freed task queue after a call of
dleyna_task_processor_set_quitting(). The root cause is a bad use of
g_hash_table_remove() inside a callback of
g_hash_table_foreach_remove() while trying to cancel and remove all
task queues from a task processor:
dleyna_task_processor_set_quitting()
prv_cancel_all_queues()
g_hash_table_foreach_remove()
prv_cancel_cb()
prv_cancel()
prv_cancel_only()
dleyna_service_task_cancel_cb() [via task_queue->task_cancel_cb()]
dleyna_task_queue_task_completed()
g_hash_table_remove() [DLEYNA_TASK_QUEUE_FLAG_AUTO_REMOVE]
The bad g_hash_table_remove() call results in
g_hash_table_foreach_remove() to return early, leaving some task
queues uncancelled. The affected hash table is completely removed in
dleyna_task_processor_set_quitting(), but there might still be some
pending GUPnP actions around which have previously been started with
gupnp_service_proxy_begin_action(). Since task queue cancellation was
incomplete, these actions will not have been cancelled, and their
associated callbacks are called on completion. Callback
dleyna_service_task_begin_action_cb() dereferences its user_data
pointer which points to a nonexistent task structure, leading to a
crash.
To reproduce, start dleyna-server-service and quickly call
com.intel.dLeynaServer.Manager.GetServers() while
dleyna-server-service is querying the UPnP servers on the network.
$ G_SLICE=always-malloc G_DEBUG=gc-friendly valgrind \
Install/libexec/dleyna-server-service
$ busctl --user call com.intel.dleyna-server /com/intel/dLeynaServer \
com.intel.dLeynaServer.Manager GetServers
This commit avoids the call of g_hash_table_remove() and returns the
appropriate return value to g_hash_table_foreach_remove() to have it
remove the hash table entries for us.
debarshiray
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for investigating this at such length! I was also looking at it from the perspective of some dleyna-renderer crashes.
Are you saying that returning TRUE from the g_hash_table_foreach_remove callback will leave some key-value pairs in the hash table untouched by the callback? Reading the implementation in GLib, it seems to me that no matter what the callback returns, it will be invoked for all key-value pairs in the hash table. Those pairs for which the callback returns TRUE will be removed.
| remove_queue = task_queue->flags & | ||
| DLEYNA_TASK_QUEUE_FLAG_AUTO_REMOVE; | ||
|
|
||
| remove_queue = task_queue->flags & DLEYNA_TASK_QUEUE_FLAG_AUTO_REMOVE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't actually preventing the g_hash_table_remove, right? Because the callback can still return TRUE.
I think the original code here was OK, because if we remove the queue for the running task then ...
| DLEYNA_LOG_DEBUG("Enter - Task completed for queue <%s,%s>", | ||
| queue_id->source, queue_id->sink); | ||
|
|
||
| queue = g_hash_table_lookup(processor->task_queues, queue_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... queue would be NULL here.
|
I submitted phako#1 with what I think is the right fix. I am still testing it, so not 100% sure yet. |
It is possible for a GUPnP service proxy action callback to run into a
dangling pointer to an already freed task queue after a call of
dleyna_task_processor_set_quitting(). The root cause is a bad use of
g_hash_table_remove() inside a callback of
g_hash_table_foreach_remove() while trying to cancel and remove all
task queues from a task processor:
The bad g_hash_table_remove() call results in
g_hash_table_foreach_remove() to return early, leaving some task
queues uncancelled. The affected hash table is completely removed in
dleyna_task_processor_set_quitting(), but there might still be some
pending GUPnP actions around which have previously been started with
gupnp_service_proxy_begin_action(). Since task queue cancellation was
incomplete, these actions will not have been cancelled, and their
associated callbacks are called on completion. Callback
dleyna_service_task_begin_action_cb() dereferences its user_data
pointer which points to a nonexistent task structure, leading to a
crash.
To reproduce, start dleyna-server-service and quickly call
com.intel.dLeynaServer.Manager.GetServers() while
dleyna-server-service is querying the UPnP servers on the network.
$ G_SLICE=always-malloc G_DEBUG=gc-friendly valgrind
Install/libexec/dleyna-server-service
$ busctl --user call com.intel.dleyna-server /com/intel/dLeynaServer
com.intel.dLeynaServer.Manager GetServers
This commit avoids the call of g_hash_table_remove() and returns the
appropriate return value to g_hash_table_foreach_remove() to have it
remove the hash table entries for us.