Skip to content

Conversation

@tmleman
Copy link
Contributor

@tmleman tmleman commented Sep 24, 2025

This PR fixes a NULL pointer dereference crash in the IPC4 copier module that occurs when sink buffer IDs map to non-zero sink_queue_id values but corresponding converters are not initialized.

Problem Description

The crash occurs in do_conversion_copy() when accessing cd->converter[i] where i is extracted from sink buffer IDs using IPC4_SINK_QUEUE_ID(buf_get_id(sink)).

Root Cause

  1. Buffer ID Mapping: IPC4_SINK_QUEUE_ID() can extract non-zero queue IDs from buffer IDs, even in single-sink scenarios
  2. Limited Initialization: copier_prepare() only initializes converter[0] for pin 0
  3. Runtime Mismatch: Accessing converter[1] when only converter[0] exists causes NULL pointer dereference

Crash Conditions

The crash occurs when all conditions are met:

  • Copier component has endpoint_num == 0 (module copier, not gateway copier)
  • Connected sink buffer's ID maps to non-zero sink_queue_id via IPC4_SINK_QUEUE_ID()
  • Only converter[0] was initialized in copier_prepare()
  • Audio processing attempts to use converter[sink_queue_id] where sink_queue_id != 0

Example: Buffer ID 0x10000 extracts to sink_queue_id=1, but only converter[0] exists.

Current Implementation

copier_prepare() only initializes converter[0]:

if (!cd->endpoint_num) {
    cd->converter[0] = get_converter_func(&cd->config.base.audio_fmt,
                          &cd->config.out_fmt, ipc4_gtw_none,
                          ipc4_bidirection, DUMMY_CHMAP);
}

This assumes all sink buffers map to sink_queue_id=0, which is incorrect when buffer IDs encode different queue information.

Test Pipeline Description

The crash occurs in RTC AEC topologies with internal module copiers. Example topology:

flowchart TD

    subgraph Pipeline0
        direction LR
        style Pipeline0 fill:#daf5e5,stroke:#3cb371,stroke-width:2px
        copier0["COPIER <br/> Module Id: 0x3 <br/> Instance Id: 0x0 <br/> HOST -> DSP <br/> 4/16(16)/48000"] --> aec["RTC AEC <br/> Module Id: 0x2000 <br/> Instance Id: 0x0"]
        aec --> copier1["COPIER <br/> Module Id: 0x3 <br/> Instance Id: 0x1 <br/> HOST <- DSP <br/> 4/16(16)/48000"]
    end

    subgraph Pipeline1
        direction LR
        style Pipeline1 fill:#daf5e5,stroke:#3cb371,stroke-width:2px
        copier2["COPIER <br/> Module Id: 0x3 <br/> Instance Id: 0x3 <br/> HOST -> DSP <br/> 2/16(16)/48000"] --> copier3["COPIER <br/> Module Id: 0x3 <br/> Instance Id: 0x2"]
        copier3 --> aec
    end
Loading

The crash occurs in copier3 (Instance 0x2), an internal module copier (endpoint_num == 0) whose sink buffer ID maps to non-zero sink_queue_id, but the corresponding converter was never initialized.

Copilot AI review requested due to automatic review settings September 24, 2025 14:20
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a NULL pointer dereference crash in the IPC4 copier module that occurs when sink buffer IDs map to non-zero sink_queue_id values but the corresponding converters are not initialized.

Key Changes

  • Adds initialization logic for converters based on actual sink buffer queue IDs
  • Iterates through all connected sink buffers to determine required converter array indices
  • Validates sink_queue_id values are within the expected range

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@serhiy-katsyuba-intel
Copy link
Contributor

Hm, that's does not look right. Converters for output pins 1 to 3 are setup in COPIER_MODULE_CFG_PARAM_SET_SINK_FORMAT handler. The problem here seems is caused by the only output pin has ID 1 instead of 0.

This looks more like either a test script problem (wrong pin number in bind operation); or maybe using IPC4_SINK_QUEUE_ID(buf_get_id(sink)) (this gets pin index from buffer id?) is a bad idea and can get wrong pin indices.

@tmleman
Copy link
Contributor Author

tmleman commented Sep 25, 2025

Hm, that's does not look right. Converters for output pins 1 to 3 are setup in COPIER_MODULE_CFG_PARAM_SET_SINK_FORMAT handler. The problem here seems is caused by the only output pin has ID 1 instead of 0.

This looks more like either a test script problem (wrong pin number in bind operation); or maybe using IPC4_SINK_QUEUE_ID(buf_get_id(sink)) (this gets pin index from buffer id?) is a bad idea and can get wrong pin indices.

I considered the absence of COPIER_MODULE_CFG_PARAM_SET_SINK_FORMAT as a problem in the test but crossed it off the list since the copier where the issue occurred has only one sink, and this message should be sent when there are more than one.

The problematic aspect here seems to be the sink ID (1). When the copier is bound to AEC, in IPC we get source queue id 0 and sink queue id 1.

@serhiy-katsyuba-intel
Copy link
Contributor

Maybe the problem here is that instead of IPC4_SINK_QUEUE_ID(), IPC4_SRC_QUEUE_ID() should be used? The buffer is connected to sink of the copier, so from buffer's point of view, copier's sink is source and IPC4_SRC_QUEUE_ID() should be used to extract copier sink index.

@tmleman
Copy link
Contributor Author

tmleman commented Sep 25, 2025

--- a/src/audio/copier/copier.c
+++ b/src/audio/copier/copier.c
@@ -540,7 +540,7 @@ static int do_conversion_copy(struct comp_dev *dev,

        comp_get_copy_limits(src, sink, processed_data);

-       i = IPC4_SINK_QUEUE_ID(buf_get_id(sink));
+       i = IPC4_SRC_QUEUE_ID(buf_get_id(sink));
        if (i >= IPC4_COPIER_MODULE_OUTPUT_PINS_COUNT)
                return -EINVAL;
        buffer_stream_invalidate(src, processed_data->source_bytes);
@@ -617,7 +617,7 @@ static int copier_module_copy(struct processing_module *mod,
                        uint32_t source_samples;
                        int sink_queue_id;

-                       sink_queue_id = IPC4_SINK_QUEUE_ID(buf_get_id(sink_c));
+                       sink_queue_id = IPC4_SRC_QUEUE_ID(buf_get_id(sink_c));
                        if (sink_queue_id >= IPC4_COPIER_MODULE_OUTPUT_PINS_COUNT)
                                return -EINVAL;

I tested it and it works, but maybe due to the naming, it looks like a hack.

Although upon closer inspection, it makes sense. We are looking for the source ID for a module that is a sink from our copier perspective.

@serhiy-katsyuba-intel
Copy link
Contributor

Looked into code, yes, the problem is caused by using IPC4_SINK_QUEUE_ID(buf_get_id(sink)) in do_conversion_copy() instead of IPC4_SRC_QUEUE_ID(). Here is how pin indices are put into buffer id:

#define IPC4_COMP_ID(x, y)	((y) << 16 | (x))
#define IPC4_SRC_QUEUE_ID(x)	((x) & 0xffff)
#define IPC4_SINK_QUEUE_ID(x)	(((x) >> 16) & 0xffff)

// this is how buffer id is constructed
ipc_buf.comp.id = IPC4_COMP_ID(src_queue, dst_queue);

So do_conversion_copy() should use IPC4_SRC_QUEUE_ID() to get copier's sink index.

@tmleman tmleman force-pushed the topic/upstream/pr/copier/fix/null_converter branch from b272bf3 to 25bbf5c Compare September 25, 2025 11:57
Fix NULL pointer dereference crash in copier module by using the correct
IPC4 macro to extract queue IDs from buffer IDs.

The issue occurred in do_conversion_copy() and copier_module_copy() when
accessing cd->converter[i] where i was extracted using
IPC4_SINK_QUEUE_ID().

This was incorrect because buffer IDs are constructed as:
IPC4_COMP_ID(src_queue, dst_queue)

From the buffer's perspective, the copier's sink is actually the source,
so IPC4_SRC_QUEUE_ID() should be used to get the correct copier sink
index.

Using IPC4_SINK_QUEUE_ID() extracted the dst_queue (upper 16 bits)
instead of src_queue (lower 16 bits), leading to wrong array indices and
NULL pointer crashes when the converter array wasn't initialized for
those indices.

This resolves crashes in RTC AEC topologies where internal module
copiers have buffer IDs that map to non-zero queue IDs.

Signed-off-by: Tomasz Leman <tomasz.m.leman@intel.com>
@abonislawski abonislawski merged commit b6ff255 into thesofproject:main Oct 6, 2025
34 of 45 checks passed
@lgirdwood
Copy link
Member

@tmleman @kv2019i @abonislawski I will cherry pick this fix for v2.14. Thanks

serhiy-katsyuba-intel added a commit to serhiy-katsyuba-intel/sof that referenced this pull request Oct 13, 2025
This is a follow-up to thesofproject#10257.

This fixes all incorrect uses of IPC4_SRC_QUEUE_ID() and
IPC4_SINK_QUEUE_ID().

Signed-off-by: Serhiy Katsyuba <serhiy.katsyuba@intel.com>
serhiy-katsyuba-intel added a commit to serhiy-katsyuba-intel/sof that referenced this pull request Oct 13, 2025
This is a follow-up to thesofproject#10257.

This fixes all occurrences of incorrect use of IPC4_SRC_QUEUE_ID() and
IPC4_SINK_QUEUE_ID().

IPC4_SRC_QUEUE_ID() expects buffer as a parameter and returns the queue ID
of the module that produces data for that buffer. So from the buffer's
point of view, such a module and queue is the source (hence the name
IPC4_SRC_QUEUE_ID()), however, from the module's point of view, such
a queue is a sink. That, unfortunately, makes it easy to make a mistake
when deciding whether IPC4_SRC_QUEUE_ID() vs. IPC4_SINK_QUEUE_ID() should
be used.

Signed-off-by: Serhiy Katsyuba <serhiy.katsyuba@intel.com>
lgirdwood pushed a commit that referenced this pull request Oct 22, 2025
This is a follow-up to #10257.

This fixes all occurrences of incorrect use of IPC4_SRC_QUEUE_ID() and
IPC4_SINK_QUEUE_ID().

IPC4_SRC_QUEUE_ID() expects buffer as a parameter and returns the queue ID
of the module that produces data for that buffer. So from the buffer's
point of view, such a module and queue is the source (hence the name
IPC4_SRC_QUEUE_ID()), however, from the module's point of view, such
a queue is a sink. That, unfortunately, makes it easy to make a mistake
when deciding whether IPC4_SRC_QUEUE_ID() vs. IPC4_SINK_QUEUE_ID() should
be used.

Signed-off-by: Serhiy Katsyuba <serhiy.katsyuba@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants