Normalization for CCLE #3

@JudithBernett

I think the normalisation for CCLE is wrong.

Original code:

# transforms [0, -100] to [1, 0] for curvecurator processing
def transform_activity_range(raw_df):
    raw_df["Activity Data (median)"] = raw_df["Activity Data (median)"] / 100 + 1
    raw_df["Amax"] = raw_df["Amax"] / 100 + 1

There is no explicit control column in the raw data, but there is an Amax value, which seems to be unique per cell line/drug combination. In the original supplementary methods, the authors state:

Amax is the maximal activity value reached within a model

In the raw data, Amax appears to be the maximum response: it is always close to the response value at the highest dose (8 µM). So it looks like a positive control rather than a negative control such as DMSO.

I think it should be

raw_df["response"] = 1 - raw_df["Activity Data (median)"]/raw_df["Amax"]
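
To make the difference concrete, here is a minimal sketch comparing the two normalisations on a single cell line/drug curve. The function names and the activity values are made up for illustration and are not taken from the CCLE data; the helpers use plain arithmetic, so they should behave the same on floats and on pandas columns.

```python
# Hypothetical helpers for comparing the two normalisations.
# They rely only on arithmetic operators, so they accept floats or pandas Series.

def current_normalization(activity):
    """Existing approach: map the fixed range [0, -100] onto [1, 0]."""
    return activity / 100 + 1


def proposed_normalization(activity, amax):
    """Proposed approach: scale by Amax, so that activity == Amax maps to 0.

    Assumes amax is non-zero; a curve with Amax == 0 would need separate handling.
    """
    return 1 - activity / amax


# Made-up example curve: activity becomes more negative with dose,
# and Amax equals the activity reached at the highest dose.
example_activity = [0.0, -20.0, -45.0, -60.0]
example_amax = -60.0

current = [current_normalization(a) for a in example_activity]
proposed = [proposed_normalization(a, example_amax) for a in example_activity]

for a, c, p in zip(example_activity, current, proposed):
    print(f"activity={a:7.1f}  current={c:.3f}  proposed={p:.3f}")
```

With these made-up numbers, the current transformation leaves the highest-dose response at about 0.4, while the Amax-based version brings it down to 0, which is the behaviour expected if Amax acts as the positive control.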
