Request for details on JUMP-CP preprocessing #25

Hi! Thank you for sharing this excellent model and open-sourcing the RxRx dataset and associated resources. I found the paper and the Hugging Face documentation very insightful. While reviewing these materials, I noticed that the open-source model was trained on JUMP-CP data alongside the RxRx datasets, with random image crops used during training.

**Given that JUMP-CP images come from a different institute and imaging system than the RxRx data, could you please provide more details on how the JUMP-CP data was processed before training?** For example, was it corrected for variations in background intensity, thresholded, or otherwise preprocessed? And how was it brought to 256×256 while keeping a similar magnification: was it downsampled, or just randomly cropped at the original resolution? Understanding this would be very valuable for my research, as I aim to use your model for inference on another Cell Painting dataset that may differ even more from the training data.
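To make the question concrete, here is a minimal NumPy sketch of the two options I have in mind. It is only my guess at what such a step could look like, not a description of your pipeline; the function names, the channels-first layout, the 1080×1080 image size, and the downsampling factor are all illustrative assumptions on my side.

```python
import numpy as np


def random_crop(img, size=256, rng=None):
    """Option A (assumed): take a size x size window at native resolution."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = img.shape[-2:]
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    return img[..., top:top + size, left:left + size]


def downsample_mean(img, factor=2):
    """Option B (assumed): block-average by an integer factor.

    Rows/columns that do not fill a complete block are trimmed.
    """
    h, w = img.shape[-2:]
    h2, w2 = h - h % factor, w - w % factor
    trimmed = img[..., :h2, :w2]
    blocks = trimmed.reshape(
        *img.shape[:-2], h2 // factor, factor, w2 // factor, factor
    )
    return blocks.mean(axis=(-3, -1))


if __name__ == "__main__":
    # Placeholder 5-channel image; real data would be loaded from TIFF files.
    fake = np.random.default_rng(0).random((5, 1080, 1080))
    crop_native = random_crop(fake, size=256)
    crop_downsampled = random_crop(downsample_mean(fake, factor=2), size=256)
    print(crop_native.shape, crop_downsampled.shape)
```

Both paths yield a 256×256 crop, but the second covers twice the physical field of view per side, which is why I would like to know which one (if either) matches the training setup.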

Additionally, if possible, **could you share a Python script or workflow for mapping raw Cell Painting images** (e.g., TIFF files such as [these ones](https://open.quiltdata.com/b/cellpainting-gallery/tree/cpg0016-jump/source_1/images/Batch1_20221004/images/UL000109__2022-10-05T06_35_06-Measurement1/Images/) in JUMP-CP) **into the input format the model expects**? This would make it much easier to apply your model to similar datasets.

Thank you for your time and consideration. I appreciate your efforts in making your work accessible!
