Description
Hi! Thank you for sharing this excellent model and open-sourcing the RxRx dataset and associated resources. I found the paper and Hugging Face documentation very insightful. While reviewing these materials, I noticed that the open-source model was trained on JUMP-CP data alongside the RxRx datasets, with random image crops employed during training.
Given that the JUMP-CP images originate from a different institute and imaging system than RxRx, could you please provide more details on how the JUMP-CP data was processed before training? For example, was it corrected for variations in background intensity, thresholded, or preprocessed in other ways? And how was it brought to 256×256 while maintaining a similar magnification: was it downsampled, or simply randomly cropped at the original resolution? (I've sketched the two interpretations I have in mind below.) Understanding this would be incredibly valuable for my research, as I aim to use your model for inference on another Cell Painting dataset that may exhibit even greater differences.
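To make the question concrete, here is a minimal sketch of the two interpretations I currently have in mind; the percentile normalization and the specific crop/resize functions are my own assumptions, not something I found in the repository:

```python
import numpy as np
from skimage.transform import resize

def normalize_channel(img, low=1, high=99):
    """Clip a single channel to percentiles and scale to [0, 1] (my assumption)."""
    lo, hi = np.percentile(img, [low, high])
    return np.clip((img - lo) / max(hi - lo, 1e-6), 0.0, 1.0)

def random_crop(img, size=256, rng=None):
    """Option A: random 256x256 crop at the original magnification."""
    rng = rng or np.random.default_rng()
    y = rng.integers(0, img.shape[0] - size + 1)
    x = rng.integers(0, img.shape[1] - size + 1)
    return img[y:y + size, x:x + size]

def downsample(img, size=256):
    """Option B: resize the whole field of view to 256x256 (changes effective magnification)."""
    return resize(img, (size, size), anti_aliasing=True)
```

Clarifying which of these (or which combination) was used for JUMP-CP would already answer most of my question.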
Additionally, if possible, could you share a Python script or workflow for mapping raw Cell Painting images (e.g., TIFF files such as those in JUMP-CP) into the model's input format? This would greatly facilitate applying your model to similar datasets; the rough sketch below shows what I am doing at the moment.
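For reference, this is roughly the pipeline I am using right now; the per-channel file names, channel order, normalization, and the simple corner crop are all my assumptions, which is exactly why an official script would help:

```python
import numpy as np
import tifffile
import torch

# Hypothetical per-channel TIFF paths for one imaging site (channel order assumed).
CHANNEL_FILES = [
    "r01c01f01p01-ch1sk1fk1fl1.tiff",  # DNA
    "r01c01f01p01-ch2sk1fk1fl1.tiff",  # ER
    "r01c01f01p01-ch3sk1fk1fl1.tiff",  # RNA
    "r01c01f01p01-ch4sk1fk1fl1.tiff",  # AGP
    "r01c01f01p01-ch5sk1fk1fl1.tiff",  # Mito
]

def load_site(paths, size=256):
    """Read each channel, normalize it, crop to `size`, and stack to (C, H, W)."""
    channels = []
    for path in paths:
        img = tifffile.imread(path).astype(np.float32)
        # Per-channel percentile normalization (my guess at the preprocessing).
        lo, hi = np.percentile(img, [1, 99])
        img = np.clip((img - lo) / max(hi - lo, 1e-6), 0.0, 1.0)
        # Placeholder crop; this is where a random crop or downsample would go.
        img = img[:size, :size]
        channels.append(img)
    return torch.from_numpy(np.stack(channels, axis=0))

# x = load_site(CHANNEL_FILES)   # shape: (5, 256, 256)
# x = x.unsqueeze(0)             # add a batch dimension before feeding the model
```

Any corrections to this (channel order, normalization statistics, illumination correction, crop strategy) would be greatly appreciated.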
Thank you for your time and consideration. I appreciate your efforts in making your work accessible!