Welcome to the Pupil Bio Data Analysis repository! This repository contains code and scripts used for the bioinformatics challenge related to identifying somatic mutations and performing quality control for cancer sample analysis.
One task involves calculating coverage metrics and identifying biomarkers for tissue differentiation.
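As a rough illustration of what a coverage metric can look like, here is a minimal sketch that summarizes per-base depth. It assumes tab-separated input lines of the form `chrom`, `pos`, `depth` (the layout that `samtools depth` prints). The function name `coverage_summary` and the `min_depth` threshold of 10 are illustrative choices, not taken from the scripts in this repository.

```python
from typing import Iterable


def coverage_summary(depth_lines: Iterable[str], min_depth: int = 10) -> dict:
    """Summarize per-base depth from 'chrom<TAB>pos<TAB>depth' lines.

    min_depth is an assumed threshold for calling a position "covered".
    """
    total_positions = 0
    total_depth = 0
    covered = 0
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _chrom, _pos, depth_text = line.split("\t")
        depth = int(depth_text)
        total_positions += 1
        total_depth += depth
        if depth >= min_depth:
            covered += 1
    if total_positions == 0:
        return {"positions": 0, "mean_depth": 0.0, "fraction_covered": 0.0}
    return {
        "positions": total_positions,
        "mean_depth": total_depth / total_positions,
        "fraction_covered": covered / total_positions,
    }


# Hypothetical three-position example
example = ["chr1\t100\t12", "chr1\t101\t8", "chr1\t102\t10"]
print(coverage_summary(example))
```

Note that `samtools depth` skips zero-depth positions unless run with `-a`, so the mean is only over reported positions unless that flag is used.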
A second task covers quality control checks, alignment to the human genome, and identifying somatic mutations using custom scripts.
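To give a sense of what somatic mutation identification means in practice, the sketch below flags a variant as a somatic candidate when it is well supported in the tumor sample and essentially absent in the matched normal. This is a simplified stand-in, not the repository's custom logic: the `AlleleCounts` class, the `is_somatic_candidate` function, and all thresholds are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AlleleCounts:
    """Read counts supporting the reference and alternate allele at one site."""
    ref_reads: int
    alt_reads: int

    @property
    def depth(self) -> int:
        return self.ref_reads + self.alt_reads

    @property
    def vaf(self) -> float:
        # Variant allele frequency: fraction of reads carrying the alternate allele
        return self.alt_reads / self.depth if self.depth else 0.0


def is_somatic_candidate(
    tumor: AlleleCounts,
    normal: AlleleCounts,
    min_tumor_vaf: float = 0.05,   # assumed threshold
    max_normal_vaf: float = 0.01,  # assumed threshold
    min_depth: int = 10,           # assumed threshold
) -> bool:
    """Return True if the variant is present in tumor and absent in normal."""
    if tumor.depth < min_depth or normal.depth < min_depth:
        return False  # too few reads to judge either sample
    return tumor.vaf >= min_tumor_vaf and normal.vaf <= max_normal_vaf


tumor = AlleleCounts(ref_reads=80, alt_reads=20)
normal = AlleleCounts(ref_reads=100, alt_reads=0)
print(is_somatic_candidate(tumor, normal))
```

A variant with a similar allele frequency in both samples would be rejected by this check, since it is more likely germline than somatic.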
The scripts folder contains Python and R scripts that automate the following tasks:
- Quality control analysis using FastQC and other tools.
- Alignment of sequencing data using tools such as Bowtie2 and BWA.
- Mutation calling and somatic mutation identification using custom Python scripts leveraging Samtools and bcftools.
- Plots and visualizations for quality control and biomarker identification.
The scripts are organized into separate folders for each task, allowing users to easily navigate through the pipeline.
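The stages listed above could be chained roughly as follows. This is a sketch under stated assumptions, not the repository's actual scripts: the function `build_pipeline_commands`, the sample name, the file names, and the `results` output folder are all hypothetical, and BWA is used for alignment only as one of the two aligners mentioned. The function only builds the command lines so they can be inspected before anything is executed.

```python
from pathlib import Path
from typing import List


def build_pipeline_commands(
    sample: str,
    read1: str,
    read2: str,
    reference: str,
    out_dir: str = "results",  # hypothetical output folder
) -> List[List[str]]:
    """Build argv lists for QC, alignment, sorting, indexing and variant calling."""
    out = Path(out_dir)
    sam = out / f"{sample}.sam"
    bam = out / f"{sample}.sorted.bam"
    bcf = out / f"{sample}.bcf"
    vcf = out / f"{sample}.vcf.gz"
    return [
        # Quality control reports for both read files
        ["fastqc", "-o", str(out), read1, read2],
        # Paired-end alignment (the reference must already be indexed with `bwa index`)
        ["bwa", "mem", "-o", str(sam), reference, read1, read2],
        # Coordinate-sort and index the alignments
        ["samtools", "sort", "-o", str(bam), str(sam)],
        ["samtools", "index", str(bam)],
        # Pileup, then multiallelic variant calling (variants only, compressed VCF)
        ["bcftools", "mpileup", "-f", reference, "-Ou", "-o", str(bcf), str(bam)],
        ["bcftools", "call", "-mv", "-Oz", "-o", str(vcf), str(bcf)],
    ]


for command in build_pipeline_commands("sample1", "s1_R1.fastq.gz", "s1_R2.fastq.gz", "hg38.fa"):
    print(" ".join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` once the tools are installed and the output folder exists (FastQC does not create it).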
To use the scripts, follow these steps:
- Clone the repository:
    git clone https://github.com/Harshith-Reddy-CK/pupil_bio_data_analysis.git
    cd pupil_bio_data_analysis
Supplementary and Output Files

The repository currently contains scripts for performing the tasks outlined above. Because of GitHub's file size limits, large files, including supplementary data and output files, are not hosted directly in this repository.
To request the large files (e.g., BAM files, VCF files, output data), you can contact the repository owner or download them from the provided Dropbox or Google Drive links below:
Download Links

Please reach out to me if you would like access to these files.