This program runs multiple correlation tests between a main dataset (here, RNA-seq data for the HeLa cell line) and the variables in a data frame (here, human tissues from GTEx). All data were obtained from Expression Atlas.
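As a rough illustration of the idea, the sketch below correlates one main vector against every column of a data frame. The toy values, the `run_correlations` helper, and the choice of Spearman correlation are assumptions made for this example; they are not taken from 'Projeto_HeLa_vs_Tecidos.R'.

```r
# Hypothetical toy data: one expression value per gene (6 genes).
hela <- c(5.1, 0.3, 12.4, 7.8, 2.2, 9.9)
tissues <- data.frame(
  liver = c(4.8, 0.5, 11.0, 8.1, 2.0, 10.3),
  lung  = c(1.0, 6.2, 3.3, 0.4, 9.1, 2.7)
)

# Run one correlation test per column of `df` against `main`.
# `method` is passed to stats::cor.test (Spearman is an assumption here).
run_correlations <- function(main, df, method = "spearman") {
  results <- lapply(names(df), function(col) {
    test <- cor.test(main, df[[col]], method = method)
    data.frame(
      variable = col,
      estimate = unname(test$estimate),
      p_value  = test$p.value
    )
  })
  do.call(rbind, results)
}

res <- run_correlations(hela, tissues)
print(res)
```

The result is one row per tissue, with the correlation estimate and its p-value, which makes it easy to sort or plot afterwards.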
Set working directory:
Keep all .txt files in the same folder as the programs ('Auxiliar_funcions.R' and 'Projeto_HeLa_vs_Tecidos.R'), then set that folder as the working directory using the RStudio 'New Project' option.
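A quick way to confirm the folder is set up correctly is to check for the expected files before running anything. This is a minimal sketch: `find_missing` is a hypothetical helper, and the list covers only the file names mentioned in this README (the Expression Atlas file name is not given, so it is left out).

```r
# Files named in this README; extend the list with your own data files.
expected_files <- c(
  "Auxiliar_funcions.R",
  "Projeto_HeLa_vs_Tecidos.R",
  "9606_noISI_Q2.txt",
  "omnipathdb.txt",
  "dorothea_AB.txt"
)

# Return the subset of `files` that does not exist inside `dir`.
find_missing <- function(files, dir = getwd()) {
  files[!file.exists(file.path(dir, files))]
}

missing_files <- find_missing(expected_files)
if (length(missing_files) > 0) {
  message("Missing from ", getwd(), ": ",
          paste(missing_files, collapse = ", "))
} else {
  message("All expected files found in ", getwd())
}
```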
R language version used: 4.1.3
Libraries used:
- org.Hs.eg.db
- devtools
- easyGgplot2 or ggplot2
Required packages (install from the R console):
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("org.Hs.eg.db")
install.packages("devtools")
library(devtools)
install_github("kassambara/easyGgplot2")
📝 NOTE: You might need Rtools (the required version depends on your R version) to install packages from source, such as those hosted on GitHub.
Alternatively, the following installs tidyverse from CRAN, which includes ggplot2 (no GitHub needed):
install.packages("tidyverse")
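After installing, it can help to check which packages R can actually find. The sketch below is one possible check, assuming ggplot2 is used instead of easyGgplot2; the `required` vector is illustrative and should match whichever plotting package you installed.

```r
# Packages listed in this README (ggplot2 chosen over easyGgplot2 here).
required <- c("org.Hs.eg.db", "devtools", "ggplot2")

# requireNamespace() returns TRUE if the package can be loaded,
# FALSE otherwise, without attaching it to the search path.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
print(installed)

if (!all(installed)) {
  message("Not installed: ", paste(required[!installed], collapse = ", "))
}
```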
Raw data acquisition and description:
| Database | Description | Source file |
|---|---|---|
| Apid | Physical protein-protein interactions, human interactome (i) | 9606_noISI_Q2.txt |
| Omnipath | Gene regulation and signaling network | omnipathdb.txt |
| Dorothea | Transcription factors and target annotations | dorothea_AB.txt |
| Expression Atlas | Protein expression - proteomics and transcriptomics | |

(i) Data verified by two or more pieces of experimental evidence
Pedro Fanica
Faculdade de Ciências da Universidade de Lisboa - Bioquímica Experimental IV, Licenciatura em Bioquímica